Open Source LLMs

Mistral 7B
Mistral AI · 7B params · Apache 2.0

High-performance 7B model that outperforms Llama 2 13B across the benchmarks reported in its release. Optimized for efficiency with grouped-query attention and sliding-window attention.

Tags: General Purpose · Instruction Following · Reasoning
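
A minimal generation sketch with the Hugging Face transformers library, assuming the mistralai/Mistral-7B-Instruct-v0.2 checkpoint and a GPU with room for bf16 weights:

```python
# Minimal generation sketch with transformers; checkpoint id and dtype are
# assumptions, so adjust for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Format the prompt with the checkpoint's built-in chat template.
messages = [{"role": "user", "content": "Explain sliding-window attention in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
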
Qwen 2.5
Alibaba · 0.5B-72B params · Apache 2.0

Multilingual model family with strong performance across 29+ languages. Excellent for international deployments with competitive English benchmarks.

Tags: Multilingual · Scalable · Code Generation

Phi-3
Microsoft · 3.8B-14B params · MIT

Small language model optimized for edge deployment. Achieves GPT-3.5 level performance at a fraction of the size, ideal for on-device AI.

Tags: Edge/Mobile · Efficient · Reasoning
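
For memory-constrained targets, a common first step is 4-bit quantized loading. A sketch assuming the microsoft/Phi-3-mini-4k-instruct checkpoint and the bitsandbytes backend; true on-device deployments more often go through runtimes such as ONNX Runtime or llama.cpp:

```python
# 4-bit quantized load via transformers + bitsandbytes; model id and
# quantization settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/Phi-3-mini-4k-instruct"
quant_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant_config, device_map="auto"
)
```
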
DeepSeek V3
DeepSeek · 671B MoE · MIT

Mixture-of-Experts model with 671B total parameters (37B active). State-of-the-art open-source performance on coding and reasoning benchmarks.

Tags: MoE Architecture · Coding · Math
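
The "671B total, 37B active" split comes from sparse routing: a gate sends each token to only a few experts, so most parameters sit idle on any given step. A toy top-k router in PyTorch, purely illustrative and not DeepSeek's actual architecture:

```python
# Toy mixture-of-experts routing: only k of n_experts run per token.
import torch
import torch.nn.functional as F

tokens, d_model, n_experts, k = 4, 64, 8, 2
x = torch.randn(tokens, d_model)
router = torch.nn.Linear(d_model, n_experts)
experts = torch.nn.ModuleList(torch.nn.Linear(d_model, d_model) for _ in range(n_experts))

gate = F.softmax(router(x), dim=-1)          # (tokens, n_experts) routing weights
weights, idx = torch.topk(gate, k, dim=-1)   # keep only the top-k experts per token

out = torch.zeros_like(x)
for t in range(tokens):
    for w, e in zip(weights[t], idx[t]):
        out[t] += w * experts[int(e)](x[t])  # only the selected experts execute
```
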
Llama 3.2
Meta · 1B-90B params · Llama License

Meta's latest open model family with vision capabilities. Strong general-purpose performance with extensive fine-tuning ecosystem support.

Tags: General Purpose · Vision · Fine-tunable

GLM-4.6
Zhipu AI (Z.ai) · 355B MoE (32B active) · MIT

Latest GLM model with a 200K context window and exceptional coding performance. Leads comparable models across 74 real-world coding tasks while using roughly 30% fewer tokens than GLM-4.5.

Tags: Coding Excellence · 200K Context · Reasoning

GLM-4.5
Zhipu AI (Z.ai) · 355B/106B MoE · MIT

Open-source MoE model unifying reasoning, coding, and agentic capabilities. Trained on 22T tokens, including 7T focused on code and reasoning, and ranks third overall on a combined suite of agentic, reasoning, and coding benchmarks.

Tags: Agentic AI · Reasoning · 128K Context

GLM-4-9B
Zhipu AI (Z.ai) · 9B params · MIT

Compact open-source model outperforming Llama-3-8B. Supports 128K context, web browsing, code execution, and function calling across 26 languages.

Tags: Multilingual · Tool Calling · Efficient
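
Tool calling with open checkpoints is typically exercised through an OpenAI-compatible server (vLLM, for example). In the sketch below, the base URL, model name, and get_weather tool are all illustrative assumptions:

```python
# Function-calling sketch against an OpenAI-compatible endpoint; the endpoint,
# model name, and tool schema are hypothetical.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="glm-4-9b-chat",
    messages=[{"role": "user", "content": "What's the weather in Beijing?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```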

RAG Models

CLaRa 7B
Apple · 7B params · Apache 2.0

Unified RAG framework with 16x-128x semantic document compression. End-to-end differentiable retrieval and generation in continuous latent space.

Tags: Document Compression · E2E RAG · Multi-hop QA

ColBERT v2
Stanford · 110M params · MIT

Late interaction retrieval model with token-level matching. Enables efficient semantic search with pre-computed document representations.

Tags: Dense Retrieval · Late Interaction · Efficient
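
Late interaction scores a query against pre-computed document token embeddings with MaxSim: for each query token, take its maximum similarity over document tokens, then sum across the query. A self-contained sketch with random unit vectors standing in for real ColBERT embeddings:

```python
# MaxSim late-interaction scoring; random unit vectors stand in for the
# token embeddings ColBERT would compute offline.
import torch
import torch.nn.functional as F

q = F.normalize(torch.randn(8, 128), dim=-1)    # 8 query token embeddings
d = F.normalize(torch.randn(120, 128), dim=-1)  # 120 pre-computed doc token embeddings

sim = q @ d.T                          # cosine similarity matrix, shape (8, 120)
score = sim.max(dim=1).values.sum()    # MaxSim per query token, summed
print(float(score))
```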

Coding Models

Code Llama
Meta · 7B-70B params · Llama License

Code-specialized Llama model with infilling capabilities. Supports multiple programming languages with strong performance on HumanEval.

Tags: Code Generation · Infilling · Multi-language
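
The transformers integration exposes Code Llama's infilling through a <FILL_ME> placeholder that the tokenizer expands into the model's prefix/suffix format. A sketch assuming the codellama/CodeLlama-7b-hf base checkpoint:

```python
# Infilling via the <FILL_ME> placeholder; the checkpoint id is an assumption
# and the generated middle is printed together with the surrounding code.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "def add(a: int, b: int) -> int:\n    <FILL_ME>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
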
StarCoder 2
BigCode · 3B-15B params · BigCode OpenRAIL-M

Trained on The Stack v2 with 600+ programming languages. Optimized for code completion with 16K context window and fill-in-the-middle support.

Tags: Code Completion · 600+ Languages · Long Context
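
Fill-in-the-middle reorders the prompt with sentinel tokens so the model sees both prefix and suffix before generating the middle. The sketch below uses the common <fim_prefix>/<fim_suffix>/<fim_middle> convention and the bigcode/starcoder2-3b checkpoint; verify the exact token names against the tokenizer config of whichever checkpoint you use:

```python
# FIM prompt construction plus a hedged generation call; sentinel token names
# and the checkpoint id are assumptions.
from transformers import pipeline

prefix = "def fib(n):\n    if n < 2:\n        return n\n    "
suffix = "\n\nprint(fib(10))\n"
fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

generator = pipeline("text-generation", model="bigcode/starcoder2-3b")
completion = generator(fim_prompt, max_new_tokens=32)[0]["generated_text"]
print(completion)  # the text after <fim_middle> is the infilled code
```
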
DeepSeek Coder
DeepSeek · 1.3B-33B params · MIT

Code-focused model family with strong performance on HumanEval, MBPP, and DS-1000, approaching closed-model quality on several coding benchmarks with an efficient architecture.

Tags: High Performance · Multi-language · Efficient

Vision Models

LLaVA 1.6
UW-Madison/Microsoft · 7B-34B params · Apache 2.0

Large Language and Vision Assistant with strong visual reasoning. Supports high-resolution images and achieves competitive performance on vision-language benchmarks.

Tags: Vision-Language · Visual QA · Image Understanding
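
Inference follows the usual processor-plus-model pattern via transformers' LLaVA-NeXT classes. The checkpoint id, image URL, and [INST] prompt template below are assumptions; the template varies with the underlying base LLM, so check the model card:

```python
# Vision-language QA sketch; the image URL is a placeholder and the [INST]
# template matches the Mistral-based checkpoint assumed here.
import requests
from PIL import Image
from transformers import LlavaNextForConditionalGeneration, LlavaNextProcessor

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(model_id, device_map="auto")

image = Image.open(requests.get("https://example.com/photo.jpg", stream=True).raw)
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```
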
Florence-2
Microsoft · 0.23B-0.77B params · MIT

Unified vision foundation model for captioning, detection, segmentation, and OCR. Compact size with strong zero-shot transfer capabilities.

Tags: Multi-task Vision · Detection · OCR

GLM-4.6V
Zhipu AI (Z.ai) · 9B-106B params · MIT

Vision-language model with native multimodal function calling. Supports 128K context (~150 pages of documents or 1 hour of video). Processes text, charts, tables, and formulas in one pass.

Tags: Tool Calling · Document Understanding · 128K Context

Embedding Models

BGE-M3
BAAI · 567M params · MIT

Multi-functionality, multi-linguality, and multi-granularity embedding model. Supports dense, sparse, and multi-vector retrieval in 100+ languages.

Tags: Multilingual · Hybrid Retrieval · Long Context
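
BGE-M3's dense mode runs through sentence-transformers like any other embedding model (the sparse and multi-vector modes require the FlagEmbedding library instead). A minimal dense-retrieval sketch:

```python
# Dense retrieval with cosine similarity; embeddings are L2-normalized so the
# dot product equals cosine similarity.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")
docs = [
    "Paris is the capital of France.",
    "The Nile is the longest river in Africa.",
]
query = "What is the capital of France?"

doc_emb = model.encode(docs, normalize_embeddings=True)
query_emb = model.encode([query], normalize_embeddings=True)
scores = (query_emb @ doc_emb.T)[0]
print(docs[int(scores.argmax())], float(scores.max()))
```
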
E5 Mistral
Microsoft · 7B params · MIT

LLM-based embedding model with state-of-the-art MTEB performance. Combines instruction-following with semantic similarity for flexible retrieval.

Tags: LLM Embeddings · MTEB Leader · Instruction-tuned
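
E5-Mistral expects instruction-prefixed queries while documents are embedded as plain text. A sketch following the "Instruct: ... / Query: ..." convention from the model card; the checkpoint id and task description are assumptions:

```python
# Instruction-formatted query embedding; the task description is illustrative.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-mistral-7b-instruct")
task = "Given a web search query, retrieve relevant passages that answer the query"
query = f"Instruct: {task}\nQuery: how does late interaction retrieval work?"
document = "ColBERT scores query tokens against document tokens with MaxSim."

q_emb, d_emb = model.encode([query, document], normalize_embeddings=True)
print(float(q_emb @ d_emb))  # cosine similarity
```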

Quick Comparison

| Model | Category | Parameters | Best For | License |
| --- | --- | --- | --- | --- |
| Mistral 7B | LLM | 7B | General purpose, efficiency | Apache 2.0 |
| Qwen 2.5 | LLM | 0.5B-72B | Multilingual applications | Apache 2.0 |
| Phi-3 | LLM | 3.8B-14B | Edge/mobile deployment | MIT |
| DeepSeek V3 | LLM (MoE) | 671B (37B active) | Coding, math, reasoning | MIT |
| CLaRa 7B | RAG | 7B | Document compression RAG | Apache 2.0 |
| Code Llama | Coding | 7B-70B | Code generation, infilling | Llama License |
| StarCoder 2 | Coding | 3B-15B | Code completion, 600+ langs | OpenRAIL-M |
| LLaVA 1.6 | Vision | 7B-34B | Visual QA, image understanding | Apache 2.0 |
| GLM-4.6 | LLM (MoE) | 355B (32B active) | Coding, 200K context | MIT |
| GLM-4.5 | LLM (MoE) | 355B (32B active) | Agentic AI, reasoning | MIT |
| GLM-4-9B | LLM | 9B | Multilingual, tool calling | MIT |
| GLM-4.6V | Vision | 9B-106B | Document understanding, tool calling | MIT |
| BGE-M3 | Embedding | 567M | Multilingual retrieval | MIT |

External Resources

Open LLM Leaderboard

HuggingFace benchmark rankings for open-source language models across standard evaluation tasks.

Artificial Analysis

Independent AI model comparison with quality, speed, and pricing benchmarks.

MTEB Leaderboard

Massive Text Embedding Benchmark for comparing embedding model performance.

EvalPlus Leaderboard

Code generation benchmark with rigorous evaluation on HumanEval+ and MBPP+.