Third generation of the Qwen model family, pretrained on a 36-trillion-token dataset. Includes dense, MoE, multimodal, speech, and reasoning models.

Qwen3

model

Dense: 0.6B, 1.7B, 4B, 8B, 14B, 32B. MoE: 30B-A3B and the flagship 235B-A22B. A minimal loading sketch follows the variants table below.

Variants

Name Parameters Notes
Qwen3-0.6B 0.6B
Qwen3-1.7B 1.7B
Qwen3-4B 4B
Qwen3-8B 8B
Qwen3-14B 14B
Qwen3-32B 32B
Qwen3-30B-A3B 30B MoE (3B active)
Qwen3-235B-A22B 235B MoE flagship (22B active)
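
The dense checkpoints are published as open weights on Hugging Face. A minimal generation sketch, assuming the repo id Qwen/Qwen3-8B (swap in any size from the table above) and a standard transformers install:

```python
# Minimal sketch: load a Qwen3 dense checkpoint and generate.
# The repo id "Qwen/Qwen3-8B" is an assumption; any size from
# the variants table above should follow the same pattern.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```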

Qwen3 Technical Report

paper

Hybrid reasoning architecture with switchable thinking and non-thinking modes in a single model. Covers 6 dense models (0.6B, 1.7B, 4B, 8B, 14B, 32B) and 2 MoE models (30B-A3B, 235B-A22B).

arXiv: 2505.09388
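
The hybrid design exposes reasoning as a per-request switch rather than a separate model. A sketch of the toggle through the tokenizer's chat template; the enable_thinking keyword is taken from Qwen3's published usage examples:

```python
# Sketch: toggle Qwen3's thinking mode via the chat template.
# enable_thinking is a chat-template argument from Qwen3's
# published usage examples, not a generate() flag.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
messages = [{"role": "user", "content": "What is 17 * 23?"}]

# Thinking mode: the model emits a <think>...</think> trace first.
prompt_think = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
# Non-thinking mode: direct answer, lower latency.
prompt_direct = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
```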

Qwen3-Max

model

Alibaba's largest model at over 1 trillion parameters. Closed-source MoE architecture served via Alibaba Cloud Model Studio.

Architecture MoE
Parameters 1T+
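
With no released weights, access goes through the Model Studio API. A minimal sketch against DashScope's OpenAI-compatible endpoint; the model name qwen3-max is an assumption to verify against the current catalog:

```python
# Sketch: call Qwen3-Max through Alibaba Cloud Model Studio's
# OpenAI-compatible endpoint. The model name "qwen3-max" is an
# assumption; confirm against the current Model Studio catalog.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
resp = client.chat.completions.create(
    model="qwen3-max",
    messages=[{"role": "user", "content": "Summarize the Qwen3 family in two sentences."}],
)
print(resp.choices[0].message.content)
```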

Qwen3-Next

model

80B-A3B base model with Instruct and Thinking fine-tuned variants; a parameter-accounting sketch follows the variants table below.

Architecture MoE
Parameters 80B
Active params 3B

Variants

Name Parameters Notes
Qwen3-Next-80B-A3B-Base 80B Pretrained base
Qwen3-Next-80B-A3B-Instruct 80B Instruction-tuned
Qwen3-Next-80B-A3B-Thinking 80B Reasoning (thinking) variant
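
The 80B-total / 3B-active split follows from sparse routing: per token, only the router-selected experts execute. A back-of-the-envelope sketch; the expert counts and sizes below are illustrative placeholders, not Qwen3-Next's published configuration:

```python
# Illustrative MoE parameter accounting. All numbers below are
# hypothetical placeholders chosen to land near 80B total / 3B
# active; they are NOT Qwen3-Next's published configuration.
def moe_params(shared_b: float, n_experts: int, expert_b: float, top_k: int):
    total = shared_b + n_experts * expert_b        # every expert stored
    active = shared_b + top_k * expert_b           # only routed experts run per token
    return total, active

total_b, active_b = moe_params(shared_b=1.5, n_experts=512, expert_b=0.153, top_k=10)
print(f"total = {total_b:.0f}B, active = {active_b:.1f}B per token")
# total = 80B, active = 3.0B per token
```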

Qwen3-Omni

model

Unified multimodal model (text, image, audio, video) with a Thinker-Talker MoE architecture. SOTA on 32 of 36 audio benchmarks.

Architecture MoE

arXiv: 2509.17765

Qwen3-VL Technical Report

paper

Dense (2B-32B) and MoE (30B-A3B, 235B-A22B) vision-language models with 256K native context.

arXiv: 2511.21631
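
A minimal image-question sketch; the repo id Qwen/Qwen3-VL-2B-Instruct and the AutoModelForImageTextToText auto-class are assumptions to check against the model card:

```python
# Sketch: query a Qwen3-VL checkpoint about an image. The repo id
# "Qwen/Qwen3-VL-2B-Instruct" and AutoModelForImageTextToText are
# assumptions; check the model card for the exact class and id.
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "Qwen/Qwen3-VL-2B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/chart.png"},
        {"type": "text", "text": "What does this chart show?"},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```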

Qwen3-TTS

model

Multilingual TTS with 3-second voice cloning. Trained on over 5 million hours of speech across 10 languages.

arXiv: 2601.15621

Qwen3-Max-Thinking

model

Reasoning-focused variant of Qwen3-Max, available via API only.

Architecture MoE

Parameter count unconfirmed: estimated 1T total, 22B active

Qwen3-ASR & ForcedAligner

model

Speech recognition for 52 languages, with an accompanying forced-alignment model. SOTA among open-source ASR systems.

arXiv: 2601.21337
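
Forced alignment maps a known transcript to frame-level timestamps. As a generic illustration of the technique (not the Qwen3 ForcedAligner's own API), a CTC-based sketch using torchaudio's forced_align on dummy emissions:

```python
# Generic CTC forced-alignment illustration (NOT the Qwen3
# ForcedAligner API): align a token sequence to frame-level
# log-probabilities and read off per-token frame spans.
import torch
import torchaudio.functional as F

T, C = 50, 30                                         # frames, vocab size (0 = CTC blank)
log_probs = torch.randn(1, T, C).log_softmax(dim=-1)  # dummy emissions
targets = torch.tensor([[7, 3, 12, 5]], dtype=torch.int32)  # transcript token ids

frame_labels, scores = F.forced_align(log_probs, targets, blank=0)
spans = F.merge_tokens(frame_labels[0], scores[0])    # collapse repeats and blanks
for s in spans:
    print(f"token {s.token}: frames {s.start}-{s.end}, score {s.score:.2f}")
```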

Tags: open-weight, moe, nlp, multimodal, reasoning