Zhipu / Z.ai's flagship coding-first agentic model and successor to GLM-5.1 — a ~753B MoE with DeepSeek Sparse Attention (DSA) and the highest-scoring open model on the AA Intelligence Index v4.1 (51, #1 of 92). Launched 2026-06-13 on the GLM Coding Plan tiers; full MIT open weights, standalone API, and benchmarks followed 2026-06-16. First GLM with a "solid" 1M-token context (131,072 max output), with High/Max thinking-effort presets (Max recommended for coding).

Architecture: introduces IndexShare (arXiv 2603.12201) — reusing one lightweight sparse-attention indexer across every four layers to cut per-token FLOPs 2.9× at 1M context — plus an improved MTP layer that lifts speculative-decoding acceptance length up to 20%. Self-reported benchmarks: AIME 2026 99.2, GPQA-Diamond 91.2, IMOAnswerBench 91.0, SWE-bench Pro 62.1, Terminal-Bench 2.1 82.7 (best harness), DeepSWE 46.2, FrontierSWE 74.4 — a large jump over GLM-5.1 and competitive with Claude Opus 4.8 / GPT-5.5 / Gemini 3.1 Pro at far lower cost. Pure MIT license, no regional limits.

Model Details

Architecture MOE
Parameters 753B
Context window 1,000,000
AA Intelligence 51
License MIT

Benchmark Scores

Benchmark Score Mode
AIME 2026 99.2
GPQA-Diamond 91.2
IMOAnswerBench 91.0
SWE-bench Pro 62.1
Terminal-Bench 2.1 82.7
DeepSWE 46.2
FrontierSWE 74.4
moeopen-weightfrontiercodingagenticreasoning

Related