GLM-5V-Turbo
Z.ai's first native multimodal agent foundation model, built for vision-based coding and agentic workflows. Natively processes images, videos, design drafts, and complex document layouts as primary training data. 744B MoE with 40B active parameters per token, trained on 28.5T tokens. 200K context window, 131K max output.
Optimized for the perceive → plan → execute loop in autonomous environments. Deeply integrated with OpenClaw and Claude Code workflows. Outperforms Claude Opus 4.5 on agentic browsing (BrowseComp) while trailing by 3.1 points on SWE-bench coding. AA Intelligence Index: 43 (reasoning).
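To put the sizing figures in perspective, the short sketch below works out what share of the parameters is active per token and how the exact context window listed under Model Details (202,752) relates to the rounded "200K" figure. The variable names are illustrative only, and the numbers are copied from this card, not queried from any API.

```python
# Figures quoted in this model card (not fetched from any API).
TOTAL_PARAMS_B = 744      # total parameters, in billions
ACTIVE_PARAMS_B = 40      # parameters active per token, in billions
CONTEXT_TOKENS = 202_752  # exact context window listed under Model Details

# Share of the MoE parameters that is active for any single token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Context window expressed in multiples of 1024 tokens.
context_in_kibi = CONTEXT_TOKENS / 1024

print(f"Active share per token: {active_fraction:.1%}")
print(f"Context window: {context_in_kibi:g} x 1024 tokens")
```

By this arithmetic, roughly 5.4% of the parameters are active per token, and the context window is exactly 198 × 1024 tokens, which the summary rounds to 200K.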
Model Details
Architecture MoE
Parameters 744B
Active params 40B
Context window 202,752