A 353B-parameter agentic model, fine-tuned from GLM-4.6, that achieves a 47% relative gain on Toolathlon from only 239 training samples. It also shows a 148% improvement over a model trained on 66K samples. Trajectories average 85K tokens and 116 tool calls.
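
The relative figures above are presumably computed as (score - baseline) / baseline, although this card does not state the baselines or the absolute scores. A minimal sketch of that arithmetic, using a hypothetical helper `relative_gain` and placeholder numbers that are not taken from the paper:

```python
def relative_gain(baseline: float, score: float) -> float:
    """Return the improvement of score over baseline, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (score - baseline) / baseline * 100.0


# Placeholder values for illustration only; these are NOT results from the paper.
example = relative_gain(baseline=20.0, score=25.0)
print(f"{example:.1f}% relative gain")
```

Under this definition, a 148% improvement would mean a score 2.48 times the baseline, and a 47% relative gain a score 1.47 times the baseline.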

Model Details

Architecture: DENSE
Base model: glm-4.6

Paper

arXiv: 2602.02619

agentic, open-weight