100,000 complex multi-step reasoning trajectories used to train the GLM-5 "Thinking" mode.
reasoning, agentic, training-data

Notes

Date approximate.