Landmark 1-trillion-parameter MoE model family (32B active parameters) released as open weights. Each MoE layer has 384 experts, with 8 activated per token. Pre-trained on 15.5T tokens on H800 GPUs using the MuonClip optimizer, with the context window extended from 4K to 128K via YaRN. Focused on agentic intelligence and tool use; evolved through the Thinking and Instruct-0905 variants.
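
A minimal sketch of the sparse activation described above, assuming a standard top-k router: each token is scored against all 384 experts but only the top 8 are run, which is how a 1T-parameter model keeps only ~32B parameters active per token. The hidden size and expert shape here are illustrative placeholders, not K2's actual configuration.

```python
import torch
import torch.nn.functional as F

NUM_EXPERTS = 384  # experts per MoE layer (from the K2 description)
TOP_K = 8          # experts activated per token (from the K2 description)
D_MODEL = 64       # illustrative hidden size; K2's real dimensions are far larger


class TinyExpert(torch.nn.Module):
    """Stand-in feed-forward expert; real K2 experts are much larger."""
    def __init__(self, d_model: int):
        super().__init__()
        self.ff = torch.nn.Sequential(
            torch.nn.Linear(d_model, 4 * d_model),
            torch.nn.SiLU(),
            torch.nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        return self.ff(x)


class SparseMoELayer(torch.nn.Module):
    """Top-k routing: each token is processed by only k of n experts."""
    def __init__(self, d_model: int, num_experts: int, top_k: int):
        super().__init__()
        self.router = torch.nn.Linear(d_model, num_experts, bias=False)
        self.experts = torch.nn.ModuleList(
            TinyExpert(d_model) for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):
        # x: (tokens, d_model). Score all experts, keep only the top k.
        logits = self.router(x)                        # (tokens, num_experts)
        scores, idx = logits.topk(self.top_k, dim=-1)  # (tokens, k)
        weights = F.softmax(scores, dim=-1)            # renormalize over the chosen k
        out = torch.zeros_like(x)
        # Plain loops for clarity; production kernels batch tokens by expert instead.
        for t in range(x.size(0)):
            for j in range(self.top_k):
                expert = self.experts[idx[t, j].item()]
                out[t] += weights[t, j] * expert(x[t])
        return out


layer = SparseMoELayer(D_MODEL, NUM_EXPERTS, TOP_K)
tokens = torch.randn(4, D_MODEL)
print(layer(tokens).shape)  # torch.Size([4, 64])
```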

Outputs (4)

Kimi K2 Instruct

Type: model

Architecture: MoE
Parameters: 1T
Active params: 32B

Kimi K2 Tech Report: Open Agentic Intelligence

Type: paper

arXiv: 2507.20534

Kimi-K2-Instruct-0905

Type: model

Updated K2 with an expanded 256K context window and improved coding performance.

Architecture: MoE
Parameters: 1T
Active params: 32B
Context window: 256,000 tokens

Kimi K2 Thinking

Type: model

Reasoning-heavy "thinking agent" capable of hundreds of sequential tool calls (see the agent-loop sketch below).

Tags: moe · frontier · open-weight · agentic · reasoning
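
As a rough illustration of that sequential tool-calling pattern (not Kimi's actual API), here is a minimal agent loop against an OpenAI-compatible chat endpoint: the model is called repeatedly, each requested tool is executed locally, and results are fed back until the model stops asking for tools. The base URL, model name, and get_time tool are assumptions for the example.

```python
import json
from openai import OpenAI  # assumes an OpenAI-compatible chat endpoint

# Hypothetical base URL, API key, and model name, for illustration only.
client = OpenAI(base_url="https://example.com/v1", api_key="sk-...")
MODEL = "kimi-k2-thinking"


def run_tool(name: str, args: dict) -> str:
    """Dispatch a requested tool call to a local implementation (stubbed)."""
    if name == "get_time":
        from datetime import datetime
        return datetime.now().isoformat()
    return f"unknown tool: {name}"


TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current local time as an ISO-8601 string.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

messages = [{"role": "user", "content": "What time is it?"}]
for _ in range(300):  # generous budget for long sequential tool-call chains
    resp = client.chat.completions.create(model=MODEL, messages=messages, tools=TOOLS)
    msg = resp.choices[0].message
    messages.append(msg)
    if not msg.tool_calls:       # no more tool requests: final answer
        print(msg.content)
        break
    for call in msg.tool_calls:  # run each requested tool, feed the result back
        result = run_tool(call.function.name, json.loads(call.function.arguments))
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```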