Coding-focused agentic model built on Kimi K2.6. It keeps the same 1T-total / 32B-active MoE architecture (384 experts, 8 selected + 1 shared, 61 layers, MLA attention, SwiGLU) and 256K context, with a 400M MoonViT vision encoder for image (and experimental video) input. Targets real-world, long-horizon software engineering: stronger end-to-end task completion across complex workflows while cutting thinking-token usage by ~30% vs K2.6. Thinking mode is always on. Modified MIT license; ships with native INT4 quantization (same method as Kimi K2-Thinking) and runs on vLLM, SGLang, and KTransformers, with an OpenAI/Anthropic-compatible API on the Moonshot platform.
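To make the "8 selected + 1 shared" expert layout concrete, here is a minimal routing sketch in plain Python. It is illustrative only: the gating function (a softmax over the selected logits), the random router logits, and the names `route_token`, `NUM_SHARED`, and `experts_run` are assumptions, since the description above does not say how the router scores or normalizes experts.

```python
import math
import random

NUM_EXPERTS = 384  # routed experts, from the model description
TOP_K = 8          # routed experts selected per token
NUM_SHARED = 1     # shared expert that runs for every token


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts and return {expert_index: weight}.

    Assumption: weights are a softmax over the selected logits only.
    The actual gating function is not stated in the source.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    peak = max(router_logits[i] for i in chosen)  # for numerical stability
    exps = {i: math.exp(router_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in router output
weights = route_token(logits)

experts_run = len(weights) + NUM_SHARED
print(f"experts run for this token: {experts_run}")
print(f"share of routed experts used: {TOP_K / NUM_EXPERTS:.2%}")
```

Only a small slice of the expert pool runs per token, which is why the active parameter count (32B) is far below the total (1T).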

Self-reported benchmarks (thinking mode, via Kimi Code CLI), with K2.6 scores in parentheses: Kimi Code Bench v2 62.0 (50.9), Program Bench 53.6 (48.3), MLS-Bench-Lite 35.1 (26.7), Kimi Claw 24/7 Bench 46.9 (42.9), MCP-Atlas 76.0 (69.4), MCPMark-Verified 81.1 (72.8). It trails GPT-5.5 and Claude Opus 4.8 on most of these but is closing the gap. Not yet scored on the AA Intelligence Index.
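The gains over K2.6 can be read off the paired scores above. The short sketch below computes the absolute and relative improvement per benchmark; the scores are copied from the text, while the helper name `gains` and the rounding to one decimal are choices made for this example.

```python
# Scores copied from the paragraph above: (this model, Kimi K2.6).
scores = {
    "Kimi Code Bench v2": (62.0, 50.9),
    "Program Bench": (53.6, 48.3),
    "MLS-Bench-Lite": (35.1, 26.7),
    "Kimi Claw 24/7 Bench": (46.9, 42.9),
    "MCP-Atlas": (76.0, 69.4),
    "MCPMark-Verified": (81.1, 72.8),
}


def gains(table):
    """Return {benchmark: (absolute_gain_points, relative_gain_percent)}."""
    return {
        name: (round(new - old, 1), round(100 * (new - old) / old, 1))
        for name, (new, old) in table.items()
    }


deltas = gains(scores)

# Print benchmarks ordered by absolute gain, largest first.
for name, (abs_gain, rel_gain) in sorted(
    deltas.items(), key=lambda kv: kv[1][0], reverse=True
):
    print(f"{name:<22} +{abs_gain:.1f} pts  (+{rel_gain:.1f}%)")
```

On these self-reported numbers, the largest absolute gain is on Kimi Code Bench v2 (11.1 points) and the smallest on Kimi Claw 24/7 Bench (4.0 points).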

Model Details

Architecture: MoE
Parameters: 1T
Active params: 32B
Experts: 384 (top-8)
Context window: 262,144 tokens
License: Modified MIT
Base model: kimi-k2.6
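The 262,144-token context window listed above is the "256K" figure written out (256 × 1024). A small budgeting helper shows how that limit might be checked before sending a request. This is a sketch under one stated assumption: that the prompt and all generated tokens, including thinking tokens, share the same window. The function names are hypothetical and not part of any Moonshot API.

```python
CONTEXT_WINDOW = 256 * 1024  # 262,144 tokens, matching the listing above


def fits_in_context(prompt_tokens, max_output_tokens, window=CONTEXT_WINDOW):
    """Return True if the prompt plus the reserved output fits in the window.

    Assumption: prompt, thinking, and answer tokens all count against
    the same window.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= window


def remaining_tokens(prompt_tokens, window=CONTEXT_WINDOW):
    """Tokens left for generation after the prompt, never below zero."""
    return max(window - prompt_tokens, 0)


print(CONTEXT_WINDOW)
print(fits_in_context(200_000, 32_000))
print(remaining_tokens(200_000))
```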

Benchmark Scores

Benchmark               Score   Mode
Kimi Code Bench v2      62.0    thinking
Program Bench           53.6    thinking
MLS-Bench-Lite          35.1    thinking
Kimi Claw 24/7 Bench    46.9    thinking
MCP-Atlas               76.0    thinking
MCPMark-Verified        81.1    thinking

Tags: frontier, open-weight, moe, coding, agentic, reasoning, multimodal
