Reasoning-focused model trained with reinforcement learning, claimed to match OpenAI's o1-preview in math and coding. The technical report details scaling RL with LLMs.

Outputs 2

Kimi k1.5 Model

model

Kimi k1.5 Tech Report: Scaling RL with LLMs

paper

arXiv: 2501.12599

reasoning, training