AgentCPM Series
A series of high-performance, "edge-scale" agents (4B-8B) designed for long-horizon reasoning, deep research, and mobile GUI automation. AgentCPM demonstrated that compact models can rival much larger systems in complex agentic tasks through advanced reinforcement learning and search strategies.
Outputs 3
AgentCPM-GUI: Reinforcement Fine-Tuning for Mobile Agents
Introduces mobile-use agents aligned via reinforcement fine-tuning (GRPO), optimizing for low-latency execution on mobile GUIs.
arXiv: 2506.01391
AgentCPM-Explore: Long-Horizon Deep Exploration
A 4B-scale agent capable of 100+ rounds of interaction that reaches frontier-level deep search performance, matching Claude-4.5-Sonnet on specific benchmarks.
arXiv: 2602.06485
AgentCPM-Report: Open-Ended Deep Research
Introduces the "Writing As Reasoning Policy" (WARP) for autonomous long-form report generation, interleaving drafting and deep retrieval.
arXiv: 2602.06540