Baichuan-M2
paper, model
Medically enhanced reasoning model built on Qwen2.5-32B, with a Large Verifier System comprising a Patient Simulator and a Clinical Rubrics Generator. Trained through multi-stage reinforcement learning with an improved GRPO algorithm. Reported to outperform all other open-source models and most closed-source counterparts on HealthBench. Licensed under Apache 2.0.
Outputs: 2
Baichuan-M2: Scaling Medical Capability with Large Verifier System
paper
Technical report on the Large Verifier System, multi-stage RL training, and HealthBench evaluation results.
arXiv: 2509.02208
Baichuan-M2-32B
model
32-billion-parameter medical reasoning model, with a quantized GPTQ-Int4 variant available.
Architecture: Dense
Parameters: 32B
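
As a rough guide to what the parameter count and the GPTQ-Int4 variant mean in practice, the sketch below estimates weight storage alone. It assumes a nominal 32e9 parameters and assumes the unquantized weights are stored at 16 bits each; neither figure is stated in this listing. It ignores quantization scales and zero points, activations, and the KV cache, so real memory use will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Nominal "32B" from the listing; the exact parameter count may differ.
PARAMS = 32e9

# Assumption: 16-bit weights for the unquantized model.
fp16_gb = weight_memory_gb(PARAMS, 16)
# 4-bit weights for the GPTQ-Int4 variant, excluding quantization metadata.
int4_gb = weight_memory_gb(PARAMS, 4)

print(f"16-bit weights: ~{fp16_gb:.0f} GB")
print(f"4-bit weights:  ~{int4_gb:.0f} GB")
```

Under these assumptions the 4-bit variant needs about a quarter of the weight storage of the 16-bit model, which is the main reason a quantized release matters for single-GPU deployment.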