Baichuan-M2 | Lab Index

Medical reasoning model built on Qwen2.5-32B and trained with a Large Verifier System comprising a Patient Simulator and a Clinical Rubrics Generator. Training uses multi-stage reinforcement learning with an improved GRPO algorithm. Reported to outperform all other open-source models and most closed-source counterparts on HealthBench. Licensed under Apache 2.0.
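For readers unfamiliar with GRPO, the sketch below shows the group-relative advantage step of the standard (unmodified) algorithm: several responses are sampled for one prompt, and each reward is normalized against the mean and standard deviation of its own group. It does not reproduce the improved variant mentioned in this listing, whose changes are not described here. The function name, the `eps` term, the use of population standard deviation, and the example scores are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group.

    This is the baseline GRPO advantage, not the improved variant
    used for Baichuan-M2 (its details are not given in this listing).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    # eps (hypothetical value) avoids division by zero when all rewards match
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical verifier scores for four responses to the same prompt.
scores = [0.2, 0.5, 0.5, 0.8]
advantages = group_relative_advantages(scores)
print(advantages)
```

Responses that score above the group mean receive positive advantages and those below receive negative ones, so no separate value network is needed to form a baseline.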

Paper (arXiv) | GitHub | HuggingFace

Outputs 2

Baichuan-M2: Scaling Medical Capability with Large Verifier System

paper

Technical report on the Large Verifier System, multi-stage RL training, and HealthBench evaluation results.

Paper (arXiv)

Citations 1

arXiv HTML

Baichuan-M2-32B

model

32-billion-parameter medical reasoning model; a quantized GPTQ-Int4 variant is also available.
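To give a feel for why the 4-bit variant matters, here is a rough back-of-the-envelope estimate of weight storage at two precisions. It assumes the nominal 32B parameter count, assumes the full-precision checkpoint stores 16-bit weights, and ignores GPTQ metadata (scales, zero points) as well as activations and KV cache, so real memory use will be higher. The helper name is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Approximate weight storage in GiB, ignoring quantization metadata."""
    return num_params * bits_per_weight / 8 / 2**30


PARAMS = 32e9  # nominal count from this listing; the exact count may differ

fp16_gib = weight_memory_gib(PARAMS, 16)  # assumed 16-bit checkpoint
int4_gib = weight_memory_gib(PARAMS, 4)   # GPTQ-Int4, weights only

print(f"16-bit weights: ~{fp16_gib:.0f} GiB")
print(f"4-bit weights:  ~{int4_gib:.0f} GiB")
```

Under these assumptions the weights shrink by a factor of four, from roughly 60 GiB to roughly 15 GiB.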

HuggingFace | HuggingFace (GPTQ-Int4) | GitHub

Architecture: Dense

Parameters: 32B

Tags: open-weight, biology, reasoning, training
