Baichuan-M1
paper · model
First open-source large language model developed from scratch for medical applications. Trained on 20 trillion tokens of mixed general and medical-specific data, using a hybrid tokenizer, a curriculum-based training strategy with progressively increasing data complexity, and adaptive gradient clipping. The 14B Instruct variant surpasses Qwen2.5-72B-Instruct on medical benchmarks. Released alongside the Baichuan-M1-preview deep-thinking model.
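The report names adaptive gradient clipping as one of its training stabilizers but the card does not spell out the rule. A common formulation, shown here as an illustrative sketch only (not necessarily Baichuan-M1's exact scheme), rescales a gradient whenever its norm exceeds a small factor of the corresponding parameter's norm:

```python
import math

def adaptive_clip(grad, param, clip_factor=0.01, eps=1e-3):
    """Per-tensor adaptive gradient clipping (illustrative sketch).

    Rescales `grad` so that ||grad|| <= clip_factor * max(||param||, eps),
    tying the clipping threshold to the parameter's own scale instead of
    using a single global norm. `clip_factor` and `eps` are assumed
    hyperparameters, not values from the Baichuan-M1 report.
    """
    p_norm = max(math.sqrt(sum(x * x for x in param)), eps)
    g_norm = math.sqrt(sum(x * x for x in grad))
    max_norm = clip_factor * p_norm
    if g_norm > max_norm:
        scale = max_norm / (g_norm + 1e-12)
        return [g * scale for g in grad]
    return grad
```

Because the threshold scales with each parameter tensor, large layers tolerate proportionally larger gradients, which tends to be gentler than a single global clip norm.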
Baichuan-M1: Pushing the Medical Capability of Large Language Models
paper
Technical report on the Baichuan-M1 model series, detailing the from-scratch medical training approach and evaluation results.
arXiv: 2502.12671
Baichuan-M1-14B
model
14.5-billion-parameter medical-enhanced model, available in Base and Instruct versions.
Architecture DENSE
Parameters 14.5B
Baichuan-M1-preview
model
Deep-thinking model with reasoning capabilities across language, vision, and search. Surpassed o1-preview on several benchmarks, including mathematics and coding tasks.