Yuan 3.0 Ultra
Trillion-parameter enterprise MoE flagship (1.01T total / 68.8B active parameters). The model started at 1.5T parameters and was pruned with Layer-Adaptive Expert Pruning (LAEP), which improved training efficiency by 49% and inference speed by 33%. It scores 93.1% on MATH-500, 91.4% on HumanEval, and 87.8% on MMLU, and leads in enterprise RAG, complex table understanding, and 64K-context processing.
Model Details
Architecture MoE
Parameters 1.01T
Active params 68.8B
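
The parameter counts quoted in the description can be related with a little arithmetic. The sketch below only restates those figures (1.5T before pruning, 1.01T total, 68.8B active); the constant and function names are illustrative and not part of any Yuan tooling.

```python
# Parameter counts as quoted in the description, in billions.
ORIGINAL_TOTAL_B = 1500.0  # 1.5T before pruning
PRUNED_TOTAL_B = 1010.0    # 1.01T total after pruning
ACTIVE_B = 68.8            # active parameters


def fraction(part: float, whole: float) -> float:
    """Return part / whole as a plain ratio."""
    return part / whole


# Share of the pruned model's parameters that are active.
active_share = fraction(ACTIVE_B, PRUNED_TOTAL_B)

# Share of the original parameters removed by pruning.
pruned_share = fraction(ORIGINAL_TOTAL_B - PRUNED_TOTAL_B, ORIGINAL_TOTAL_B)

print(f"Active share of total parameters: {active_share:.1%}")  # 6.8%
print(f"Parameters removed by pruning: {pruned_share:.1%}")     # 32.7%
```

On these numbers, roughly 6.8% of the parameters are active, and pruning removed about a third of the original parameter count.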
Paper
arXiv: 2601.14327