A 718B-parameter sparse MoE with 256 experts per layer and 39B active parameters per token. The flagship model was announced at HDC 2025 and later open-sourced as part of the openPangu initiative.
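With only 39B of 718B parameters active per token, each token is routed to a small subset of the 256 experts. Below is a minimal sketch of a top-k-routed sparse MoE layer in PyTorch; the gating scheme, `top_k` value, and layer sizes are illustrative assumptions, not confirmed details of Pangu Ultra MoE.

```python
# Minimal sparse-MoE layer with top-k routing (PyTorch). Illustrative only:
# the router design, top_k, and sizes are assumptions, not Pangu internals.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 256, top_k: int = 8):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Only top_k of n_experts run per token, so per-token ("active")
        # compute is a small fraction of total parameters.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.reshape(-1, x.shape[-1])           # (n_tokens, d_model)
        weights, idx = self.router(tokens).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)          # renormalize over the chosen experts
        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):                # dispatch each routing slot
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e              # tokens routed to expert e in this slot
                out[mask] += weights[mask, slot, None] * self.experts[int(e)](tokens[mask])
        return out.reshape_as(x)

# Usage: tiny configuration for a smoke test.
layer = SparseMoELayer(d_model=64, d_ff=128, n_experts=8, top_k=2)
y = layer(torch.randn(2, 16, 64))                     # -> shape (2, 16, 64)
```

Scaled to 256 experts per layer, this routing pattern is what lets total parameters (718B) grow far beyond per-token compute (39B).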

Outputs

Pangu Ultra MoE 718B

model
Architecture MoE
Parameters 718B
Active params 39B

Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs

paper

arXiv: 2505.04519

moe · frontier · open-weight

Related