Outputs 2

Wu Dao 2.0

model

Landmark 1.75-trillion-parameter Mixture-of-Experts (MoE) model, at the time the largest in the world, surpassing GPT-3 in scale.

Architecture MoE
Parameters 1.75T
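The MoE architecture reaches such parameter counts by activating only a few "expert" sub-networks per token rather than the whole model. Below is a minimal illustrative sketch of top-k expert routing, the core MoE mechanism; the expert count, dimensions, and `k` are toy values for illustration only, not Wu Dao 2.0's actual configuration.

```python
import numpy as np

def moe_layer(x, gate_w, expert_ws, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x:         (tokens, d) input activations
    gate_w:    (d, n_experts) router weights
    expert_ws: (n_experts, d, d) one weight matrix per expert
    """
    logits = x @ gate_w                          # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]   # top-k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        # softmax over only the selected experts' logits
        sel = logits[t, topk[t]]
        w = np.exp(sel - sel.max())
        w /= w.sum()
        # weighted sum of the chosen experts' outputs
        for weight, e in zip(w, topk[t]):
            out[t] += weight * (x[t] @ expert_ws[e])
    return out

rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 3
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))
expert_ws = rng.normal(size=(n_experts, d, d))
y = moe_layer(x, gate_w, expert_ws)
print(y.shape)
```

Because each token touches only `k` of the experts, total parameters can grow far faster than per-token compute, which is how trillion-parameter scale becomes tractable.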

Wu Dao 2.0: A Roadmap to Cognitive Intelligence

paper

Roadmap paper for the Wu Dao 2.0 cognitive intelligence program.

moefrontier

Related