Outputs 2

Wu Dao 2.0

model

Landmark 1.75-trillion-parameter Mixture-of-Experts (MoE) model, at the time the largest in the world, surpassing GPT-3 in scale.

Architecture MoE
Parameters 1.75T
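The MoE architecture reaches such parameter counts by activating only a few "expert" sub-networks per token rather than the whole model. Below is a minimal illustrative sketch of top-k expert routing, the core MoE mechanism; the expert count, dimensions, and `k` are toy values for illustration only, not Wu Dao 2.0's actual configuration.

```python
import numpy as np

def moe_layer(x, gate_w, expert_ws, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x:         (tokens, d) input activations
    gate_w:    (d, n_experts) router weights
    expert_ws: (n_experts, d, d) one weight matrix per expert
    """
    logits = x @ gate_w                          # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]   # top-k expert indices per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        # softmax over only the selected experts' logits
        sel = logits[t, topk[t]]
        w = np.exp(sel - sel.max())
        w /= w.sum()
        # weighted sum of the chosen experts' outputs
        for weight, e in zip(w, topk[t]):
            out[t] += weight * (x[t] @ expert_ws[e])
    return out

rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 3
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))
expert_ws = rng.normal(size=(n_experts, d, d))
y = moe_layer(x, gate_w, expert_ws)
print(y.shape)
```

Because each token touches only `k` of the experts, total parameters can grow far faster than per-token compute, which is how trillion-parameter scale becomes tractable.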

Wu Dao 2.0: A Roadmap to Cognitive Intelligence

paper

Roadmap paper for the Wu Dao 2.0 cognitive intelligence program.

moefrontier

Related