A 105B-parameter Mixture-of-Experts model with 10.3B active parameters (128 experts, top-8 routing), using MLA-style attention with decoupled QK head dimensions and YaRN scaling for a 128K-token context window. Trained from scratch on NVIDIA H100s under the IndiaAI Mission to cover India's 22 scheduled languages plus English, with asynchronous GRPO used for RL training. AA Intelligence: 18. Apache 2.0 license.
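To make the "128 experts, top-8 routing" figure concrete, here is a minimal sketch of per-token top-k expert selection with a softmax gate. The expert count and top-k value come from the card; the hidden size, the gating scheme, and every name in the snippet are illustrative assumptions, not the model's actual implementation.

```python
import numpy as np

# Illustrative sketch only: 128 experts / top-8 are from the model card,
# everything else (hidden size, gate design) is a hypothetical choice.
NUM_EXPERTS = 128
TOP_K = 8
HIDDEN = 2048  # hypothetical hidden size for this sketch

rng = np.random.default_rng(0)
router_weights = rng.standard_normal((HIDDEN, NUM_EXPERTS)) * 0.02

def route(token_hidden: np.ndarray):
    """Return the top-k expert indices and their normalized gate weights for one token."""
    logits = token_hidden @ router_weights           # (NUM_EXPERTS,) router scores
    top_idx = np.argsort(logits)[-TOP_K:][::-1]      # indices of the 8 highest-scoring experts
    top_logits = logits[top_idx]
    gates = np.exp(top_logits - top_logits.max())    # softmax restricted to the selected experts
    gates /= gates.sum()
    return top_idx, gates

experts, gates = route(rng.standard_normal(HIDDEN))
print(experts, gates.round(3))
```

Only the 8 selected experts run for each token, which is how the model keeps roughly 10.3B of its 105B parameters active per forward pass.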

Model Details

Architecture: MoE
Parameters: 105B total
Active parameters: 10.3B
Context window: 128,000 tokens
Tags: MoE, open-weight, multilingual