A 105B-parameter Mixture-of-Experts model with 10.3B active parameters (128 experts, top-8 routing), using MLA-style attention with decoupled QK head dimensions and YaRN scaling for a 128K-token context window. Trained from scratch on NVIDIA H100s under the IndiaAI Mission to cover India's 22 scheduled languages plus English, with asynchronous GRPO used for RL training. AA Intelligence: 18. Apache 2.0 license.
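To make the "128 experts, top-8 routing" figure concrete, here is a minimal sketch of per-token top-k expert selection with a softmax gate. The expert count and top-k value come from the card; the hidden size, the gating scheme, and every name in the snippet are illustrative assumptions, not the model's actual implementation.

```python
import numpy as np

# Illustrative sketch only: 128 experts / top-8 are from the model card,
# everything else (hidden size, gate design) is a hypothetical choice.
NUM_EXPERTS = 128
TOP_K = 8
HIDDEN = 2048  # hypothetical hidden size for this sketch

rng = np.random.default_rng(0)
router_weights = rng.standard_normal((HIDDEN, NUM_EXPERTS)) * 0.02

def route(token_hidden: np.ndarray):
    """Return the top-k expert indices and their normalized gate weights for one token."""
    logits = token_hidden @ router_weights           # (NUM_EXPERTS,) router scores
    top_idx = np.argsort(logits)[-TOP_K:][::-1]      # indices of the 8 highest-scoring experts
    top_logits = logits[top_idx]
    gates = np.exp(top_logits - top_logits.max())    # softmax restricted to the selected experts
    gates /= gates.sum()
    return top_idx, gates

experts, gates = route(rng.standard_normal(HIDDEN))
print(experts, gates.round(3))
```

Only the 8 selected experts run for each token, which is how the model keeps roughly 10.3B of its 105B parameters active per forward pass.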

Model Details

Architecture: MoE
Parameters: 105B total
Active parameters: 10.3B
Context window: 128,000 tokens
Tags: MoE, open-weight, multilingual