Multilingual frontier models at 8B and 32B. Parallel attention+FFN layers, SwiGLU, RoPE. 128K context, 23 languages. The 32B model beats Llama 3.1 70B (54% Arena-Hard-Auto win rate). Combines multilingual data arbitrage, preference training across languages, and safety alignment. Licensed CC-BY-NC-4.0.

Model Details

Architecture: dense
Parameters: 8B / 32B
Context window: 128K tokens
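The parallel attention+FFN layout named above can be sketched in a few lines: both sub-layers read the same normalized input and their outputs are summed into the residual stream, instead of the usual sequential attention-then-FFN chain. The sketch below is illustrative only; attention is stood in for by a single linear map, the weights are random, and all shapes are toy-sized (the real models use trained multi-head attention and RoPE).

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_ff = 8, 16

# Toy random weights; real models use trained parameters.
W_attn = rng.standard_normal((d, d)) * 0.1
W_gate = rng.standard_normal((d, d_ff)) * 0.1
W_up   = rng.standard_normal((d, d_ff)) * 0.1
W_down = rng.standard_normal((d_ff, d)) * 0.1

def rms_norm(x, eps=1e-6):
    # RMSNorm without a learned scale, for illustration.
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

def attn(h):
    # Stand-in for multi-head self-attention: a single linear map.
    return h @ W_attn

def swiglu_ffn(h):
    # SwiGLU: gated FFN with SiLU activation, silu(h @ W_gate) * (h @ W_up).
    g = h @ W_gate
    return (g / (1 + np.exp(-g)) * (h @ W_up)) @ W_down

def parallel_block(x):
    # Parallel layout: attention and FFN read the SAME normalized input,
    # and both outputs are added to the residual stream in one step.
    h = rms_norm(x)
    return x + attn(h) + swiglu_ffn(h)

x = rng.standard_normal((4, d))  # 4 tokens of width d
y = parallel_block(x)            # same shape as x
```

Compared with the sequential layout `x + ffn(norm(x + attn(norm(x))))`, the parallel form lets the two matrix-multiply paths run concurrently, which is the usual throughput motivation for this design.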

Variants

Name             Parameters
Aya Expanse 8B   8B
Aya Expanse 32B  32B

Paper

arXiv: 2412.04261

open-weight · multilingual
