Aya Expanse
A multilingual frontier model family at 8B and 32B parameters. Parallel attention+FFN layers, SwiGLU activations, RoPE positional embeddings. 128K context, 23 languages. The 32B model beats Llama 3.1 70B (54% Arena-Hard-Auto win rate). Training combines multilingual data arbitrage, preference training across languages, and safety alignment. Released under CC-BY-NC-4.0.
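The parallel layout mentioned above runs the attention and feed-forward sublayers side by side on the same normalized input, summing their outputs, rather than stacking them sequentially. A minimal NumPy sketch of one such block is below; it is a single-head illustration with hypothetical weight names, and it omits RoPE, the causal mask, and the model's actual normalization (a simple L2 normalization stands in), so it is not the released implementation.

```python
import numpy as np

def swiglu(x, W_gate, W_up, W_down):
    # SwiGLU feed-forward: silu(x W_gate) * (x W_up), projected back down.
    gate = x @ W_gate
    silu = gate / (1.0 + np.exp(-gate))  # swish/SiLU activation
    return (silu * (x @ W_up)) @ W_down

def attention(x, W_q, W_k, W_v, W_o):
    # Single-head scaled dot-product self-attention (RoPE and causal
    # masking omitted for brevity).
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights @ v) @ W_o

def parallel_block(x, attn_weights, ffn_weights):
    # Parallel attention+FFN: both sublayers read the SAME normalized
    # input and their outputs are added to the residual stream together,
    # instead of attention feeding into the FFN sequentially.
    h = x / np.linalg.norm(x, axis=-1, keepdims=True)  # stand-in norm
    return x + attention(h, *attn_weights) + swiglu(h, *ffn_weights)
```

The practical appeal of this layout is that the two sublayers' matrix multiplies can be fused or overlapped, improving hardware utilization at large scale.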
Model Details
Architecture DENSE
Parameters 32B
Context window 128,000 tokens
Variants
| Name | Parameters | Notes |
|---|---|---|
| Aya Expanse 8B | 8B | — |
| Aya Expanse 32B | 32B | — |
Paper
arXiv: 2412.04261