Second-generation JAIS. 8B and 70B models trained from scratch (not adapted from an existing base model) on 2.6T tokens of Arabic, English, and code, using Cerebras CS-2 systems on Condor Galaxy clusters. Custom Arabic-centric vocabulary (150K tokens), RoPE positional embeddings, squared-ReLU activations, and muP parameterization. Post-trained via SFT (20M+ instruction pairs) → DPO → GRPO.
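
Squared-ReLU replaces the more common GELU/SwiGLU activation in the transformer feed-forward block. A minimal PyTorch sketch of such a block follows; the dimensions and layer names are illustrative, not the actual JAIS-2 configuration:

```python
import torch
import torch.nn as nn

class SquaredReLUMLP(nn.Module):
    """Feed-forward block using squared-ReLU: relu(x) ** 2.

    Illustrative sketch only; hidden sizes are hypothetical and do not
    reflect the real JAIS-2 architecture.
    """
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)
        self.down = nn.Linear(d_ff, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Squared ReLU: zero for negative pre-activations,
        # quadratic growth for positive ones.
        return self.down(torch.relu(self.up(x)) ** 2)

# Example: a single forward pass with toy dimensions.
mlp = SquaredReLUMLP(d_model=64, d_ff=256)
y = mlp(torch.randn(2, 10, 64))  # (batch, seq, d_model)
```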

AraGen-12-24 overall score: 70.71%, outperforming Qwen2.5-72B and Llama-3.3-70B on Arabic generative tasks. Inference throughput of ~2,000 tokens/sec on Cerebras hardware.

Model Details

Architecture: Dense
Parameters: 70B
Context window: 8,192 tokens

Variants

Name         Parameters   Notes
JAIS-2-8B    8B
JAIS-2-70B   70B
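
Either variant would typically be loaded through a standard causal-LM interface. A minimal sketch, assuming the weights are published on the Hugging Face Hub; the repo id below is hypothetical, so check the official release for the real one:

```python
# Assumes `transformers` and `accelerate` are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "inceptionai/jais-2-8b"  # hypothetical repo id, not confirmed by the source

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # place layers on available GPU(s)/CPU
)

prompt = "اكتب فقرة قصيرة عن الذكاء الاصطناعي."  # "Write a short paragraph about AI."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```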
Tags: open-weight, multilingual, frontier
