Arabic-centric open LLM developed by the JAIS consortium (G42/Inception, MBZUAI, and Cerebras). The base model is a 13B-parameter GPT-3-style decoder-only transformer with ALiBi positional encoding and SwiGLU activations, trained on 395B tokens (116B Arabic + 279B English). It was the strongest open-source Arabic model at launch.
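The two architectural choices named above are easy to illustrate. The sketch below is a minimal, self-contained Python illustration of the standard ALiBi recipe (a per-head linear distance penalty added to attention logits) and the SwiGLU activation, shown on scalars; it is not taken from the JAIS codebase, and the function names are invented for illustration.

```python
import math

def alibi_slopes(n_heads):
    # Standard ALiBi slopes for a power-of-two head count:
    # a geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^(-8).
    return [2 ** (-8 * (i + 1) / n_heads) for i in range(n_heads)]

def alibi_bias(slope, seq_len):
    # Bias added to attention logits: -slope * (query_pos - key_pos)
    # for causal (k <= q) positions; future positions are masked elsewhere.
    return [[-slope * (q - k) if k <= q else 0.0 for k in range(seq_len)]
            for q in range(seq_len)]

def swiglu(x, w, v):
    # SwiGLU(x) = SiLU(x*w) * (x*v), written out for scalar weights w, v.
    silu = (x * w) / (1 + math.exp(-(x * w)))
    return silu * (x * v)
```

With 8 heads the first slope is 2^-1 and the last is 2^-8, so early heads attend more locally than later ones; because the bias depends only on relative distance, ALiBi extrapolates to sequence lengths longer than those seen in training.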

Later scaled to 30B (1.63T tokens) and to 70B (adapted from Llama 2 with 370B Arabic tokens, the largest Arabic training dataset for an LLM at the time). Named after Jebel Jais, the UAE's highest mountain.

Model Details

Architecture: Dense
Parameters: 70B

Variants

| Name | Parameters | Notes |
|---|---|---|
| JAIS-13B | 13B | |
| JAIS-30B | 30B | |
| JAIS-70B | 70B | Adapted from Llama 2 |

Paper

arXiv: 2308.16149

Tags: open-weight, multilingual

Related