12B dense model built in collaboration with NVIDIA. 128K-token context window. MMLU: 68.0%. Outperforms Gemma 2 9B and Llama 3 8B. Apache 2.0 license.

Model Details

Architecture Dense
Parameters 12B
Context window 128,000 tokens
Weights Open-weight