The first competitive LLM to release its full stack: weights, training data (Dolma), training code, training logs, and 500+ intermediate checkpoints. Released as 1B and 7B dense Transformers (the 7B: 32 layers, 4096 hidden size, SwiGLU activations, RoPE). Trained on 2–2.46T Dolma tokens using 256 AMD MI250X GPUs (LUMI supercomputer) plus 27 NVIDIA A100 nodes. Apache 2.0 licensed. ACL 2024.

Established the paradigm of fully reproducible open-source LLM research that defined all subsequent OLMo releases.

Model Details

Architecture DENSE
Parameters 7B
Context window 2,048
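
As a sanity check, the 7B figure can be roughly reproduced from the architecture specs above. This sketch assumes a SwiGLU FFN intermediate size of 11008 and a vocabulary of 50304 with untied embeddings; neither value is stated on this card.

```python
# Rough parameter-count estimate for a dense Transformer matching the
# card's specs: 32 layers, hidden size 4096, SwiGLU FFN.
n_layers, d_model = 32, 4096
d_ff = 11008        # assumed SwiGLU intermediate size (not on this card)
vocab = 50304       # assumed vocabulary size (not on this card)

attn = 4 * d_model * d_model      # Q, K, V, O projection matrices
ffn = 3 * d_model * d_ff          # SwiGLU uses three weight matrices
per_layer = attn + ffn
embeddings = 2 * vocab * d_model  # assumed untied input + output embeddings

total = n_layers * per_layer + embeddings
print(f"{total / 1e9:.2f}B parameters")  # prints "6.89B parameters"
```

Under these assumptions the count lands near the advertised 7B; norm and bias terms are omitted as they contribute well under 1% of the total.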

Paper

arXiv: 2402.00838

Venue: ACL 2024

open-source · open-weight · research
