OLMo
First competitive LLM to release everything: weights, training data (Dolma), training code, training logs, and 500+ intermediate checkpoints. Released as 1B and 7B dense Transformers (the 7B: 32 layers, 4096 hidden size, SwiGLU, RoPE). Trained on 2-2.46T tokens of Dolma using 256 AMD MI250X GPUs (LUMI) plus 27 NVIDIA A100 nodes. Apache 2.0 license. ACL 2024.
Established the paradigm of fully reproducible open-source LLM research that defined all subsequent OLMo releases.
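Because the weights and the intermediate checkpoints are published openly, the model can be loaded through the standard Hugging Face transformers API. The sketch below is illustrative only: the checkpoint id "allenai/OLMo-7B-hf" and the revision naming are assumptions about the released artifacts, not details from the paper, and it requires a transformers version with native OLMo support.

```python
# Minimal sketch: load OLMo-7B and generate text with Hugging Face transformers.
# Assumes the HF-converted checkpoint id "allenai/OLMo-7B-hf"; adjust for your setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B-hf"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Intermediate training checkpoints are exposed as hub revisions (assumed naming), e.g.:
# model = AutoModelForCausalLM.from_pretrained(model_id, revision="step1000-tokens4B")

prompt = "Language modeling is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```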
Model Details
Architecture: Dense
Parameters: 7B
Context window: 2,048 tokens
Paper
arXiv: 2402.00838
Venue: ACL 2024