LLaMA
"LLaMA: Open and Efficient Foundation Language Models," by Touvron et al. Dense Transformers from 7B to 65B parameters, trained exclusively on publicly available data. LLaMA-65B is competitive with Chinchilla-70B and PaLM-540B; LLaMA-13B outperforms GPT-3 (175B) on most benchmarks.
LLaMA showed that smaller models trained on more data (following Chinchilla scaling laws) could match much larger models, catalyzing an explosion of open-source fine-tuning (Alpaca, Vicuna, etc.) and establishing Meta as the leader of the open-weight movement.
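The "more data per parameter" point can be made concrete with the training-token counts reported in the paper (1.0T tokens for 7B/13B, 1.4T for 33B/65B), compared against the Chinchilla rule of thumb of roughly 20 training tokens per parameter for compute-optimal training. A quick sketch (the ~20:1 ratio is an approximation, not an exact constant):

```python
# Tokens-per-parameter ratios for the LLaMA variants, versus the
# Chinchilla rule of thumb of ~20 training tokens per parameter.
# Token counts are those reported in the LLaMA paper.
models = {
    "7B":  (7e9,  1.0e12),
    "13B": (13e9, 1.0e12),
    "33B": (33e9, 1.4e12),
    "65B": (65e9, 1.4e12),
}

ratios = {name: tokens / params for name, (params, tokens) in models.items()}
for name, ratio in ratios.items():
    print(f"{name}: ~{ratio:.0f} tokens/param")
```

Every variant sits well above the ~20:1 compute-optimal point (the 7B model at roughly 140 tokens per parameter), which is the paper's deliberate trade: spend extra training compute to get a smaller, cheaper-to-serve model.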
Model Details
Architecture DENSE
Parameters 65B
Variants
| Name | Parameters | Notes |
|---|---|---|
| LLaMA 7B | 7B | — |
| LLaMA 13B | 13B | — |
| LLaMA 33B | 33B | — |
| LLaMA 65B | 65B | — |
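The variant sizes above can be roughly reproduced from the architecture hyperparameters reported in the paper (model dimension and layer count per variant; 32k vocabulary; SwiGLU feed-forward width of about 2/3 · 4d, rounded up to a multiple of 256). This is a back-of-envelope estimate that ignores the small normalization parameters:

```python
# Approximate parameter counts for the LLaMA variants from their
# reported hyperparameters. Norm weights are omitted (negligible),
# and rotary position embeddings add no parameters.

VOCAB = 32_000

def ffn_dim(dim: int, multiple: int = 256) -> int:
    """SwiGLU hidden width: ~2/3 * 4d, rounded up to a multiple of 256."""
    hidden = int(2 * 4 * dim / 3)
    return multiple * ((hidden + multiple - 1) // multiple)

def param_estimate(dim: int, n_layers: int) -> int:
    attn = 4 * dim * dim          # Wq, Wk, Wv, Wo projections
    ffn = 3 * dim * ffn_dim(dim)  # SwiGLU: gate, up, down projections
    embed = 2 * VOCAB * dim       # input embedding + output head (untied)
    return n_layers * (attn + ffn) + embed

for name, dim, n_layers in [("7B", 4096, 32), ("13B", 5120, 40),
                            ("33B", 6656, 60), ("65B", 8192, 80)]:
    print(f"LLaMA {name}: ~{param_estimate(dim, n_layers) / 1e9:.1f}B params")
```

The estimates land within a few percent of the marketing names (e.g. the "7B" model works out to about 6.7B parameters), matching the exact counts given in the paper.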
Paper
arXiv: 2302.13971