Hybrid-head architecture: Transformer attention heads and Mamba-2 SSM heads run in parallel within each block, with their outputs concatenated, rather than interleaved across layers. The family spans 0.5B to 34B parameters, covers 18 languages, and supports a 256K context. It reaches up to 4x input throughput and 8x output throughput versus same-size Transformers.
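
A minimal, illustrative sketch of the parallel (concatenated) hybrid block, assuming a PyTorch-style implementation. The SSM branch here is a simple diagonal linear recurrence standing in for real Mamba-2 heads; module names, dimensions, and the stand-in recurrence are assumptions, not the official Falcon-H1 code.

```python
# Sketch only: attention heads and SSM heads process the same input in parallel,
# and their outputs are concatenated before a shared output projection.
import torch
import torch.nn as nn

class ParallelHybridBlock(nn.Module):
    def __init__(self, d_model=512, n_attn_heads=4, d_ssm=256):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        # Attention branch (causal masking omitted for brevity)
        self.attn = nn.MultiheadAttention(d_model, n_attn_heads, batch_first=True)
        # SSM branch stand-in: gated diagonal recurrence playing the role of Mamba-2 heads
        self.ssm_in = nn.Linear(d_model, d_ssm)
        self.ssm_gate = nn.Linear(d_model, d_ssm)
        self.decay = nn.Parameter(torch.full((d_ssm,), -1.0))
        # Branch outputs are concatenated, then projected back to the model dimension
        self.out_proj = nn.Linear(d_model + d_ssm, d_model)

    def ssm_branch(self, x):
        # Per-channel linear recurrence h_t = a * h_{t-1} + u_t (stand-in, not Mamba-2)
        u = self.ssm_in(x) * torch.sigmoid(self.ssm_gate(x))
        a = torch.exp(self.decay)  # decay in (0, 1)
        h = torch.zeros(x.size(0), u.size(-1), device=x.device)
        outs = []
        for t in range(x.size(1)):
            h = a * h + u[:, t]
            outs.append(h)
        return torch.stack(outs, dim=1)

    def forward(self, x):
        y = self.norm(x)
        attn_out, _ = self.attn(y, y, y, need_weights=False)
        ssm_out = self.ssm_branch(y)
        # Parallel (concatenated) combination, not interleaved layers
        return x + self.out_proj(torch.cat([attn_out, ssm_out], dim=-1))

x = torch.randn(2, 16, 512)
print(ParallelHybridBlock()(x).shape)  # torch.Size([2, 16, 512])
```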

Falcon-H1-34B matches 70B-class models (Qwen3-32B, Llama-3.3-70B). Falcon-H1-1.5B-Deep rivals 7B-10B models. License: CC BY 4.0.

Model Details

Architecture: Dense
Parameters: 34B
Context window: 256,000 tokens

Variants

Name                 Parameters  Notes
Falcon-H1-0.5B       0.5B
Falcon-H1-1.5B       1.5B
Falcon-H1-1.5B-Deep  1.5B        Deeper variant; rivals 7B-10B models
Falcon-H1-3B         3B
Falcon-H1-7B         7B
Falcon-H1-34B        34B         Flagship; matches 70B-class models
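
A hedged usage sketch for loading one of the variants with Hugging Face transformers. The `tiiuae/Falcon-H1-1.5B-Deep-Instruct` repo ID is an assumed naming, and a transformers version with Falcon-H1 support is required.

```python
# Sketch: load a Falcon-H1 variant and generate text.
# The model ID below is an assumption; swap in the actual Hub repo for any variant.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1-1.5B-Deep-Instruct"  # assumed repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

inputs = tokenizer("The Falcon-H1 hybrid architecture", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```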

Paper

arXiv: 2507.22448

Tags: open-weight, architecture, efficiency, frontier
