Falcon-H1
Hybrid-head architecture: Transformer attention and Mamba-2 SSM heads run in parallel within each block (their outputs are concatenated), rather than interleaved across layers; a rough sketch follows below. The family spans 0.5B to 34B parameters, covers 18 languages, and supports a 256K context. Up to 4x input throughput and 8x output throughput compared with same-size Transformer models.
Falcon-H1-34B performs on par with models up to the 70B class (Qwen3-32B, Llama-3.3-70B), and Falcon-H1-1.5B-Deep rivals 7B-10B models. Released under CC BY 4.0.
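To make the parallel-hybrid idea concrete, below is a minimal, illustrative PyTorch sketch of a block in which attention heads and SSM heads read the same normalized input and their outputs are concatenated before the output projection. This is not the released Falcon-H1 code: the module names, dimensions, and especially the GRU stand-in for the Mamba-2 branch are assumptions for illustration only.

```python
# Illustrative sketch of a parallel hybrid block (attention + SSM heads, concatenated).
# All sizes and the GRU placeholder for the Mamba-2 branch are assumptions, not the
# actual Falcon-H1 implementation.
import torch
import torch.nn as nn


class ParallelHybridBlock(nn.Module):
    """Attention and SSM branches process the same input in parallel; their
    outputs are concatenated and projected back to the model width."""

    def __init__(self, d_model: int, n_attn_heads: int, d_ssm: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        # Standard multi-head self-attention branch.
        self.attn = nn.MultiheadAttention(d_model, n_attn_heads, batch_first=True)
        # Placeholder recurrence standing in for a Mamba-2 selective SSM branch.
        self.ssm = nn.GRU(d_model, d_ssm, batch_first=True)
        # Project concatenated [attention | SSM] features back to d_model.
        self.out_proj = nn.Linear(d_model + d_ssm, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        ssm_out, _ = self.ssm(h)
        mixed = torch.cat([attn_out, ssm_out], dim=-1)  # parallel, not interleaved
        return x + self.out_proj(mixed)  # residual connection


if __name__ == "__main__":
    block = ParallelHybridBlock(d_model=256, n_attn_heads=4, d_ssm=128)
    y = block(torch.randn(2, 16, 256))
    print(y.shape)  # torch.Size([2, 16, 256])
```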
Model Details
Architecture: Dense (hybrid attention + Mamba-2 SSM)
Parameters: 34B (largest variant)
Context window: 256K tokens
Variants
| Name | Parameters | Notes |
|---|---|---|
| Falcon-H1-0.5B | 0.5B | — |
| Falcon-H1-1.5B | 1.5B | — |
| Falcon-H1-1.5B-Deep | 1.5B | Deeper variant; rivals 7B-10B models |
| Falcon-H1-3B | 3B | — |
| Falcon-H1-7B | 7B | — |
| Falcon-H1-34B | 34B | — |
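A hedged usage sketch with Hugging Face transformers follows. The repo id shown, the need for a recent transformers release with Falcon-H1 support, and the generation settings are assumptions to verify against the official model cards.

```python
# Hypothetical usage sketch; the repo id below is an assumption, check the
# official Falcon-H1 model cards for the exact names and requirements.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1-1.5B-Deep-Instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Explain the difference between attention and state-space models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```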
Paper
arXiv: 2507.22448