First production-grade SSM-Transformer-MoE hybrid architecture. Interleaves Mamba state-space layers with Transformer attention layers, combined with mixture-of-experts routing. 256K token context. Fits on a single 80GB GPU despite its scale.

The hybrid architecture achieves up to 2.5x faster long-context inference than comparable dense Transformers. Jamba 1.7 Large (398B total / 94B active parameters) is the current flagship variant. AA Intelligence Index: 11. The Jamba architecture is a genuine contribution to the field, demonstrating that SSMs and attention are complementary rather than competing. Apache 2.0 license (Jamba v0.1). By Lieber, Lenz, Shoham et al.
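The interleaving pattern can be sketched in a few lines. This is an illustrative layout based on the Jamba paper's description (blocks of 8 layers with 1 attention layer and 7 Mamba layers, and an MoE MLP replacing the dense MLP every other layer); the function name and exact offsets are assumptions, not AI21's code.

```python
# Illustrative sketch of a Jamba-style layer plan (not AI21's implementation).
# Assumptions: 1 attention layer per block of 8 (the rest Mamba), and an MoE
# MLP on every second layer, as described in the Jamba paper.

def jamba_layer_plan(n_layers: int, attn_every: int = 8, moe_every: int = 2):
    """Return a (mixer, mlp) tag per layer describing the interleave."""
    plan = []
    for i in range(n_layers):
        # One attention layer per block of `attn_every` layers; the rest are Mamba.
        mixer = "attention" if i % attn_every == attn_every // 2 else "mamba"
        # MoE replaces the dense MLP on every `moe_every`-th layer.
        mlp = "moe" if i % moe_every == 1 else "dense"
        plan.append((mixer, mlp))
    return plan

plan = jamba_layer_plan(32)
```

With 32 layers this yields 4 attention layers, 28 Mamba layers, and 16 MoE MLPs, which is the key to Jamba's small KV cache: only the attention layers need one.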

Model Details

Architecture MoE (hybrid SSM-Transformer)
Parameters 398B total
Active params 94B
Context window 256,000 tokens
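
The single-80GB-GPU claim follows largely from the KV cache: only a fraction of layers are attention layers, so the cache shrinks proportionally. A back-of-envelope calculation, using assumed (not published) head counts and dimensions:

```python
# Back-of-envelope KV-cache sizing (illustrative; the head count and head
# dimension below are assumptions, not AI21's published configuration).

def kv_cache_gib(n_attn_layers, n_kv_heads, head_dim, seq_len, bytes_per=2):
    # K and V each store seq_len vectors of n_kv_heads * head_dim values
    # per attention layer, at bytes_per bytes per value (fp16/bf16 = 2).
    return 2 * n_attn_layers * n_kv_heads * head_dim * seq_len * bytes_per / 2**30

# Dense Transformer: all 32 layers keep a KV cache at 256K context.
dense = kv_cache_gib(n_attn_layers=32, n_kv_heads=8, head_dim=128, seq_len=256_000)
# Jamba-style hybrid: only 1 layer in 8 is attention (4 of 32).
hybrid = kv_cache_gib(n_attn_layers=4, n_kv_heads=8, head_dim=128, seq_len=256_000)
print(f"dense: {dense:.2f} GiB, hybrid: {hybrid:.2f} GiB")
```

Under these assumed dimensions the hybrid's cache is 8x smaller (about 3.9 GiB vs 31.25 GiB), leaving room for the weights of a quantized large model or the full weights of the 52B variants.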

Variants

Name Parameters Notes
Jamba v0.1 52B Original release, Apache 2.0
Jamba 1.5 Mini 52B 12B active
Jamba 1.7 Large 398B 94B active, current flagship

Paper

arXiv: 2403.19887

foundational · open-weight · moe