ZAYA1-8B
Zyphra's reasoning successor to Zamba2: an 8B-total / 700M-active MoE trained end-to-end on a full-stack AMD platform — pretrain, midtrain, and SFT all on AMD Instinct MI300 GPUs with AMD networking and software. To Zyphra's knowledge this is the largest publicly released foundation model trained entirely on AMD silicon, and the companion paper documents the systems-level co-design.
Reported benchmarks (with their Markovian RSA test-time compute method): 91.9% AIME'25, 89.6% HMMT'25. Zyphra claims it matches or exceeds DeepSeek-R1-0528 on math and coding despite having under 1B active parameters. Not currently scored on Artificial Analysis — numbers above are self-reported from the technical report.
- ZAYA1-8B Technical Report (arXiv)
- Training Foundation Models on a Full-Stack AMD Platform (arXiv)
- VentureBeat coverage
- IBM × AMD × Zyphra partnership
Model Details
| Attribute | Value |
|---|---|
| Architecture | MoE |
| Total parameters | 8B |
| Active parameters | 700M |
| Training hardware | AMD Instinct MI300 |
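
If the checkpoint is distributed through Hugging Face, a minimal generation sketch might look like the following. This is a sketch under assumptions: the repository id `Zyphra/ZAYA1-8B` is a placeholder, and the `trust_remote_code=True` requirement is assumed because custom MoE architectures often need it; check Zyphra's release page for the actual details.

```python
# Minimal sketch: loading ZAYA1-8B for inference with Hugging Face Transformers.
# The repo id below is an assumption, not a confirmed checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Zyphra/ZAYA1-8B"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # spread layers across available GPUs
    trust_remote_code=True,
)

prompt = "Prove that the sum of two odd integers is even."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that the self-reported AIME/HMMT scores below were obtained with the report's Markovian RSA test-time compute method, not with plain single-pass generation like the sketch above.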
Benchmark Scores
| Benchmark | Score | Mode |
|---|---|---|
| AIME 2025 | 91.9% | with Markovian RSA |
| HMMT 2025 | 89.6% | with Markovian RSA |