Zyphra
Zyphra is a San Francisco- and London-based open AI lab founded in 2021, building hybrid SSM/Transformer foundation models co-designed with non-NVIDIA accelerators. It reached a $1B Series A valuation in October 2025, with backing from AMD Ventures, Intel Capital, IBM, Bison Ventures, Future Ventures, and others (~$110M raised in total).
Its architectural identity centers on the Zamba family — hybrid Mamba2 + shared-attention models in the 1.2B–7.4B parameter range, designed for low-latency on-device inference with KV-cache footprints up to 6x smaller than comparable pure Transformers. The reasoning successor ZAYA1-8B (May 2026) is an 8B-parameter/700M-active MoE trained end-to-end on a full-stack AMD platform (MI300 GPUs), reportedly matching DeepSeek-R1-0528 on math/coding benchmarks with well under 1B active parameters. It is one of the most public foundation-scale training demonstrations on AMD silicon to date.
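The KV-cache saving above follows directly from the hybrid layout: Mamba2 blocks carry only constant-size recurrent state, so the cache scales with the number of attention applications rather than total depth. A minimal back-of-the-envelope sketch (all layer counts, head counts, and dimensions here are illustrative placeholders, not Zamba's actual configuration):

```python
def kv_cache_bytes(attn_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Bytes of KV cache: K and V tensors for each attention application."""
    return 2 * attn_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical 32-layer pure Transformer: every layer caches K/V.
pure = kv_cache_bytes(attn_layers=32, n_kv_heads=8, head_dim=128, seq_len=4096)

# Hypothetical hybrid: Mamba2 blocks keep constant-size state (no KV cache),
# and a shared attention block is invoked only ~5 times across the stack.
hybrid = kv_cache_bytes(attn_layers=5, n_kv_heads=8, head_dim=128, seq_len=4096)

print(pure / hybrid)  # → 6.4, i.e. roughly the claimed ~6x reduction
```

With these placeholder numbers the cache shrinks by the ratio of total depth to attention invocations (32/5 ≈ 6.4x), which is the mechanism behind the "up to 6x smaller" figure; the exact factor depends on the real block schedule.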
Zyphra also publishes open training infrastructure (Zyda-2, a 5T-token pretraining dataset built with NVIDIA NeMo Curator) and is extending its hybrid-architecture toolkit into scientific domains via ZUNA, an EEG foundation model trained on 208 harmonized datasets.
Note: as of May 2026, Zyphra is not tracked by Artificial Analysis — benchmark numbers on this page are self-reported from the technical reports.
People
- Krithik Puthalath — CEO & Chairman, Co-founder
- Beren Millidge — Chief Scientist, Co-founder (formerly Apollo Research; Head of Research at Conjecture; Oxford postdoc; Edinburgh PhD)
- Tomás Figliolia — Co-founder
- Danny Martinelli — Co-founder
- Quentin Anthony — Core Researcher (training systems) (formerly EleutherAI)