Tri-modal foundation model unifying histology imaging, spatial transcriptomics, and biological language for spatial biology and pathology-related reasoning. 8B parameters, with a histology image encoder, a gene-aware transcriptomic branch (NicheFormer + Gene Q-Former + Gene Projector), and a language-model backbone tying the modalities together.
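The transcriptomic branch can be pictured with a minimal sketch. The card does not describe the Gene Q-Former's internals; this assumes it follows the general Q-Former pattern, in which a fixed set of learned query vectors cross-attends over a variable number of encoder outputs, and a linear projector then maps the result to the language model's embedding width. All names and sizes (`gene_branch`, `D_GENE`, `D_LM`, `NUM_QUERIES`) are hypothetical, and random vectors stand in for NicheFormer outputs.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(queries, keys_values):
    # Single-head cross-attention: each query attends over all gene embeddings.
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(d)])
    return out

def project(tokens, weight):
    # Linear map to the LM width; weight has D_LM rows and D_GENE columns.
    return [[sum(w * x for w, x in zip(row, t)) for row in weight]
            for t in tokens]

def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
D_GENE, D_LM, NUM_QUERIES = 8, 16, 4  # hypothetical sizes, not from the card
queries = rand_matrix(NUM_QUERIES, D_GENE, rng)   # stands in for learned queries
projector = rand_matrix(D_LM, D_GENE, rng)        # stands in for the Gene Projector

def gene_branch(gene_embeddings):
    # Encoder outputs -> fixed number of query tokens -> LM-width tokens.
    return project(cross_attend(queries, gene_embeddings), projector)

few_genes = gene_branch(rand_matrix(5, D_GENE, rng))
many_genes = gene_branch(rand_matrix(50, D_GENE, rng))
```

Under these assumptions the branch emits the same number of tokens whether a spot has 5 or 50 gene embeddings, which is the usual reason for placing a query-based pooling step in front of a language model.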

Targets four tasks in a single model: image-only reasoning, gene-only reasoning, joint image+gene reasoning, and natural-language biomedical interpretation. Released under Apache 2.0; the companion manuscript (Xiao et al., 2026) is marked "in preparation".
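One way a single model could serve all of these input combinations is to assemble one token sequence from whichever modalities are present. The sketch below is illustrative only: the function `build_sequence` and the marker strings `<image>` and `<gene>` are hypothetical, and nothing here reflects the model's actual prompt format or API.

```python
from typing import List, Optional, Tuple

def build_sequence(
    question: str,
    image_tokens: Optional[List[str]] = None,
    gene_tokens: Optional[List[str]] = None,
) -> Tuple[str, List[str]]:
    # Returns the input mode and the assembled sequence.
    # Marker strings are placeholders invented for this sketch.
    parts: List[str] = []
    if image_tokens:
        parts += ["<image>", *image_tokens, "</image>"]
    if gene_tokens:
        parts += ["<gene>", *gene_tokens, "</gene>"]
    parts.append(question)

    if image_tokens and gene_tokens:
        mode = "image+gene"
    elif image_tokens:
        mode = "image-only"
    elif gene_tokens:
        mode = "gene-only"
    else:
        mode = "text-only"
    return mode, parts

joint_mode, joint_seq = build_sequence(
    "Describe this tissue region.",
    image_tokens=["img0", "img1"],
    gene_tokens=["g0"],
)
text_mode, text_seq = build_sequence("What does this marker gene indicate?")
```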

Model Details

Parameters 8B
License Apache 2.0
Tags foundation-model, science, multimodal, open-weight, open-source