Sequence-based science foundation model from Microsoft Research AI for Science that unifies small molecules, materials, proteins, DNA, and RNA for text-driven scientific discovery. Available in 1B, 8B, and 46.7B (8×7B MoE) sizes.

Trained on hundreds of billions of curated tokens from biology, chemistry, and materials science. Enables cross-domain tasks that combine knowledge across modalities. Achieves top performance on many scientific tasks, matching specialist models, with applications in drug discovery, protein design, materials engineering, and RNA design. By Xia, Jin, Xie, and 75+ co-authors at Microsoft Research AI for Science.

Model Details

Architecture MoE (Mixture of Experts)
Parameters 46.7B
Active params 13B
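
The gap between the 46.7B total and the 13B active parameters follows from MoE routing: only a subset of experts runs for each token. A minimal sketch of the arithmetic, assuming a Mixtral-style layout with 8 experts and top-2 routing (the expert count and routing scheme are assumptions, not stated in this card):

```python
# Illustrative back-of-the-envelope MoE parameter accounting.
# ASSUMPTION: 8 experts with top-2 routing (Mixtral-style), consistent
# with the "8x7B" naming but not confirmed by this card.
TOTAL_PARAMS_B = 46.7   # total parameters, from the model card
ACTIVE_PARAMS_B = 13.0  # parameters active per token, from the model card
NUM_EXPERTS = 8         # assumed expert count
TOP_K = 2               # assumed experts routed per token

# Split the totals into shared (attention, embeddings) and per-expert parts:
#   shared + NUM_EXPERTS * expert = TOTAL
#   shared + TOP_K       * expert = ACTIVE
expert_b = (TOTAL_PARAMS_B - ACTIVE_PARAMS_B) / (NUM_EXPERTS - TOP_K)
shared_b = TOTAL_PARAMS_B - NUM_EXPERTS * expert_b

print(f"per-expert params: ~{expert_b:.2f}B")  # ~5.62B
print(f"shared params:     ~{shared_b:.2f}B")  # ~1.77B
```

Under these assumptions, each expert holds roughly 5.6B parameters and about 1.8B parameters (attention and embeddings) are shared across all tokens.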

Variants

Name            Parameters  Notes
NatureLM 1B     1B          —
NatureLM 8B     8B          —
NatureLM 8x7B   46.7B       Mixture of Experts (13B active)

Paper

arXiv: 2502.07527

Tags: scientific, MoE, foundational