A new paradigm for LLM training via data collaboration: data owners independently train MoE expert modules on their private data, then contribute those experts to a shared model without ever sharing the raw data. Contributed data can be activated or deactivated at any time, enabling opt-out based on licensing or permissions.
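
To make the activation/deactivation idea concrete, here is a minimal, hypothetical sketch of an MoE layer whose router can mask out individual experts at inference time. This is not the FlexOlmo implementation; the class name `FlexibleMoELayer`, the `set_active` method, and all hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlexibleMoELayer(nn.Module):
    """Toy MoE layer in which independently trained experts can be
    activated or deactivated at inference time (hypothetical sketch)."""

    def __init__(self, experts: list[nn.Module], d_model: int, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(experts)            # e.g. FFNs trained by different data owners
        self.router = nn.Linear(d_model, len(experts))
        self.top_k = top_k
        # True = expert is available for routing; flipping to False opts its data out
        self.register_buffer("active", torch.ones(len(experts), dtype=torch.bool))

    def set_active(self, expert_idx: int, active: bool) -> None:
        self.active[expert_idx] = active

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.router(x)                                    # (batch, n_experts)
        logits = logits.masked_fill(~self.active, float("-inf"))   # never route to opted-out experts
        k = min(self.top_k, int(self.active.sum()))
        weights, idx = logits.topk(k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(k):                        # combine the top-k active experts per token
            for e in range(len(self.experts)):
                sel = idx[:, slot] == e
                if sel.any():
                    out[sel] += weights[sel, slot].unsqueeze(-1) * self.experts[e](x[sel])
        return out


# Hypothetical usage: four toy experts, one opted out
d = 16
make_expert = lambda: nn.Sequential(nn.Linear(d, 32), nn.GELU(), nn.Linear(32, d))
layer = FlexibleMoELayer([make_expert() for _ in range(4)], d_model=d)
layer.set_active(3, False)           # deactivate the data behind expert 3
y = layer(torch.randn(8, d))         # routing now ignores expert 3
```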

FlexOlmo-7x7B-1T: a 33B-total-parameter MoE combining independently trained experts on public-mix, news, math, code, academic, creative-writing, and Reddit data. Combining experts yields a 41% average relative improvement. Released under Apache 2.0.

Model Details

Architecture: MoE
Parameters: 33B total

Paper

arXiv: 2507.07024

Library

GitHub Repository

Tags: moe, open-source, open-weight, data, research
