Granite Switch 4.1

Architectural experiment from IBM: a single Granite 4.1 base checkpoint embeds 12 independently-trained activated-LoRA adapters (RAG, safety, explainability, etc.) that the model dynamically selects at inference time via control tokens. IBM calls the pattern "coarse-grained expert switching" — avoids both the N-copies-per-task overhead and the joint-training compromises of standard MoE adapter merging.

Three sizes (3B / 8B / 30B Preview) all built on Granite 4.1 base with 128K context. Apache 2.0. Discovered via HF org probe — no announcement blog at release.

HuggingFace (30B Preview)HuggingFace (8B Preview)GitHub

Model Details

Parameters 30B

Context window 131,072

License Apache 2.0

Base model granite-4.1

Variants

Name	Parameters	Notes
granite-switch-4.1-3b-preview	3B	—
granite-switch-4.1-8b-preview	8B	—
granite-switch-4.1-30b-preview	30B	—

foundationalopen-weightenterprise

Model Details

Variants

Related