Architectural experiment from IBM: a single Granite 4.1 base checkpoint embeds 12 independently-trained activated-LoRA adapters (RAG, safety, explainability, etc.) that the model dynamically selects at inference time via control tokens. IBM calls the pattern "coarse-grained expert switching" — avoids both the N-copies-per-task overhead and the joint-training compromises of standard MoE adapter merging.

Three sizes (3B / 8B / 30B Preview) all built on Granite 4.1 base with 128K context. Apache 2.0. Discovered via HF org probe — no announcement blog at release.

Model Details

Parameters 30B
Context window 131,072
License Apache 2.0
Base model granite-4.1

Variants

Name Parameters Notes
granite-switch-4.1-3b-preview 3B
granite-switch-4.1-8b-preview 8B
granite-switch-4.1-30b-preview 30B
foundationalopen-weightenterprise

Related