Expert-parallel load balancer for DeepSeek-V3/R1. It uses a redundant-experts strategy, replicating heavily loaded experts across GPUs so that inference load is balanced. Released as part of DeepSeek Open Source Week.
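The idea of replicating heavily loaded experts can be illustrated with a minimal sketch. This is not the repository's algorithm or API: the function names `replicate_experts` and `pack_replicas`, the greedy heuristics, and the example loads are all assumptions chosen for illustration. It assumes each replica of an expert receives an equal share of that expert's load and that every GPU holds the same number of replicas.

```python
import heapq


def replicate_experts(loads, num_physical):
    """Hypothetical helper: decide how many replicas each expert gets.

    Every expert starts with one replica; each spare slot goes to the
    expert whose per-replica load is currently the highest.
    """
    if num_physical < len(loads):
        raise ValueError("need at least one slot per expert")
    counts = [1] * len(loads)
    # heapq is a min-heap, so per-replica load is stored negated.
    heap = [(-load, i) for i, load in enumerate(loads)]
    heapq.heapify(heap)
    for _ in range(num_physical - len(loads)):
        _, i = heapq.heappop(heap)
        counts[i] += 1
        heapq.heappush(heap, (-loads[i] / counts[i], i))
    return counts


def pack_replicas(loads, counts, num_gpus):
    """Hypothetical helper: place replicas on GPUs.

    Replicas are taken heaviest first and put on the least-loaded GPU
    that still has a free slot. Nothing here prevents two replicas of
    the same expert from landing on one GPU.
    """
    replicas = [
        (loads[i] / counts[i], i)
        for i in range(len(loads))
        for _ in range(counts[i])
    ]
    if len(replicas) % num_gpus != 0:
        raise ValueError("replica count must be divisible by num_gpus")
    slots = len(replicas) // num_gpus
    replicas.sort(reverse=True)
    gpu_load = [0.0] * num_gpus
    placement = [[] for _ in range(num_gpus)]
    for load, expert in replicas:
        free = [g for g in range(num_gpus) if len(placement[g]) < slots]
        g = min(free, key=lambda k: gpu_load[k])
        placement[g].append(expert)
        gpu_load[g] += load
    return placement, gpu_load


# Made-up loads: expert 0 is far hotter than the others.
loads = [8, 2, 1, 1]
counts = replicate_experts(loads, num_physical=6)  # -> [3, 1, 1, 1]
placement, gpu_load = pack_replicas(loads, counts, num_gpus=2)
print(counts, placement, gpu_load)
```

Without replication, whichever GPU hosts expert 0 carries a load of at least 8 out of 12; with two spare slots spent on that expert, the heaviest GPU in this toy example carries less than that.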

Library

GitHub Repository

infrastructure, moe, open-source