Mixture-of-Experts (MoE) vision-language model for advanced multimodal understanding.

Paper

arXiv: 2412.10302

multimodal, open-weight, moe

Related