AI Lab Tracker
HaploVL
model
2025-03-18
Tencent
A single-transformer baseline for multimodal understanding that simplifies vision-language model architecture. Published at ICML 2025.
Paper (arXiv)
GitHub
Paper
arXiv: 2503.14694
Venue: ICML 2025
multimodal
vision
research