AI Lab Tracker
Groma
model
2024-04-19
ByteDance
Multimodal model specialized in localized visual grounding and reasoning.
GitHub
Paper (arXiv)
Project Page
Library
GitHub Repository
multimodal
vision
Notes
Published at ECCV 2024.