AI Lab Tracker
ViT-Lens
paper
2023-08-20
Tencent
ViT-Lens works towards omni-modal representations by extending ViT to additional modalities (3D, audio, etc.) via lightweight lens modules. Published at CVPR 2024.
Paper v1 (arXiv)
Paper v2 (arXiv)
GitHub
Project Page
Paper
arXiv: 2311.16081
Venue: CVPR 2024
multimodal
vision
research