TAPTR
library, paper
The Track Any Point TRansformer (TAPTR) series recasts tracking any point (TAP) as point-level visual prompt detection, built on DETR. TAPTRv2 introduces an attention-based position update that removes the dependency on cost volumes. TAPTRv3 adds visibility-aware long-temporal attention for robust tracking in long videos.
Outputs (4)
TAPTR
library
Official implementation of TAPTR, TAPTRv2, and TAPTRv3 for tracking any point in videos using Transformer-based detection.
TAPTR: Tracking Any Point with Transformers as Detection
paper
Recasts tracking any point as point-level visual prompt detection, built on the DETR architecture.
arXiv: 2403.13042
Venue: ECCV 2024
TAPTRv2: Attention-based Position Update Improves Tracking Any Point
paper
Introduces an attention-based position update that removes cost-volume computation while achieving state-of-the-art performance.
arXiv: 2407.16291
Venue: NeurIPS 2024
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video
paper
Proposes visibility-aware long-temporal attention and context-aware cross attention for robust point tracking in long videos.
arXiv: 2411.18671
Venue: ICLR 2026