Is the inside of a vision model at all like a language model? Researchers argue that as the models grow more powerful, they ...
Abstract: Single object tracking aims to localize target object with specific reference modalities (bounding box, natural language or both) in a sequence of specific video modalities (RGB, RGB+Depth, ...
Abstract: 3D single object tracking plays a crucial role in numerous applications such as autonomous driving. Recent trackers based on motion-centric paradigm perform well as they exploit motion cues ...