May 2nd, 2024: Vision Mamba (Vim) was accepted at ICML 2024. 🎉 The conference page can be found here. Feb. 10th, 2024: We updated the Vim-tiny/small weights and training scripts. By placing the class token at ...
Abstract: Real-time dense mapping with high-fidelity textures in large-scale environments is a challenge in robotics, digital twins, and AR/VR applications. Neural Radiance Field (NeRF) has ...
Abstract: Visual object navigation is one of the key tasks in mobile robotics. One of the most important components of this task is the accurate semantic representation of the scene, which is needed ...
To address the degradation of visual-language (VL) representations during VLA supervised fine-tuning (SFT), we introduce Visual Representation Alignment. During SFT, we pull a VLA’s visual tokens ...