no code implementations • 21 Mar 2024 • Yong He, Hongshan Yu, Muhammad Ibrahim, Xiaoyan Liu, Tongjia Chen, Anwaar Ulhaq, Ajmal Mian
This strategy allows transformer blocks operating on points at the same resolution to share the same position information, thereby reducing network parameters and training time without compromising accuracy. Experimental comparisons with existing methods on multiple datasets demonstrate the efficacy of SMTransformer and skip-attention-based up-sampling for point cloud processing tasks, including semantic segmentation and classification.
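The sharing strategy described above can be sketched as follows. This is a hedged, minimal illustration of the idea (compute positional information once per resolution level and reuse it across all transformer blocks at that level), not the SMTransformer implementation; all names (`positional_encoding`, `Block`) are illustrative.

```python
import numpy as np

def positional_encoding(xyz, dim=8):
    """Toy sinusoidal encoding of 3D coordinates (N, 3) -> (N, 3*dim)."""
    freqs = 2.0 ** np.arange(dim // 2)                # (dim/2,)
    angles = xyz[:, :, None] * freqs[None, None, :]   # (N, 3, dim/2)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(xyz.shape[0], -1)              # (N, 3*dim)

class Block:
    """Stand-in for a transformer block consuming features plus positions."""
    def __init__(self, rng, feat_dim, pos_dim):
        self.w = rng.standard_normal((feat_dim + pos_dim, feat_dim)) * 0.1

    def __call__(self, feats, pos):
        return np.tanh(np.concatenate([feats, pos], axis=-1) @ self.w)

rng = np.random.default_rng(0)
points = rng.standard_normal((128, 3))   # one resolution level
feats = rng.standard_normal((128, 16))

pos = positional_encoding(points)        # computed ONCE for this resolution
blocks = [Block(rng, 16, pos.shape[1]) for _ in range(4)]
for blk in blocks:                       # every block reuses the same `pos`
    feats = blk(feats, pos)
```

Because `pos` is computed and stored once per resolution rather than once per block, the per-block positional parameters and their training cost disappear, which is the parameter/time saving the abstract claims.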
1 code implementation • 30 Nov 2023 • Tongjia Chen, Hongshan Yu, Zhengeng Yang, Zechuan Li, Wei Sun, Chen Chen
Due to the resource-intensive nature of training vision-language models on expansive video data, a majority of studies have centered on adapting pre-trained image-language models to the video domain.
Ranked #2 on Zero-Shot Action Recognition on Kinetics
no code implementations • 6 Aug 2023 • Wei Miao, Hong Zhao, Tongjia Chen, Wei Huang, Changyan Xiao
Recent stereo matching networks achieve strong performance by introducing an epipolar line constraint to limit the matching range between the two views.
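The epipolar constraint mentioned above can be illustrated with a toy example (not the paper's network): for a rectified stereo pair, a pixel (y, x) in the left image can only match pixels (y, x - d) on the same scanline of the right image, so the search collapses from 2-D to a 1-D disparity range. The function name and data are made up for illustration.

```python
import numpy as np

def matching_costs(left, right, max_disp):
    """Absolute-difference cost volume (H, W, max_disp), comparing only
    along the same scanline, shifted by each candidate disparity d."""
    H, W = left.shape
    cost = np.full((H, W, max_disp), np.inf)  # out-of-range candidates stay inf
    for d in range(max_disp):
        cost[:, d:, d] = np.abs(left[:, d:] - right[:, :W - d])
    return cost

# Toy pair: the right image is the left image shifted by a disparity of 2.
left = np.tile(np.arange(8, dtype=float), (4, 1))
right = np.roll(left, -2, axis=1)
cost = matching_costs(left, right, max_disp=4)
disp = cost.argmin(axis=2)   # winner-take-all disparity per pixel
```

Restricting the cost volume to `max_disp` candidates per scanline, instead of searching the full image, is what makes the constraint a practical efficiency device for matching networks.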
1 code implementation • CVPR 2023 • Zechuan Li, Hongshan Yu, Zhengeng Yang, Tongjia Chen, Naveed Akhtar
In this work, we propose AShapeFormer, a semantics-guided object-level shape encoding module for 3D object detection.