no code implementations • 19 Dec 2023 • Yong Xien Chng, Henry Zheng, Yizeng Han, Xuchong Qiu, Gao Huang
To tackle this challenge, we introduce a novel Mask Grounding auxiliary task that significantly improves visual grounding within language features, by explicitly teaching the model to learn fine-grained correspondence between masked textual tokens and their matching visual objects.
Ranked #2 on Referring Expression Segmentation on RefCOCO testB
no code implementations • 18 Jan 2023 • Rui Huang, Xuran Pan, Henry Zheng, Haojun Jiang, Zhifeng Xie, Shiji Song, Gao Huang
During the pre-training stage, we establish the correspondence of images and point clouds based on the readily available RGB-D data and use contrastive learning to align the image and point cloud representations.
1 code implementation • 31 Jul 2022 • Jan Ondras, Di Ni, Xi Deng, Zeqi Gu, Henry Zheng, Tapomayukh Bhattacharjee
The ability to deform soft objects remains a great challenge for robots due to difficulties in defining the problem mathematically.