no code implementations • 7 Dec 2023 • Xuying Zhang, Bo-Wen Yin, Yuming Chen, Zheng Lin, Yunheng Li, Qibin Hou, Ming-Ming Cheng
Particularly, a cross-modal graph is constructed to align the object points accurately and noun phrases decoupled from the 3D mesh and textual description.
no code implementations • 8 Oct 2023 • Zhong-Yu Li, Bo-Wen Yin, Yongxiang Liu, Li Liu, Ming-Ming Cheng
Thus, we propose Heterogeneous Self-Supervised Learning (HSSL), which enforces a base model to learn from an auxiliary head whose architecture is heterogeneous from the base model.