no code implementations • 20 Jan 2024 • Xianbing Zhao, Soujanya Poria, Xuejiao Li, Yixin Chen, Buzhou Tang
Recently, CLIP-based multimodal foundational models have demonstrated impressive performance on numerous multimodal tasks by learning the aligned cross-modal semantics of image and text pairs, but the multimodal foundational models are also unable to directly address scenarios involving modality absence.
no code implementations • 19 Oct 2021 • He Li, Shiyu Zhang, Xuejiao Li, Liangcai Su, Hongjie Huang, Duo Jin, Linghao Chen, Jianbing Huang, Jaesoo Yoo
Detectors with high coverage have direct and far-reaching benefits for road users in route planning and avoiding traffic congestion, but utilizing these data presents unique challenges including: the dynamic temporal correlation, and the dynamic spatial correlation caused by changes in road conditions.