1 code implementation • 5 Mar 2024 • JianJian Cao, Peng Ye, Shengze Li, Chong Yu, Yansong Tang, Jiwen Lu, Tao Chen
To this end, we propose a novel framework named Multimodal Alignment-Guided Dynamic Token Pruning (MADTP) for accelerating various VLTs.
1 code implementation • 23 Jan 2024 • Shengze Li, JianJian Cao, Peng Ye, Yuhan Ding, Chongjun Tu, Tao Chen
Recently, foundational models such as CLIP and SAM have shown promising performance for the task of Zero-Shot Anomaly Segmentation (ZSAS).
no code implementations • 22 Jan 2024 • JianJian Cao, Beiya Dai, Yulin Li, Xiameng Qin, Jingdong Wang
Holi integrates features of the two modalities through a cross-modal attention mechanism, which suppresses irrelevant redundancy under the guidance of positioning information from RoCo.
1 code implementation • NeurIPS 2023 • Zhenfei Yin, Jiong Wang, JianJian Cao, Zhelun Shi, Dingning Liu, Mukai Li, Lu Sheng, Lei Bai, Xiaoshui Huang, Zhiyong Wang, Jing Shao, Wanli Ouyang
To the best of our knowledge, we present one of the very first open-source endeavors in the field, LAMM, encompassing a Language-Assisted Multi-Modal instruction tuning dataset, framework, and benchmark.
no code implementations • 23 Feb 2023 • Lin Zhan, Jiayuan Fan, Peng Ye, JianJian Cao
To address the above issues, we propose a multi-stage search architecture in order to overcome asymmetric spectral-spatial dimensions and capture significant features.
Hyperspectral Image Classification • Neural Architecture Search
no code implementations • 20 Feb 2023 • Jiamu Sheng, Jiayuan Fan, Peng Ye, JianJian Cao
Despite substantial progress in no-reference image quality assessment (NR-IQA), previous models often overfit because of the limited scale of the datasets used, resulting in performance bottlenecks.
1 code implementation • 14 Dec 2021 • JianJian Cao, Xiameng Qin, Sanyuan Zhao, Jianbing Shen
In this paper, we focus on these two problems and propose a Graph Matching Attention (GMA) network.