1 code implementation • 16 May 2024 • Jiancheng Pan, Muyuan Ma, Qing Ma, Cong Bai, ShengYong Chen
We propose PIR-CLIP, a domain-specific CLIP-based framework for remote sensing image-text retrieval, which addresses semantic noise in remote sensing vision-language representations and further improves open-domain retrieval performance.
1 code implementation • ACMMM 2023 • Jiancheng Pan, Qing Ma, Cong Bai
Our main contribution is a paradigm that draws on prior knowledge to guide adaptive learning of vision and text representations.
Ranked #5 on Cross-Modal Retrieval on RSICD
no code implementations • 12 Oct 2023 • Qing Ma, Jiancheng Pan, Cong Bai
Our main contribution is to guide visual and textual representations in latent space, steering them as close as possible to a redundancy-free regional visual representation.
Ranked #6 on Cross-Modal Retrieval on RSITMD
2 code implementations • ICMR 2023 • Jiancheng Pan, Qing Ma, Cong Bai
Furthermore, because the diversity and differentiation of remote sensing scenes weaken scene understanding, we propose a new metric, scene recall, which measures scene perception by evaluating scene-level retrieval performance and can also verify the effectiveness of our approach in reducing semantic confusion.
Ranked #7 on Cross-Modal Retrieval on RSICD