1 code implementation • 3 Feb 2024 • Wenjia Xu, Jiuniu Wang, Zhiwei Wei, Mugen Peng, Yirong Wu
Besides, pioneering ZSL models use convolutional neural networks pre-trained on ImageNet, which focus on the main objects appearing in each image, neglecting the background context that also matters in RS scene classification.
no code implementations • 12 Oct 2023 • Haoling Li, Jiuniu Wang, Zhiwei Wei, Wenjia Xu
Our GLVL network is a two-stage visual localization approach, combining a large-scale retrieval module that finds regions similar to the UAV flight scene with a fine-grained matching module that localizes the precise UAV coordinate, enabling real-time and precise localization.
no code implementations • 17 Aug 2023 • Song Zhang, Wenjia Xu, Zhiwei Wei, Lili Zhang, Yang Wang, Junyi Liu
Moreover, our method also achieves the lowest $e_{1}$ and $e_{3}$ on the BlendedMVS dataset and the highest Acc and $F_{1}$-score on the ETH 3D dataset, surpassing all listed methods. Project website: https://github.com/zs670980918/ARAI-MVSNet
no code implementations • 19 Apr 2023 • Zhiwei Wei, Yi Xiao, Wenjia Xu, Mi Shu, Lu Cheng, Yang Wang, Chunbo Liu
To improve efficiency and effectiveness, we integrate multi-scale data using a knowledge graph, focusing on the recognition of C-shaped building patterns.
no code implementations • 8 Aug 2022 • Youyuan Zhang, Jiuniu Wang, Hao Wu, Wenjia Xu
Image captioning models are usually trained on human-annotated ground-truth captions, which can yield accurate but generic captions.
1 code implementation • 29 Jul 2022 • Zaiquan Yang, Yang Liu, Wenjia Xu, Chong Huang, Lei Zhou, Chao Tong
Specifically, we combine seen classes to hallucinate new classes, which serve as placeholders for the unseen classes in the visual and semantic space.
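The idea of combining seen classes into placeholder classes can be illustrated with a minimal sketch. The snippet only says that seen classes are combined, so the mixing rule here (a convex combination of two class prototypes, with the same weight in the visual and the semantic space) is an assumption, and `hallucinate_class` and its arguments are hypothetical names, not the paper's code.

```python
import random

def hallucinate_class(visual_protos, semantic_protos, rng):
    """Mix two seen classes into one placeholder class (illustrative assumption)."""
    # Pick two distinct seen classes; sorting makes the choice reproducible.
    a, b = rng.sample(sorted(visual_protos), 2)
    lam = rng.uniform(0.0, 1.0)  # assumed mixing weight, shared by both spaces

    def mix(u, v):
        # Convex combination of two equal-length vectors.
        return [lam * x + (1.0 - lam) * y for x, y in zip(u, v)]

    return (a, b, lam,
            mix(visual_protos[a], visual_protos[b]),
            mix(semantic_protos[a], semantic_protos[b]))
```

Using one weight for both spaces keeps the hallucinated visual prototype and its semantic embedding aligned, which is the property a placeholder class would need in this sketch.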
no code implementations • 18 Jul 2022 • Wenjia Xu, Jiuniu Wang, Yirong Wu
In this paper, we propose a Multi-dimension Feature Learning Model (MDFL) using high-dimensional GBD data in conjunction with RS images for urban region function recognition.
no code implementations • 8 Apr 2022 • Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
First, we propose a distinctiveness metric, between-set CIDEr (CIDErBtw), to evaluate the distinctiveness of a caption with respect to those of similar images.
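The notion of scoring a caption against the captions of similar images can be sketched as follows. This is not CIDEr or CIDErBtw: it swaps in a plain n-gram count cosine similarity (no TF-IDF weighting, n up to 2) purely for illustration, and `between_set_similarity` is a hypothetical name. A lower score would indicate a more distinctive caption under this simplified stand-in.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    # Count all contiguous n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_cosine(a, b, n):
    # Cosine similarity between the n-gram count vectors of two token lists.
    ca, cb = ngrams(a, n), ngrams(b, n)
    dot = sum(ca[g] * cb[g] for g in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def between_set_similarity(caption, similar_captions, max_n=2):
    """Average similarity of `caption` to captions of similar images."""
    cap = caption.lower().split()
    scores = []
    for other in similar_captions:
        oth = other.lower().split()
        per_n = [ngram_cosine(cap, oth, n) for n in range(1, max_n + 1)]
        scores.append(sum(per_n) / max_n)
    return sum(scores) / len(scores) if scores else 0.0
```

For example, a generic caption that nearly repeats the captions of similar images scores higher than one that mentions details unique to its own image.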
no code implementations • 4 Apr 2022 • Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
While a visual-semantic embedding layer learns global features, local features are learned through an attribute prototype network that simultaneously regresses and decorrelates attributes from intermediate features.
Ranked #5 on GZSL Video Classification on ActivityNet-GZSL(main)
1 code implementation • CVPR 2022 • Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
Our model visually divides a set of images from seen classes into clusters of local image regions according to their visual similarity, and further imposes their class discrimination and semantic relatedness.
1 code implementation • 2 Nov 2021 • Yao Rong, Wenjia Xu, Zeynep Akata, Enkelejda Kasneci
The way humans attend to, process and classify a given image has the potential to vastly benefit the performance of deep learning models.
Ranked #41 on Fine-Grained Image Classification on CUB-200-2011
no code implementations • 20 Aug 2021 • Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
In particular, we propose a group-based memory attention (GMA) module, which stores object features that are unique among the image group (i.e., with low similarity to objects in other images).
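The selection criterion described above, keeping object features with low similarity to objects in other images, can be sketched in a few lines. The use of cosine similarity, the maximum over other objects, and the `threshold` value are assumptions made for illustration; `unique_objects` is a hypothetical helper, not the GMA module itself.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def unique_objects(target_feats, other_feats, threshold=0.5):
    """Return indices of target objects that resemble no object in other images."""
    kept = []
    for i, feat in enumerate(target_feats):
        # Highest similarity to any object from the other images in the group.
        max_sim = max((cosine(feat, g) for g in other_feats), default=0.0)
        if max_sim < threshold:
            kept.append(i)
    return kept
```

With `target_feats = [[1, 0], [0, 1]]` and `other_feats = [[1, 0.1]]`, only the second object is kept, since the first closely matches an object seen in another image.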
no code implementations • 1 Oct 2020 • Wenjia Xu, Guangluan Xu, Yang Wang, Xian Sun, Daoyu Lin, Yirong Wu
Single image super-resolution is an effective way to enhance the spatial resolution of remote sensing images, which is crucial for many applications such as target detection and image classification.
no code implementations • 29 Sep 2020 • Wenjia Xu, Jiuniu Wang, Yang Wang, Guangluan Xu, Wei Dai, Yirong Wu
We generate attribute-based textual explanations for the network and ground the attributes on the image to show visual explanations.
1 code implementation • 2 Sep 2020 • Jiuniu Wang, Wenjia Xu, Xingyu Fu, Guangluan Xu, Yirong Wu
Under such circumstances, how to make full use of the information extracted by word embedding requires more in-depth research.
1 code implementation • 2 Sep 2020 • Jiuniu Wang, Wenjia Xu, Xingyu Fu, Yang Wei, Li Jin, Ziyan Chen, Guangluan Xu, Yirong Wu
This model enhances the question answering system in the multi-document scenario from three aspects: model structure, optimization goal, and training method, corresponding to Multilayer Attention (MA), Cross Evidence (CE), and Adversarial Training (AT) respectively.
no code implementations • NeurIPS 2020 • Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
As an additional benefit, our model points to the visual evidence of the attributes in an image, e.g., for the CUB dataset, confirming the improved attribute localization ability of our image representation.
no code implementations • ECCV 2020 • Jiuniu Wang, Wenjia Xu, Qingzhong Wang, Antoni B. Chan
A wide range of image captioning models has been developed, achieving significant improvement based on popular metrics, such as BLEU, CIDEr, and SPICE.