no code implementations • 27 Oct 2023 • Zijie Song, Zhenzhen Hu, Richang Hong
Unsupervised representation learning for image clustering is essential in computer vision.
no code implementations • 19 Jul 2023 • Zijie Song, Zhenzhen Hu, Yuanen Zhou, Ye Zhao, Richang Hong, Meng Wang
The crucial issue in this task is to model the global and the local matching between the image and different languages.
1 code implementation • 6 Jan 2022 • Yuanen Zhou, Zhenzhen Hu, Daqing Liu, Huixia Ben, Meng Wang
In this paper, we introduce a Compact Bidirectional Transformer model for image captioning that can leverage bidirectional context both implicitly and explicitly while the decoder can be executed in parallel.
1 code implementation • 17 Jun 2021 • Yuanen Zhou, Yong Zhang, Zhenzhen Hu, Meng Wang
To tackle this issue, non-autoregressive image captioning models have recently been proposed to significantly speed up inference by generating all words in parallel.
1 code implementation • CVPR 2020 • Yuanen Zhou, Meng Wang, Daqing Liu, Zhenzhen Hu, Hanwang Zhang
Collecting word-region alignments as strong supervision, in order to improve grounding accuracy while retaining captioning quality, is expensive.
no code implementations • 15 Mar 2019 • Lei Chen, Le Wu, Zhenzhen Hu, Meng Wang
To tackle the above two challenges, in this paper, we propose a unified quality-aware GAN-based framework for unpaired image-to-image translation, where a quality-aware loss is explicitly incorporated by comparing each source image and the reconstructed image at the domain level.