Search Results for author: Wenmin Wang

Found 12 papers, 6 papers with code

Shadow Removal of Text Document Images Using Background Estimation and Adaptive Text Enhancement

1 code implementation • ICASSP 2023 • Wenjie Liu, Bingshu Wang, Jiangbin Zheng, Wenmin Wang

Thirdly, we propose an adaptive text contrast enhancement strategy to generate shadow-free results with comfortable visual perception across shadow and non-shadow regions.

Binarization, Document Shadow Removal, +1 more

Diverse Similarity Encoder for Deep GAN Inversion

2 code implementations • 23 Aug 2021 • Cheng Yu, Wenmin Wang

Current deep generative adversarial networks (GANs) can synthesize high-quality (HQ) images, so learning representations with GANs is favorable.

Image Reconstruction

Brain-Inspired Inference on Missing Video Sequence

no code implementations • 15 Dec 2019 • Weimian Li, Baoyang Chen, Wenmin Wang

By integrating different latent variables with learned transformation features, the model could learn a wider variety of possible motion modes.

Adaptively Aligned Image Captioning via Adaptive Attention Time

1 code implementation • NeurIPS 2019 • Lun Huang, Wenmin Wang, Yaxian Xia, Jie Chen

In this paper, we propose a novel attention model, namely Adaptive Attention Time (AAT), to align the source and the target adaptively for image captioning.

Image Captioning

Attention on Attention for Image Captioning

5 code implementations • ICCV 2019 • Lun Huang, Wenmin Wang, Jie Chen, Xiao-Yong Wei

In this paper, we propose an Attention on Attention (AoA) module, which extends the conventional attention mechanisms to determine the relevance between attention results and queries.

Image Captioning

ParNet: Position-aware Aggregated Relation Network for Image-Text matching

no code implementations • 17 Jun 2019 • Yaxian Xia, Lun Huang, Xiao-Yong Wei, Wenmin Wang

In the first step, which we call the intra-modal relation mechanism, we compute responses between different objects in an image or different words in a sentence separately; in the second step, which we call the inter-modal relation mechanism, the query plays the role of textual context to refine the relationships among object proposals in an image.

Image-text matching, Position, +5 more

Long-Term Video Interpolation with Bidirectional Predictive Network

no code implementations • 13 Jun 2017 • Xiongtao Chen, Wenmin Wang, Jinzhuo Wang, Weimian Li, Baoyang Chen

In this paper, we present a novel deep architecture called bidirectional predictive network (BiPN) that predicts intermediate frames from two opposite directions.