Search Results for author: Mingkun Yang

Found 16 papers, 8 papers with code

Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition

1 code implementation21 Feb 2024 Mingkun Yang, Biao Yang, Minghui Liao, Yingying Zhu, Xiang Bai

By enhancing the alignment between the canonical mask feature and the text feature, the module ensures more effective fusion, ultimately leading to improved recognition performance.

Scene Text Recognition

Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution

1 code implementation12 May 2023 Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren, Xiang Bai

We evaluate the existing end-to-end methods for VIE on the proposed dataset and observe that the performance of these methods has a distinguishable drop from SROIE (a widely used English dataset) to our proposed dataset due to the larger variance of layout and entities.

Contrastive Learning Optical Character Recognition (OCR)

Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition

1 code implementation1 Jul 2022 Mingkun Yang, Minghui Liao, Pu Lu, Jing Wang, Shenggao Zhu, Hualin Luo, Qi Tian, Xiang Bai

Inspired by the observation that humans learn to recognize the texts through both reading and writing, we propose to learn discrimination and generation by integrating contrastive learning and masked image modeling in our self-supervised method.

Contrastive Learning Scene Text Recognition

Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection

no code implementations CVPR 2022 Jingqun Tang, Wenqing Zhang, Hongye Liu, Mingkun Yang, Bo Jiang, Guanglong Hu, Xiang Bai

Different from previous approaches that learn robust deep representations of scene text in a holistic manner, our method performs scene text detection based on a few representative features, which avoids the disturbance by background and reduces the computational cost.

Ranked #21 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection Object Detection In Aerial Images +2

Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship

no code implementations CVPR 2021 Jing Wang, Jinhui Tang, Mingkun Yang, Xiang Bai, Jiebo Luo

Under the guidance of the geometrical relationship between OCR tokens, our LSTM-R capitalizes on a newly-devised relation-aware pointer network to select OCR tokens from the scene text for OCR-based image captioning.

Image Captioning Optical Character Recognition (OCR) +1

DeepAVO: Efficient Pose Refining with Feature Distilling for Deep Visual Odometry

no code implementations20 May 2021 Ran Zhu, Mingkun Yang, Wang Liu, Rujun Song, Bo Yan, Zhuoling Xiao

The technology for Visual Odometry (VO) that estimates the position and orientation of the moving object through analyzing the image sequences captured by on-board cameras, has been well investigated with the rising interest in autonomous driving.

Autonomous Driving feature selection +4

Scene Text Retrieval via Joint Text Detection and Similarity Learning

1 code implementation CVPR 2021 Hao Wang, Xiang Bai, Mingkun Yang, Shenggao Zhu, Jing Wang, Wenyu Liu

Such a task is usually realized by matching a query text to the recognized words, outputted by an end-to-end scene text spotter.

Retrieval Scene Text Detection +3

AutoSTR: Efficient Backbone Search for Scene Text Recognition

1 code implementation ECCV 2020 Hui Zhang, Quanming Yao, Mingkun Yang, Yongchao Xu, Xiang Bai

In this work, inspired by the success of neural architecture search (NAS), which can identify better architectures than human-designed ones, we propose automated STR (AutoSTR) to search data-dependent backbones to boost text recognition performance.

Deblurring Neural Architecture Search +1

All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting

no code implementations21 Nov 2019 Hao Wang, Pu Lu, HUI ZHANG, Mingkun Yang, Xiang Bai, Yongchao Xu, Mengchao He, Yongpan Wang, Wenyu Liu

Recently, end-to-end text spotting that aims to detect and recognize text from cluttered images simultaneously has received particularly growing interest in computer vision.

Instance Segmentation Scene Text Detection +3

Symmetry-constrained Rectification Network for Scene Text Recognition

no code implementations ICCV 2019 MingKun Yang, Yushuo Guan, Minghui Liao, Xin He, Kaigui Bian, Song Bai, Cong Yao, Xiang Bai

Reading text in the wild is a very challenging task due to the diversity of text instances and the complexity of natural scenes.

Scene Text Recognition

Deep-Person: Learning Discriminative Deep Features for Person Re-Identification

1 code implementation29 Nov 2017 Xiang Bai, Mingkun Yang, Tengteng Huang, Zhiyong Dou, Rui Yu, Yongchao Xu

Recently, many methods of person re-identification (Re-ID) rely on part-based feature representation to learn a discriminative pedestrian descriptor.

Person Re-Identification Re-Ranking

Integrating Scene Text and Visual Appearance for Fine-Grained Image Classification

no code implementations15 Apr 2017 Xiang Bai, Mingkun Yang, Pengyuan Lyu, Yongchao Xu, Jiebo Luo

Then, we combine the word embedding of the recognized words and the deep visual features into a single representation, which is optimized by a convolutional neural network for fine-grained image classification.

Classification Fine-Grained Image Classification +2

Cannot find the paper you are looking for? You can Submit a new open access paper.