Search Results for author: Mingkun Yang

Found 16 papers, 8 papers with code

Sequential Visual and Semantic Consistency for Semi-supervised Text Recognition

no code implementations • 24 Feb 2024 • Mingkun Yang, Biao Yang, Minghui Liao, Yingying Zhu, Xiang Bai

Scene text recognition (STR) is a challenging task that requires large-scale annotated data for training.

Scene Text Recognition Semantic Similarity +1

Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition

1 code implementation • 21 Feb 2024 • Mingkun Yang, Biao Yang, Minghui Liao, Yingying Zhu, Xiang Bai

By enhancing the alignment between the canonical mask feature and the text feature, the module ensures more effective fusion, ultimately leading to improved recognition performance.

Scene Text Recognition

Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution

1 code implementation • 12 May 2023 • Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren, Xiang Bai

We evaluate the existing end-to-end methods for VIE on the proposed dataset and observe that the performance of these methods has a distinguishable drop from SROIE (a widely used English dataset) to our proposed dataset due to the larger variance of layout and entities.

Contrastive Learning Optical Character Recognition (OCR)

Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition

1 code implementation • 1 Jul 2022 • Mingkun Yang, Minghui Liao, Pu Lu, Jing Wang, Shenggao Zhu, Hualin Luo, Qi Tian, Xiang Bai

Inspired by the observation that humans learn to recognize text through both reading and writing, we propose to learn discrimination and generation by integrating contrastive learning and masked image modeling in our self-supervised method.

Contrastive Learning Scene Text Recognition

Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection

no code implementations • CVPR 2022 • Jingqun Tang, Wenqing Zhang, Hongye Liu, Mingkun Yang, Bo Jiang, Guanglong Hu, Xiang Bai

Unlike previous approaches that learn robust deep representations of scene text in a holistic manner, our method performs scene text detection based on a few representative features, which avoids disturbance from the background and reduces the computational cost.

Ranked #21 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection Object Detection In Aerial Images +2

Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship

no code implementations • CVPR 2021 • Jing Wang, Jinhui Tang, Mingkun Yang, Xiang Bai, Jiebo Luo

Under the guidance of the geometrical relationship between OCR tokens, our LSTM-R capitalizes on a newly-devised relation-aware pointer network to select OCR tokens from the scene text for OCR-based image captioning.

Image Captioning Optical Character Recognition (OCR) +1

DeepAVO: Efficient Pose Refining with Feature Distilling for Deep Visual Odometry

no code implementations • 20 May 2021 • Ran Zhu, Mingkun Yang, Wang Liu, Rujun Song, Bo Yan, Zhuoling Xiao

Visual Odometry (VO), which estimates the position and orientation of a moving object by analyzing image sequences captured by on-board cameras, has been well investigated with the rising interest in autonomous driving.

Autonomous Driving feature selection +4

Scene Text Retrieval via Joint Text Detection and Similarity Learning

1 code implementation • CVPR 2021 • Hao Wang, Xiang Bai, Mingkun Yang, Shenggao Zhu, Jing Wang, Wenyu Liu

Such a task is usually realized by matching a query text to the recognized words output by an end-to-end scene text spotter.

Retrieval Scene Text Detection +3

AutoSTR: Efficient Backbone Search for Scene Text Recognition

1 code implementation • ECCV 2020 • Hui Zhang, Quanming Yao, Mingkun Yang, Yongchao Xu, Xiang Bai

In this work, inspired by the success of neural architecture search (NAS), which can identify better architectures than human-designed ones, we propose automated STR (AutoSTR) to search data-dependent backbones to boost text recognition performance.

Deblurring Neural Architecture Search +1

ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard

no code implementations • 20 Dec 2019 • Xi Liu, Rui Zhang, Yongsheng Zhou, Qianyi Jiang, Qi Song, Nan Li, Kai Zhou, Lei Wang, Dong Wang, Minghui Liao, Mingkun Yang, Xiang Bai, Baoguang Shi, Dimosthenis Karatzas, Shijian Lu, C. V. Jawahar

Twenty-one teams submitted results for Task 1, 23 teams for Task 2, 24 teams for Task 3, and 13 teams for Task 4.

Line Detection Task 2

All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting

no code implementations • 21 Nov 2019 • Hao Wang, Pu Lu, Hui Zhang, Mingkun Yang, Xiang Bai, Yongchao Xu, Mengchao He, Yongpan Wang, Wenyu Liu

Recently, end-to-end text spotting, which aims to detect and recognize text in cluttered images simultaneously, has received growing interest in computer vision.

Instance Segmentation Scene Text Detection +3

Symmetry-constrained Rectification Network for Scene Text Recognition

no code implementations • ICCV 2019 • Mingkun Yang, Yushuo Guan, Minghui Liao, Xin He, Kaigui Bian, Song Bai, Cong Yao, Xiang Bai

Reading text in the wild is a very challenging task due to the diversity of text instances and the complexity of natural scenes.

Scene Text Recognition

ASTER: An Attentional Scene Text Recognizer with Flexible Rectification

3 code implementations • TPAMI 2018 • Baoguang Shi, Mingkun Yang, Xinggang Wang, Pengyuan Lyu, Cong Yao, Xiang Bai

Scene text recognition has attracted great interest from academia and industry in recent years owing to its importance in a wide range of applications.

Ranked #21 on Scene Text Recognition on ICDAR2015

Optical Character Recognition Optical Character Recognition (OCR) +1

Deep-Person: Learning Discriminative Deep Features for Person Re-Identification

1 code implementation • 29 Nov 2017 • Xiang Bai, Mingkun Yang, Tengteng Huang, Zhiyong Dou, Rui Yu, Yongchao Xu

Recently, many methods of person re-identification (Re-ID) rely on part-based feature representation to learn a discriminative pedestrian descriptor.

Person Re-Identification Re-Ranking

ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17)

5 code implementations • 31 Aug 2017 • Baoguang Shi, Cong Yao, Minghui Liao, Mingkun Yang, Pei Xu, Linyan Cui, Serge Belongie, Shijian Lu, Xiang Bai

This report introduces RCTW, a new competition that focuses on Chinese text reading.

Integrating Scene Text and Visual Appearance for Fine-Grained Image Classification

no code implementations • 15 Apr 2017 • Xiang Bai, Mingkun Yang, Pengyuan Lyu, Yongchao Xu, Jiebo Luo

Then, we combine the word embedding of the recognized words and the deep visual features into a single representation, which is optimized by a convolutional neural network for fine-grained image classification.

Classification Fine-Grained Image Classification +2
