Search Results for author: Chengquan Zhang

Found 22 papers, 8 papers with code

GridFormer: Towards Accurate Table Structure Recognition via Grid Prediction

no code implementations • 26 Sep 2023 • Pengyuan Lyu, Weihong Ma, Hongyi Wang, Yuechen Yu, Chengquan Zhang, Kun Yao, Yang Xue, Jingdong Wang

In this representation, the vertexes and edges of the grid store the localization and adjacency information of the table.


Towards Robust Real-Time Scene Text Detection: From Semantic to Instance Representation Learning

no code implementations • 14 Aug 2023 • Xugong Qin, Pengyuan Lyu, Chengquan Zhang, Yu Zhou, Kun Yao, Peng Zhang, Hailun Lin, Weiping Wang

Different from existing methods which integrate multiple-granularity features or multiple outputs, we resort to the perspective of representation learning in which auxiliary tasks are utilized to enable the encoder to jointly learn robust features with the main task of per-pixel classification during optimization.

Representation Learning Scene Text Detection +1


MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary

no code implementations • 24 Jul 2023 • Beiya Dai, Xing Li, Qunyi Xie, Yulin Li, Xiameng Qin, Chengquan Zhang, Kun Yao, Junyu Han

To produce a comprehensive evaluation of MataDoc, we propose a novel benchmark ArbDoc, mainly consisting of document images with arbitrary boundaries in four typical scenarios.

document understanding Optical Character Recognition (OCR)


ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

no code implementations • 5 Jun 2023 • Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, MingYu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai

It is hoped that this competition will attract many researchers in the field of CV and NLP, and bring some new thoughts to the field of Document AI.

Document AI Entity Linking +1


Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding

no code implementations • 19 May 2023 • Mingliang Zhai, Yulin Li, Xiameng Qin, Chen Yi, Qunyi Xie, Chengquan Zhang, Kun Yao, Yuwei Wu, Yunde Jia

Transformers achieve promising performance in document understanding because of their high effectiveness, but they still suffer from computational complexity that is quadratic in the sequence length.

document understanding


StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training

1 code implementation • 1 Mar 2023 • Yuechen Yu, Yulin Li, Chengquan Zhang, Xiaoqiang Zhang, Zengyuan Guo, Xiameng Qin, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang

Compared to the masked multi-modal modeling methods for document image understanding that rely on both the image and text modalities, StrucTexTv2 models image-only input and potentially deals with more application scenarios free from OCR pre-processing.

Ranked #1 on Table Recognition on WTW

Document Image Classification Language Modelling +3


TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers

no code implementations • 31 Aug 2022 • Zengyuan Guo, Yuechen Yu, Pengyuan Lv, Chengquan Zhang, Haojie Li, Zhihui Wang, Kun Yao, Jingtuo Liu, Jingdong Wang

The Vertex-based Merging Module aggregates local contextual information between adjacent basic grids, which lets it accurately merge basic grids that belong to the same spanning cell.

Ranked #5 on Table Recognition on PubTabNet

Table Recognition


Single Shot Self-Reliant Scene Text Spotter by Decoupled yet Collaborative Detection and Recognition

1 code implementation • 15 Jul 2022 • Jingjing Wu, Pengyuan Lyu, Guangming Lu, Chengquan Zhang, Wenjie Pei

Typical text spotters follow the two-stage spotting paradigm which detects the boundary for a text instance first and then performs text recognition within the detected regions.

Ranked #5 on Text Spotting on ICDAR 2015

Text Detection Text Spotting


MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining

no code implementations • 1 Jun 2022 • Pengyuan Lyu, Chengquan Zhang, Shanshan Liu, Meina Qiao, Yangliu Xu, Liang Wu, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang

Specifically, we transform text data into synthesized text images to unify the data modalities of vision and language, and enhance the language modeling capability of the sequence decoder using a proposed masked image-language modeling scheme.

Ranked #2 on Optical Character Recognition (OCR) on Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

Language Modelling Optical Character Recognition (OCR) +1


StrucTexT: Structured Text Understanding with Multi-Modal Transformers

1 code implementation • 6 Aug 2021 • Yulin Li, Yuxi Qian, Yuchen Yu, Xiameng Qin, Chengquan Zhang, Yan Liu, Kun Yao, Junyu Han, Jingtuo Liu, Errui Ding

Due to the complexity of content and layout in VRDs, structured text understanding has been a challenging task.

Entity Linking Language Modelling +1


PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

2 code implementations • 12 Apr 2021 • Pengfei Wang, Chengquan Zhang, Fei Qi, Shanshan Liu, Xiaoqiang Zhang, Pengyuan Lyu, Junyu Han, Jingtuo Liu, Errui Ding, Guangming Shi

With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS and RoI operations involved, which guarantees high efficiency.

Ranked #1 on Scene Text Detection on ICDAR 2015 (Accuracy metric)

Optical Character Recognition (OCR) Scene Text Detection +1


Learning Global Structure Consistency for Robust Object Tracking

no code implementations • 26 Aug 2020 • Bi Li, Chengquan Zhang, Zhibin Hong, Xu Tang, Jingtuo Liu, Junyu Han, Errui Ding, Wenyu Liu

Unlike many existing trackers that focus on modeling only the target, in this work, we consider the transient variations of the whole scene.

Object Visual Object Tracking


Towards Accurate Scene Text Recognition with Semantic Reasoning Networks

2 code implementations • CVPR 2020 • Deli Yu, Xuan Li, Chengquan Zhang, Junyu Han, Jingtuo Liu, Errui Ding

A scene text image contains two levels of content: visual texture and semantic information.

Ranked #4 on Optical Character Recognition (OCR) on Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

Optical Character Recognition (OCR) Scene Text Recognition


An End-to-end Video Text Detector with Online Tracking

no code implementations • 20 Aug 2019 • Hongyuan Yu, Chengquan Zhang, Xuan Li, Junyu Han, Errui Ding, Liang Wang

Most existing methods attempt to enhance the performance of video text detection by cooperating with video text tracking, but treat these two tasks separately.

Text Detection


A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning

1 code implementation • 15 Aug 2019 • Pengfei Wang, Chengquan Zhang, Fei Qi, Zuming Huang, Mengyi En, Junyu Han, Jingtuo Liu, Errui Ding, Guangming Shi

Detecting scene text of arbitrary shapes has been a challenging task over the past years.

Ranked #18 on Scene Text Detection on ICDAR 2015

Multi-Task Learning Optical Character Recognition (OCR) +2


Editing Text in the Wild

2 code implementations • 8 Aug 2019 • Liang Wu, Chengquan Zhang, Jiaming Liu, Junyu Han, Jingtuo Liu, Errui Ding, Xiang Bai

Specifically, we propose an end-to-end trainable style retention network (SRNet) that consists of three modules: text conversion module, background inpainting module and fusion module.

Ranked #1 on Image Inpainting on StreetView

Image Inpainting Image-to-Image Translation +1


Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes

no code implementations • CVPR 2019 • Chengquan Zhang, Borong Liang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding, Xinghao Ding

Previous scene text detection methods have progressed substantially over the past years.

Scene Text Detection Text Detection


Detecting Text in the Wild with Deep Character Embedding Network

no code implementations • 2 Jan 2019 • Jiaming Liu, Chengquan Zhang, Yipeng Sun, Junyu Han, Errui Ding

However, text in the wild is usually perspectively distorted or curved, which cannot be easily tackled by existing approaches.

Clustering Text Detection


TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network

no code implementations • 24 Dec 2018 • Yipeng Sun, Chengquan Zhang, Zuming Huang, Jiaming Liu, Junyu Han, Errui Ding

Reading text from images remains challenging due to multi-orientation, perspective distortion and especially the curved nature of irregular text.

Optical Character Recognition (OCR) Text Detection


WordSup: Exploiting Word Annotations for Character based Text Detection

no code implementations • ICCV 2017 • Han Hu, Chengquan Zhang, Yuxuan Luo, Yuzhuo Wang, Junyu Han, Errui Ding

When applied in scene text detection, we are thus able to train a robust character detector by exploiting word annotations in the rich large-scale real scene text datasets, e.g., ICDAR15 and COCO-text.

Ranked #4 on Scene Text Detection on ICDAR 2013