Optical Character Recognition (OCR)
311 papers with code • 5 benchmarks • 42 datasets
Optical Character Recognition or Optical Character Reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo, or license plates on cars), or from subtitle text superimposed on an image (for example, from a television broadcast).
Libraries
Use these libraries to find Optical Character Recognition (OCR) models and implementations.
Most implemented papers
Chinese Text in the Wild
[python3.6] Uses TensorFlow for natural-scene text detection; CTPN+CRNN+CTC implemented in Keras/PyTorch for variable-length scene-text OCR recognition.
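The CTC step mentioned above lets a recognizer emit per-frame label predictions without character-level alignment; decoding then collapses repeats and removes blanks. A minimal sketch of greedy CTC decoding (function name and label indices are illustrative, not from the repo):

```python
BLANK = 0  # CTC blank index, by convention

def ctc_greedy_decode(frame_labels):
    """Collapse consecutive repeated labels, then drop blanks."""
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return decoded

# Per-frame argmax output with repeats and blanks:
print(ctc_greedy_decode([3, 3, 0, 1, 1, 0, 0, 20]))  # → [3, 1, 20]
```

This is why CTC handles variable-length text: the network can emit as many frames as the image is wide, and decoding recovers the shorter label sequence.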
Exploring Cross-Image Pixel Contrast for Semantic Segmentation
Inspired by the recent advance in unsupervised contrastive representation learning, we propose a pixel-wise contrastive framework for semantic segmentation in the fully supervised setting.
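The core idea of a pixel-wise contrastive term can be sketched as an InfoNCE-style loss: pull a pixel embedding toward a same-class pixel and push it away from other-class pixels. This is a minimal sketch of that idea only; the paper's actual framework adds memory banks and hard-example sampling, and the function here is hypothetical:

```python
import numpy as np

def pixel_contrastive_loss(anchor, positive, negatives, tau=0.1):
    """InfoNCE over one anchor pixel embedding, one positive, many negatives."""
    def sim(a, b):
        # cosine similarity between two embedding vectors
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.exp(sim(anchor, positive) / tau)
    negs = sum(np.exp(sim(anchor, n) / tau) for n in negatives)
    return -np.log(pos / (pos + negs))
```

The loss is small when the anchor matches its positive and is dissimilar from the negatives, and large in the opposite arrangement.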
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
The goal of COCO-Text is to advance the state of the art in text detection and recognition in natural images.
End-to-End Interpretation of the French Street Name Signs Dataset
We introduce the French Street Name Signs (FSNS) Dataset consisting of more than a million images of street name signs cropped from Google Street View images of France.
DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement
Documents often exhibit various forms of degradation, which makes them hard to read and substantially deteriorates the performance of OCR systems.
OCR-free Document Understanding Transformer
Current Visual Document Understanding (VDU) methods outsource the task of reading text to off-the-shelf Optical Character Recognition (OCR) engines and focus on the understanding task with the OCR outputs.
Attention-based Extraction of Structured Information from Street View Imagery
We present a neural network model - based on CNNs, RNNs and a novel attention mechanism - which achieves 84.2% accuracy on the challenging French Street Name Signs (FSNS) dataset, significantly outperforming the previous state of the art (Smith'16), which achieved 72.46%.
STN-OCR: A single Neural Network for Text Detection and Text Recognition
In contrast to most existing works, which consist of multiple deep neural networks and several pre-processing steps, we propose to use a single deep neural network that learns to detect and recognize text in natural images in a semi-supervised way.
E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text
An end-to-end trainable (fully differentiable) method for multi-language scene text localization and recognition is proposed.
NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition
Considering that scene images have large variation in text and background, we further design a modality-transform block that effectively transforms 2D input images into 1D sequences, combined with the encoder to extract more discriminative features.
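The 2D-to-1D transformation above can be illustrated by its simplest ingredient: flattening a CNN feature map into a sequence of column vectors for a sequence-to-sequence decoder. NRTR's actual modality-transform block uses learned convolutional layers; this sketch (with an assumed helper name) shows only the reshape idea:

```python
import numpy as np

def feature_map_to_sequence(fmap):
    """(C, H, W) feature map -> sequence of W column vectors of size C*H."""
    c, h, w = fmap.shape
    # move the width axis first, then fuse channels and height per column
    return fmap.transpose(2, 0, 1).reshape(w, c * h)

fmap = np.zeros((64, 4, 25), dtype=np.float32)  # e.g. a CNN output
seq = feature_map_to_sequence(fmap)
print(seq.shape)  # (25, 256)
```

Each of the 25 sequence elements corresponds to one horizontal position in the image, which matches the left-to-right reading order the encoder consumes.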