Scene Text Recognition
121 papers with code • 15 benchmarks • 27 datasets
See Scene Text Detection for leaderboards in this task.
Libraries
Use these libraries to find Scene Text Recognition models and implementations
Most implemented papers
STN-OCR: A single Neural Network for Text Detection and Text Recognition
In contrast to most existing works, which consist of multiple deep neural networks and several pre-processing steps, we propose to use a single deep neural network that learns to detect and recognize text from natural images in a semi-supervised way.
TextBoxes++: A Single-Shot Oriented Scene Text Detector
In this paper, we present an end-to-end trainable fast scene text detector, named TextBoxes++, which detects arbitrary-oriented scene text with both high accuracy and efficiency in a single network forward pass.
NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition
Considering that scene images vary widely in text and background, we further design a modality-transform block that effectively transforms 2D input images into 1D sequences and, combined with the encoder, extracts more discriminative features.
ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
Scene text recognition has attracted great interest from academia and industry in recent years owing to its importance in a wide range of applications.
Visual Re-ranking with Natural Language Understanding for Text Spotting
We propose a post-processing approach to improve scene text recognition accuracy by using the occurrence probabilities of words (a unigram language model) and the semantic correlation between scene and text.
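To illustrate the unigram part of this idea, here is a minimal sketch of re-ranking recognizer candidates with word occurrence probabilities. It is not the paper's implementation: the function name `rerank`, the weight `lm_weight`, the `floor` probability for unseen words, and the log-linear combination are all assumptions chosen for illustration, and the scene-text semantic correlation term is left out.

```python
import math


def rerank(candidates, unigram_probs, lm_weight=0.5, floor=1e-9):
    """Re-rank (word, recognizer_prob) pairs using a unigram prior.

    All names and the scoring rule are hypothetical; the score is a
    weighted sum of log recognizer probability and log unigram probability.
    """
    scored = []
    for word, rec_prob in candidates:
        # Words missing from the unigram table get a small floor probability.
        prior = unigram_probs.get(word.lower(), floor)
        # Clamp the recognizer probability so log() never sees zero.
        rec_prob = max(rec_prob, floor)
        score = (1 - lm_weight) * math.log(rec_prob) + lm_weight * math.log(prior)
        scored.append((score, word))
    # Highest combined score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored]


# Made-up candidates: the recognizer slightly prefers a non-word.
candidates = [("qarking", 0.40), ("parking", 0.35), ("barking", 0.25)]
unigram_probs = {"parking": 1e-4, "barking": 1e-5}
ranked = rerank(candidates, unigram_probs)
print(ranked)
```

With these made-up numbers, the word prior moves the two dictionary words ahead of the recognizer's top-scoring non-word. In practice the weight would need tuning on held-out data.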
UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World
Synthetic data has been a critical tool for training scene text detection and recognition models.
SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition
Arbitrary text appearance poses a great challenge in scene text recognition tasks.
Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
Additionally, based on the ensemble of iterative predictions, we propose a self-training method which can learn from unlabeled images effectively.
Primitive Representation Learning for Scene Text Recognition
In this paper, we propose a primitive representation learning method that aims to exploit intrinsic representations of scene text images.
Vision Transformer for Fast and Efficient Scene Text Recognition
Compared with a strong baseline such as TRBA, which reaches an accuracy of 84.3%, our small ViTSTR achieves a competitive accuracy of 82.6% (84.2% with data augmentation) at a 2.4x speed-up, using only 43.4% of the parameters and 42.2% of the FLOPs.