ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
Scene text recognition has attracted great interest from academia and industry in recent years owing to its importance in a wide range of applications. Despite the maturity of Optical Character Recognition (OCR) systems dedicated to document text, scene text recognition remains a challenging problem. The large variations in background, appearance, and layout pose significant challenges that traditional OCR methods cannot handle effectively. Recent advances in scene text recognition are driven by the success of deep learning-based recognition models. Among them are methods that recognize text character by character using convolutional neural networks (CNNs), methods that classify words with CNNs [24], [26], and methods that recognize character sequences using a combination of a CNN and a recurrent neural network (RNN) [54]. In spite of their success, these methods do not explicitly address the problem of irregular text, that is, text that is not horizontal and frontal, has a curved layout, and so on. Instances of irregular text frequently appear in natural scenes. As exemplified in Figure 1, typical cases include oriented text, perspective text [49], and curved text. Designed without invariance to such irregularities, previous methods often struggle to recognize such text instances.