TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting

Most existing text spotting methods either focus on horizontal or oriented text, or require character-level annotations to spot text of arbitrary shapes. In this paper, we propose a novel text spotting framework that detects and recognizes text of arbitrary shapes in an end-to-end manner, using only word/line-level annotations for training. Inspired by the name of TextSnake, which is a detection-only model, we call the proposed text spotting framework TextDragon. In TextDragon, a text detector describes the shape of text with a series of quadrangles, which can handle text of arbitrary shapes. To extract arbitrary text regions from feature maps, we propose a new differentiable operator named RoISlide, which is the key to connecting arbitrary shaped text detection and recognition. Based on the features extracted through RoISlide, a CNN- and CTC-based text recognizer is introduced, which removes the need to label the locations of characters. The proposed method achieves state-of-the-art performance on two curved text benchmarks, CTW1500 and Total-Text, and competitive results on the ICDAR 2015 dataset.
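
The abstract notes that a CTC-based recognizer avoids character-location labels. As a rough illustration of why, the sketch below shows generic best-path (greedy) CTC decoding: per-step label predictions are collapsed by merging repeats and dropping blanks, so no alignment between steps and characters is ever annotated. This is not the TextDragon recognizer itself; the function name, the blank index, and the toy alphabet are assumptions made for the example.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is reserved for the CTC blank symbol


def ctc_greedy_decode(step_labels, alphabet, blank=BLANK):
    """Best-path CTC decoding: merge consecutive repeats, then drop blanks.

    step_labels: the most likely label index at each step of the sequence.
    alphabet:    maps a label index to its character (hypothetical here).
    """
    # 1) Merge runs of identical labels into a single label.
    collapsed = [label for label, _ in groupby(step_labels)]
    # 2) Remove blanks, which only separate genuine repeated characters.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Toy example: a hypothetical 3-letter alphabet plus the blank at index 0.
alphabet = ["-", "a", "c", "t"]
steps = [2, 2, 0, 1, 1, 0, 0, 3, 3]
print(ctc_greedy_decode(steps, alphabet))  # prints "cat"
```

A blank between two identical labels keeps them distinct, so `[1, 0, 1]` decodes to "aa" while `[1, 1]` decodes to "a".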


Results from the Paper

Task           Dataset        Model       Metric Name                       Metric Value  Global Rank
Text Spotting  ICDAR 2015     TextDragon  F-measure (%) - Strong Lexicon    82.5          # 14
Text Spotting  ICDAR 2015     TextDragon  F-measure (%) - Weak Lexicon      78.3          # 11
Text Spotting  ICDAR 2015     TextDragon  F-measure (%) - Generic Lexicon   65.2          # 15
Text Spotting  SCUT-CTW1500   TextDragon  F-measure (%) - No Lexicon        39.7          # 11
Text Spotting  SCUT-CTW1500   TextDragon  F-measure (%) - Full Lexicon      72.4          # 10
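
All metrics above are F-measures, the harmonic mean of precision and recall. The sketch below shows that arithmetic only; it is not the benchmarks' official evaluation code, and the function name and the counts in the usage line are hypothetical values chosen for illustration, not results from the paper.

```python
def f_measure(true_positives, num_predictions, num_ground_truth):
    """Harmonic mean of precision and recall, returned as a fraction in [0, 1].

    true_positives:   predicted words that match a ground-truth word.
    num_predictions:  total number of predicted words.
    num_ground_truth: total number of ground-truth words.
    """
    precision = true_positives / num_predictions if num_predictions else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 correct out of 10 predictions, 16 ground-truth words.
# Precision is 0.8 and recall is 0.5 in this made-up case.
score = f_measure(8, 10, 16)
print(f"{100 * score:.1f}")
```

Multiplying the returned fraction by 100 gives the percentage form used in the table.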
