Scene Text Detection

91 papers with code • 9 benchmarks • 15 datasets

Scene Text Detection is a computer vision task that involves automatically identifying and localizing text within natural images or videos. The goal is to develop algorithms that can robustly detect text and label it with bounding boxes in uncontrolled and complex environments, for example on street signs, billboards, or license plates.

Source: ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection
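As a rough illustration of what "labeling text with bounding boxes" can look like in practice, the sketch below represents one detected text instance as a polygon with a confidence score and derives an axis-aligned box from it. The names `TextDetection`, `bounding_box`, and `filter_detections`, as well as the 0.5 score threshold, are assumptions made for this example; they are not taken from any of the papers or libraries listed on this page.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class TextDetection:
    """Hypothetical container for one detected text instance."""
    polygon: List[Point]  # outline vertices in image pixel coordinates
    score: float          # detector confidence, assumed to lie in [0, 1]

    def bounding_box(self) -> Tuple[float, float, float, float]:
        """Return the axis-aligned box (x_min, y_min, x_max, y_max) enclosing the polygon."""
        xs = [x for x, _ in self.polygon]
        ys = [y for _, y in self.polygon]
        return min(xs), min(ys), max(xs), max(ys)


def filter_detections(detections: List[TextDetection],
                      min_score: float = 0.5) -> List[TextDetection]:
    """Keep detections whose confidence reaches the (assumed) threshold."""
    return [d for d in detections if d.score >= min_score]


# Usage: a slightly rotated word outline and a low-confidence false positive.
word = TextDetection([(10, 20), (110, 25), (108, 60), (8, 55)], score=0.9)
noise = TextDetection([(0, 0), (1, 0), (1, 1)], score=0.2)
kept = filter_detections([word, noise])
box = word.bounding_box()
```

A polygon is used as the primary representation because several benchmarks listed below target arbitrarily shaped text, where an axis-aligned box alone would be a loose fit.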

Benchmarks

These leaderboards track progress in Scene Text Detection.

Dataset	Best Model
ICDAR 2015	TextFuseNet (ResNeXt-101)
Total-Text	MixNet
MSRA-TD500	MixNet
SCUT-CTW1500	MixNet
ICDAR 2013	TextFuseNet (ResNeXt-101)
ICDAR 2017 MLT	PMTD*
COCO-Text	Corner-based Region Proposals
IC19-Art	MixNet
IC19-ReCTs	BDN

Libraries

Use these libraries to find Scene Text Detection models and implementations.

PaddlePaddle/PaddleOCR • 9 papers • 38,632 stars
mindspore-lab/mindocr • 7 papers • 160 stars
open-mmlab/mmocr • 6 papers • 4,086 stars
JaidedAI/EasyOCR • 3 papers • 22,022 stars

See all 8 libraries.

Most implemented papers

UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World

Jyouhou/UnrealText • CVPR 2020

Synthetic data has been a critical tool for training scene text detection and recognition models.

SEE: Towards Semi-Supervised End-to-End Scene Text Recognition

Bartzi/see • 14 Dec 2017

Detecting and recognizing text in natural scene images is a challenging, yet not completely solved task.

Scene Text Detection with Supervised Pyramid Context Network

AirBernard/Scene-Text-Detection-with-SPCNET • 21 Nov 2018

We propose a supervised pyramid context network (SPCNET) to precisely locate text regions while suppressing false positives.

ShopSign: a Diverse Scene Text Dataset of Chinese Shop Signs in Street Views

chongshengzhang/shopsign • 25 Mar 2019

Hence, we collect and annotate the ShopSign dataset to advance research in Chinese scene text detection and recognition.

PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

PaddlePaddle/PaddleOCR • 12 Apr 2021

With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS and RoI operations involved, which guarantees high efficiency.

FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation

czczup/FAST • 3 Nov 2021

We propose an accurate and efficient scene text detection framework, termed FAST (i.e., faster arbitrarily-shaped text detector).

SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition

mxin262/swintextspotter • CVPR 2022

End-to-end scene text spotting has attracted great attention in recent years due to the success of excavating the intrinsic synergy of the scene text detection and recognition.

Towards End-to-End Unified Scene Text Detection and Layout Analysis

tensorflow/models • CVPR 2022

In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.

Vision-Language Pre-Training for Boosting Scene Text Detectors

AlibabaResearch/AdvancedLiterateMachinery • CVPR 2022

In this paper, we specifically adapt vision-language joint learning for scene text detection, a task that intrinsically involves cross-modal interaction between the two modalities: vision and language, since text is the written form of language.

SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression

retsuh-bqw/SRFormer-Text-Det • 21 Aug 2023

In light of this, we constrain the incorporation of segmentation branches to the first few decoder layers and employ progressive regression refinement in subsequent layers, achieving performance gains while minimizing computational load from the mask. Furthermore, we propose a Mask-informed Query Enhancement module.
