UrduDoc Dataset | Papers With Code

The **UrduDoc Dataset** is a benchmark dataset for Urdu text line detection in scanned documents. It was created as a byproduct of the **UTRSet-Real** dataset generation process. It comprises 478 images collected from books, documents, manuscripts, and newspapers, split into 358 pages for training and 120 pages for validation, and covers a wide range of styles, scales, and lighting conditions. The dataset serves as a benchmark for evaluating printed Urdu text detection models, and benchmark results for state-of-the-art models are provided. Among these, the Contour-Net model achieves the best h-mean (the harmonic mean of precision and recall).
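For readers unfamiliar with the metric, the following is a minimal sketch of how h-mean is typically computed from detection precision and recall. The function name `h_mean` and the input values are illustrative assumptions; they are not taken from the UrduDoc benchmark or its evaluation code.

```python
def h_mean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (also known as F1 or F-measure)."""
    # Guard against division by zero when nothing was detected or matched.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores for illustration only; not results reported for UrduDoc.
example_score = h_mean(precision=0.5, recall=1.0)
print(f"h-mean: {example_score:.4f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector scores well on h-mean only when both precision and recall are high.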

The UrduDoc dataset is the first of its kind for printed Urdu text line detection and is intended to advance research in the field. It will be made publicly available for non-commercial, academic, and research purposes upon request and execution of a no-cost license agreement. To request the dataset, or for more information about the [UrduDoc](https://paperswithcode.com/dataset/urdudoc), [UTRSet-Real](https://paperswithcode.com/dataset/utrset-real), and [UTRSet-Synth](https://paperswithcode.com/dataset/utrset-synth) datasets, please refer to the [Project Website](https://abdur75648.github.io/UTRNet/) of our paper ["UTRNet: High-Resolution Urdu Text Recognition In Printed Documents"](https://arxiv.org/abs/2306.15782).

---

Similar Datasets

UTRSet-Synth

UTRSet-Real
