Document AI

17 papers with code • 1 benchmark • 1 dataset

Benchmarks

These leaderboards are used to track progress in Document AI

Dataset	Best Model
EPHOIE	LayoutLMv3

Libraries

Use these libraries to find Document AI models and implementations

alibabaresearch/advancedliteratemac…

4 papers

994

huggingface/transformers

3 papers

125,862

Datasets

EPHOIE

Subtasks

document understanding

Most implemented papers

Document Understanding Dataset and Evaluation (DUDE)

rubenpt91/MP-DocVQA-Framework • ICCV 2023

We call on the Document AI (DocAI) community to reevaluate current methodologies and embrace the challenge of creating more practically-oriented benchmarks.

Vision Grid Transformer for Document Layout Analysis

alibabaresearch/advancedliteratemachinery • ICCV 2023

Document pre-trained models and grid-based models have proven to be very effective on various tasks in Document AI.

Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis

samakos/document-ai- • 29 Aug 2023

In this study, we aim to fill these gaps by conducting a comparative evaluation of state-of-the-art models in document layout analysis and investigating the potential of cross-lingual layout analysis by utilizing machine translation techniques.

DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond

alibabaresearch/advancedliteratemachinery • 19 Oct 2023

In this report, we introduce DocXChain, a powerful open-source toolchain for document parsing, which is designed and developed to automatically convert the rich information embodied in unstructured documents, such as text, tables and charts, into structured representations that are readable and manipulable by machines.

DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading

hint-lab/doctrack • 23 Oct 2023

The use of visually-rich documents (VRDs) in various fields has created a demand for Document AI models that can read and comprehend documents like humans, which requires the overcoming of technical, linguistic, and cognitive barriers.

DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

zzzhang-jx/docres • 7 May 2024

This underscores the potential of DocRes across a broader spectrum of document image restoration tasks.
