Key Information Extraction
28 papers with code • 6 benchmarks • 10 datasets
Key Information Extraction (KIE) aims to extract structured information (e.g. key-value pairs) from form-style documents (e.g. invoices), an important step towards intelligent document understanding.
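As a rough illustration of what "key-value pairs from a form-style document" means, the sketch below pairs known field labels with the text to their right on the same line of OCR output. It is a toy rule-based heuristic, not the method of any paper listed on this page; the `Word` type, the `extract_key_values` helper, the coordinates and the sample invoice values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR token with a simplified position (hypothetical format)."""
    text: str
    x: float  # left edge of the token
    y: float  # vertical centre of the text line


def extract_key_values(words, keys, line_tol=5.0):
    """Pair each known key label with the tokens to its right on the same line."""
    result = {}
    for key_word in words:
        # Normalise "Total:" -> "total" before matching against the key set.
        label = key_word.text.rstrip(":").lower()
        if label not in keys:
            continue
        # Tokens on the same line (within line_tol) and to the right of the label.
        same_line = [
            w for w in words
            if w is not key_word
            and abs(w.y - key_word.y) <= line_tol
            and w.x > key_word.x
        ]
        same_line.sort(key=lambda w: w.x)
        if same_line:
            result[label] = " ".join(w.text for w in same_line)
    return result


# Made-up OCR output for a two-line invoice fragment.
words = [
    Word("Date:", 10, 10), Word("2023-01-15", 60, 10),
    Word("Total:", 10, 30), Word("42.00", 60, 30), Word("EUR", 100, 30),
]
fields = extract_key_values(words, {"date", "total"})
print(fields)
```

Learned KIE models replace this hand-written rule with representations of text and layout, but the input (tokens with positions) and the output (a mapping from field names to values) have the same general shape.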
Most implemented papers
Key Information Extraction From Documents: Evaluation And Generator
Therefore, natural language processing models have already been combined with computer vision models to benefit from, e.g., positional information and to improve the performance of these key information extraction models.
BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents
On the other hand, this paper tackles the problem by going back to the basics: effective combination of text and layout.
MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding
We present MMOCR, an open-source toolbox that provides a comprehensive pipeline for text detection and recognition, as well as downstream tasks such as named entity recognition and key information extraction.
Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks
Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis.
PP-StructureV2: A Stronger Document Analysis System
For the Table Recognition model, we utilize PP-LCNet, CSP-PAN and SLAHead to optimize the backbone module, feature fusion module and decoding module, respectively, which improves the table structure accuracy by 6% with comparable inference speed.
DoSA: A System to Accelerate Annotations on Business Documents with Human-in-the-Loop
An initial document-specific model can be trained and its inference can be used as feedback for generating more automated annotations.
DocILE Benchmark for Document Information Localization and Extraction
This paper introduces the DocILE benchmark with the largest dataset of business documents for the tasks of Key Information Localization and Extraction and Line Item Recognition.
Form-NLU: Dataset for the Form Natural Language Understanding
Compared to general document analysis tasks, form document structure understanding and retrieval are challenging.
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
Additionally, novel relation heads, which are pre-trained by the geometric pre-training tasks and fine-tuned for RE, are elaborately designed to enrich and enhance the feature representation.
Information Redundancy and Biases in Public Document Information Extraction Benchmarks
Advances in the Visually-rich Document Understanding (VrDU) field and particularly the Key-Information Extraction (KIE) task are marked with the emergence of efficient Transformer-based approaches such as the LayoutLM models.