Open Information Extraction

60 papers with code • 13 benchmarks • 13 datasets

In natural language processing, open information extraction is the task of generating a structured, machine-readable representation of the information in text, usually in the form of triples or n-ary propositions (Source: Wikipedia).
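To make the definition concrete, here is a minimal sketch of how a triple and an n-ary proposition might be represented. The `Extraction` class, its field names, and the example sentence are illustrative assumptions, not the output format of any system listed on this page.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Extraction:
    """Hypothetical container for one Open IE proposition."""
    subject: str
    relation: str
    arguments: Tuple[str, ...]  # one argument gives a triple; more give an n-ary proposition

    def is_triple(self) -> bool:
        # A triple has exactly one argument besides the subject and relation.
        return len(self.arguments) == 1

    def as_tuple(self) -> Tuple[str, ...]:
        # Flatten into (subject, relation, arg1, arg2, ...).
        return (self.subject, self.relation, *self.arguments)


# Illustrative sentence and hand-written extractions (not produced by a model).
sentence = "Marie Curie won the Nobel Prize in Physics in 1903."
triple = Extraction("Marie Curie", "won", ("the Nobel Prize in Physics",))
nary = Extraction("Marie Curie", "won", ("the Nobel Prize in Physics", "in 1903"))

print(triple.as_tuple())
print(nary.as_tuple())
```

Keeping the extra arguments in a tuple lets the same structure hold both forms, so the n-ary case does not need a separate type.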

Syntactic Multi-view Learning for Open Information Extraction

daviddongkc/smile_oie 5 Dec 2022

In this paper, we model both constituency and dependency trees into word-level graphs, and enable neural OpenIE to learn from the syntactic structures.

7
05 Dec 2022

mOKB6: A Multilingual Open Knowledge Base Completion Benchmark

dair-iitd/mokb6 13 Nov 2022

Automated completion of open knowledge bases (Open KBs), which are constructed from triples of the form (subject phrase, relation phrase, object phrase) obtained via open information extraction (Open IE) systems, is useful for discovering novel facts that may not be directly present in the text.

4
13 Nov 2022

DetIE: Multilingual Open Information Extraction Inspired by Object Detection

sberbank-ai/DetIE 24 Jun 2022

Our model sets the new state of the art performance of 67.7% F1 on CaRB evaluated as OIE2016 while being 3.35x faster at inference than previous state of the art.

19
24 Jun 2022

Multi-View Clustering for Open Knowledge Base Canonicalization

yang233666/cmvc 22 Jun 2022

In this paper, we propose CMVC, a novel unsupervised framework that leverages these two views of knowledge jointly for canonicalizing OKBs without the need of manually annotated labels.

5
22 Jun 2022

DeepStruct: Pretraining of Language Models for Structure Prediction

cgraywang/deepstruct Findings (ACL) 2022

We introduce a method for improving the structural understanding abilities of language models.

77
21 May 2022

CompactIE: Compact Facts in Open Information Extraction

farimafatahi/compactie NAACL 2022

Our experiments on CaRB and Wire57 datasets indicate that CompactIE finds 1.5x-2x more compact extractions than previous systems, with high precision, establishing a new state-of-the-art performance in OpenIE.

10
05 May 2022

DOM-LM: Learning Generalizable Representations for HTML Documents

Misterion777/DOM-LM 25 Jan 2022

We argue that the text and HTML structure together convey important semantics of the content and therefore warrant a special treatment for their representation learning.

29
25 Jan 2022

Sequence-to-Sequence Models for Extracting Information from Registration and Legal Documents

neuralmind-ai/information-extraction-t5 14 Jan 2022

A typical information extraction pipeline consists of token- or span-level classification models coupled with a series of pre- and post-processing scripts.

10
14 Jan 2022

Open-CyKG: An Open Cyber Threat Intelligence Knowledge Graph

IS5882/Open-CyKG Knowledge-Based Systems 2021

Instant analysis of cybersecurity reports is a fundamental challenge for security experts as an immeasurable amount of cyber information is generated on a daily basis, which necessitates automated information extraction tools to facilitate querying and retrieval of data.

63
05 Dec 2021

Refined Commonsense Knowledge from Large-Scale Web Contents

phongnt570/large-scale-csk-extraction 30 Nov 2021

However, prior commonsense knowledge collections are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and strings for P and O.

11
30 Nov 2021