Text Segmentation

34 papers with code • 3 benchmarks • 7 datasets

Text segmentation deals with the correct division of a document into semantically coherent blocks.

Most implemented papers

Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Network

gregbugaj/unet-denoiser 12 Jun 2019

For training our network, we develop a cross-entropy based loss function that addresses the imbalance problems.
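
The snippet above does not spell out the loss. As a rough illustration only, here is a generic class-weighted binary cross-entropy, assuming the imbalance is between a rare positive class and a dominant negative class. The function name `weighted_binary_cross_entropy` and the `pos_weight` parameter are hypothetical and are not taken from the paper or its repository.

```python
import math
from typing import Sequence


def weighted_binary_cross_entropy(
    probs: Sequence[float],
    targets: Sequence[int],
    pos_weight: float = 1.0,
) -> float:
    """Mean binary cross-entropy with the positive class up-weighted.

    probs      -- predicted probabilities of the positive class
    targets    -- ground-truth labels (0 or 1)
    pos_weight -- assumed scaling factor for the rare positive class
    """
    if len(probs) != len(targets) or not probs:
        raise ValueError("probs and targets must be non-empty and equal in length")
    eps = 1e-12  # clamp probabilities to avoid log(0)
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


# Two rare positives among mostly negative pixels.
loss = weighted_binary_cross_entropy([0.9, 0.2, 0.1, 0.6], [1, 0, 0, 1], pos_weight=3.0)
print(f"weighted loss: {loss:.4f}")
```

With `pos_weight` greater than 1, errors on the rare class contribute more to the loss, which is one common way to counter imbalance.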

Crowdsourcing and Aggregating Nested Markable Annotations

juntaoy/dali-preprocessing-pipeline ACL 2019

One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which, depending on the task, may vary from nominal chunks for named entity resolution to (potentially nested) noun phrases in coreference resolution (or mentions) to larger text segments in text segmentation.

Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation

EducationalTestingService/CATS 3 Jan 2020

Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval.

Text Segmentation by Cross Segment Attention

aakash222/text-segmentation-NLP EMNLP 2020

Document and discourse segmentation are two fundamental NLP tasks pertaining to breaking up text into constituents, which are commonly used to help downstream tasks such as information retrieval or text summarization.

Improving Segmentation for Technical Support Problems

kushalchauhan98/ticket-segmentation ACL 2020

We formulate the problem as a sequence labelling task, and study the performance of state-of-the-art approaches.
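
A sequence-labelling formulation of segmentation can be sketched as follows, assuming a BIO tagging scheme over tokens. The snippet does not state the label scheme actually used, so the function name `segments_to_bio` and the `COMMAND` label are hypothetical choices for illustration.

```python
from typing import List, Tuple


def segments_to_bio(segments: List[Tuple[str, List[str]]]) -> List[Tuple[str, str]]:
    """Flatten labelled segments into (token, tag) pairs using BIO tags.

    Each segment is (label, tokens). The label "O" marks text outside
    any segment of interest; other labels get B- and I- prefixes.
    """
    tagged: List[Tuple[str, str]] = []
    for label, tokens in segments:
        for i, token in enumerate(tokens):
            if label == "O":
                tag = "O"
            else:
                prefix = "B" if i == 0 else "I"  # B- opens a segment, I- continues it
                tag = f"{prefix}-{label}"
            tagged.append((token, tag))
    return tagged


# Hypothetical support-ticket fragment with one command segment.
example = [
    ("O", ["Running"]),
    ("COMMAND", ["ls", "-la"]),
    ("O", ["fails"]),
]
for token, tag in segments_to_bio(example):
    print(f"{token}\t{tag}")
```

Once segments are encoded this way, any token-level tagger can be trained to predict the tags, and segment boundaries are recovered from the B- tags.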

Chapter Captor: Text Segmentation in Novels

cpethe/chapter-captor EMNLP 2020

Books are typically segmented into chapters and sections, representing coherent subnarratives and topics.

Interpretable Natural Language Segmentation Based on Link Grammar

aigents/aigents-java-nlp 14 Nov 2020

Natural language segmentation (NLS), or text segmentation, refers to the process of dividing written text into meaningful units.

Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach

SHI-Labs/Rethinking-Text-Segmentation CVPR 2021

We also introduce Text Refinement Network (TexRNet), a novel text segmentation approach that adapts to the unique properties of text, e.g. non-convex boundary, diverse texture, etc., which often impose burdens on traditional segmentation models.

Hierarchical Text Segmentation for Medieval Manuscripts

hazemamir/greedy_text_segmentation COLING 2020

In this paper, we address the segmentation of books of hours, Latin devotional manuscripts of the late Middle Ages, that exhibit challenging issues: a complex hierarchical entangled structure, variable content, noisy transcriptions with no sentence markers, and strong correlations between sections for which topical information is no longer sufficient to draw segmentation boundaries.

Structural Text Segmentation of Legal Documents

dennlinger/TopicalChange 7 Dec 2020

The growing complexity of legal cases has led to an increasing interest in legal information retrieval systems that can effectively satisfy user-specific information needs.