Sentence segmentation

19 papers with code • 1 benchmark • 3 datasets

Benchmarks

These leaderboards are used to track progress in sentence segmentation.

Dataset	Best Model
UD2.5 test	Trankit

Most implemented papers

Ascle: A Python Natural Language Processing Toolkit for Medical Text Generation

yale-lily/medgen • 28 Nov 2023

This study introduces Ascle, a pioneering natural language processing (NLP) toolkit designed for medical text generation.

Lexical Semantic Recognition

nert-nlp/streusle • ACL (MWE) 2021

In lexical semantics, full-sentence segmentation and segment labeling of various phenomena are generally treated separately, despite their interdependence.

Abstractive Summarization of Spoken and Written Instructions with BERT

nlpyang/PreSumm • KDD Converse 2020

Summarization of speech is a difficult problem due to the spontaneity of the flow, disfluencies, and other issues that are not usually encountered in written texts.

Towards JointUD: Part-of-speech Tagging and Lemmatization using Recurrent Neural Networks

YerevaNN/JointUD • CoNLL 2018

This paper describes our submission to the CoNLL 2018 UD Shared Task.

Universal Dependency Parsing from Scratch

stanfordnlp/stanfordnlp • CoNLL 2018

This paper describes Stanford's system at the CoNLL 2018 UD Shared Task.

Fine-Grained Argument Unit Recognition and Classification

trtm/AURC • 22 Apr 2019

In this work, we argue that the task should be performed on a more fine-grained level of sequence labeling.

Using Punkt for Sentence Segmentation in non-Latin Scripts: Experiments on Kurdish (Sorani) Texts

KurdishBLARK/KTC-Segmented • 9 Apr 2020

The Kurdish language is a multi-dialect, under-resourced language which is written in different scripts.

Abstractive Summarization of Spoken and Written Instructions with BERT

alebryvas/berk266 • 21 Aug 2020

Summarization of speech is a difficult problem due to the spontaneity of the flow, disfluencies, and other issues that are not usually encountered in written texts.

Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation

csebuetnlp/banglanmt • EMNLP 2020

With the segmenter and the two methods combined, we compile a high-quality Bengali-English parallel corpus comprising 2.75 million sentence pairs, more than 2 million of which were not available before.

Evaluating Sentence Segmentation and Word Tokenization Systems on Estonian Web Texts

ksirts/EWTB_sentence_seg • 16 Nov 2020

Texts obtained from the web are noisy and do not necessarily follow the orthographic sentence and word boundary rules.
