Part-Of-Speech Tagging

214 papers with code • 15 benchmarks • 26 datasets

Part-of-speech tagging (POS tagging) is the task of labeling each word in a text with its part of speech. A part of speech is a category of words with similar grammatical properties. Common English parts of speech include noun, verb, adjective, adverb, pronoun, preposition, and conjunction.

Example:

Vinken	,	61	years	old
NNP	,	CD	NNS	JJ
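
The example can also be read as a list of (token, tag) pairs. The sketch below is illustrative only: it pairs the five tokens with their tags and prints a gloss for each tag. The tags appear to follow the Penn Treebank tagset; the names `TAG_GLOSSES` and `align` are hypothetical helpers, not part of any library.

```python
# Tokens and tags copied from the example above.
tokens = ["Vinken", ",", "61", "years", "old"]
tags = ["NNP", ",", "CD", "NNS", "JJ"]

# Glosses for the five tags used here (a small subset of the Penn Treebank tagset).
TAG_GLOSSES = {
    "NNP": "proper noun, singular",
    ",": "comma",
    "CD": "cardinal number",
    "NNS": "noun, plural",
    "JJ": "adjective",
}


def align(tokens, tags):
    """Pair each token with its tag; a tagger emits exactly one tag per token."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return list(zip(tokens, tags))


tagged = align(tokens, tags)
for token, tag in tagged:
    print(f"{token}\t{tag}\t{TAG_GLOSSES[tag]}")
```

The length check reflects the defining property of the task: POS tagging is a sequence labeling problem, so the output has one label per input token.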

Benchmarks

These leaderboards track progress in part-of-speech tagging.

Dataset	Best Model
Penn Treebank	SALE-BART encoder
UD	BiLSTM-LAN
Ritter	ACE
Social media	PretRand
ARK	ACE
Tweebank	ACE
UD2.5 test	Trankit
French GSD	CamemBERT
Sequoia Treebank	CamemBERT
Spoken Corpus	CamemBERT
ParTUT	CamemBERT
DaNE	da_dacy_large_tft-0.0.0
XGLUE	mGPT
ANTILLES	Bi-LSTM-CRF + Flair Embeddings + CamemBERT (oscar-138gb-base) Embeddings
Morphosyntactic-analysis-dataset	MyBert

Libraries

Use these libraries to find part-of-speech tagging models and implementations.

jiesutd/NCRFpp

2 papers

1,878

jiesutd/PyTorchSeqLabel

2 papers

1,878

Subtasks

Unsupervised Part-Of-Speech Tagging

Latest papers

DrBenchmark: A Large Language Understanding Evaluation Benchmark for French Biomedical Domain

drbenchmark/drbenchmark • 20 Feb 2024

This limitation hampers the evaluation of the latest French biomedical models, as they are either assessed on a minimal number of tasks with non-standardized protocols or evaluated using general downstream tasks.

ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks

boleima/topro • 29 Jan 2024

However, most previous studies primarily focused on sentence-level classification tasks, and only a few considered token-level labeling tasks such as Named Entity Recognition (NER) and Part-of-Speech (POS) tagging.

Multi-Task Learning for Front-End Text Processing in TTS

facebookresearch/llama-hd-dataset • 12 Jan 2024

We propose a multi-task learning (MTL) model for jointly performing three tasks that are commonly solved in a text-to-speech (TTS) front-end: text normalization (TN), part-of-speech (POS) tagging, and homograph disambiguation (HD).

Def2Vec: Extensible Word Embeddings from Dictionary Definitions

IreneMorazzoni/def_2_vec_irene • ICNLSP 2023

Def2Vec introduces a novel paradigm for word embeddings, leveraging dictionary definitions to learn semantic representations.

16 Dec 2023

PuoBERTa: Training and evaluation of a curated language model for Setswana

dsfsi/puoberta • 13 Oct 2023

Natural language processing (NLP) has made significant progress for well-resourced languages such as English but lagged behind for low-resource languages like Setswana.

The Uncertainty-based Retrieval Framework for Ancient Chinese CWS and POS

Jihuai-wpy/bert-ancient-chinese • LT4HALA (LREC) 2022

Automatic analysis for modern Chinese has greatly improved the accuracy of text mining in related fields, but the study of ancient Chinese is still relatively rare.

12 Oct 2023

RedPenNet for Grammatical Error Correction: Outputs to Tokens, Attentions to Spans

webspellchecker/unlp-2023-shared-task • 19 Sep 2023

The text editing tasks, including sentence fusion, sentence splitting and rephrasing, text simplification, and Grammatical Error Correction (GEC), share a common trait of dealing with highly similar input and output sequences.

DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research

ynchuang/discoverpath • 4 Sep 2023

The exponential growth in scholarly publications necessitates advanced tools for efficient article retrieval, especially in interdisciplinary fields where diverse terminologies are used to describe similar research.

FonMTL: Towards Multitask Learning for the Fon Language

bonaventuredossou/multitask_fon • 28 Aug 2023

Multitask learning is a learning paradigm that aims to improve the generalization capacity of a model by sharing knowledge across different but related tasks: this could be prevalent in very data-scarce scenarios.

Advancing Hungarian Text Processing with HuSpaCy: Efficient and Accurate NLP Pipelines

oroszgy/spacy-hungarian-models • 24 Aug 2023

This paper presents a set of industrial-grade text processing models for Hungarian that achieve near state-of-the-art performance while balancing resource efficiency and accuracy.
