Part-Of-Speech Tagging
214 papers with code • 15 benchmarks • 26 datasets
Part-of-speech tagging (POS tagging) is the task of labeling each word in a text with its part of speech. A part of speech is a category of words with similar grammatical properties. Common English parts of speech include noun, verb, adjective, adverb, pronoun, preposition, and conjunction.
Example:
Vinken | , | 61 | years | old |
---|---|---|---|---|
NNP | , | CD | NNS | JJ |
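To make the task concrete, here is a minimal sketch of a most-frequent-tag baseline: it memorizes the tag each word received most often in training data and falls back to a default tag for unseen words. The tiny training set, the function names `train_baseline` and `tag`, and the `NN` fallback are illustrative assumptions, not part of any library or benchmark listed on this page; the tags follow the Penn Treebank style used in the example above.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data in Penn Treebank-style tags.
# A real tagger would be trained on a full annotated corpus.
TRAIN = [
    [("Vinken", "NNP"), (",", ","), ("61", "CD"), ("years", "NNS"), ("old", "JJ")],
    [("Two", "CD"), ("old", "JJ"), ("friends", "NNS")],
]


def train_baseline(tagged_sentences):
    """Map each lowercased word to the tag it was seen with most often."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag_label in sentence:
            counts[word.lower()][tag_label] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag(tokens, model, default="NN"):
    """Tag each token by lookup; unseen words get the assumed default tag."""
    return [(tok, model.get(tok.lower(), default)) for tok in tokens]


model = train_baseline(TRAIN)
print(tag(["Vinken", ",", "61", "years", "old"], model))
# [('Vinken', 'NNP'), (',', ','), ('61', 'CD'), ('years', 'NNS'), ('old', 'JJ')]
```

A lookup baseline like this ignores context, so it cannot resolve words whose tag depends on the surrounding sentence; the sequence models in the papers below address that by conditioning on neighboring tokens.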
Latest papers
DrBenchmark: A Large Language Understanding Evaluation Benchmark for French Biomedical Domain
This limitation hampers the evaluation of the latest French biomedical models, as they are either assessed on a minimal number of tasks with non-standardized protocols or evaluated using general downstream tasks.
ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks
However, most previous studies primarily focused on sentence-level classification tasks, and only a few considered token-level labeling tasks such as Named Entity Recognition (NER) and Part-of-Speech (POS) tagging.
Multi-Task Learning for Front-End Text Processing in TTS
We propose a multi-task learning (MTL) model for jointly performing three tasks that are commonly solved in a text-to-speech (TTS) front-end: text normalization (TN), part-of-speech (POS) tagging, and homograph disambiguation (HD).
Def2Vec: Extensible Word Embeddings from Dictionary Definitions
Def2Vec introduces a novel paradigm for word embeddings, leveraging dictionary definitions to learn semantic representations.
PuoBERTa: Training and evaluation of a curated language model for Setswana
Natural language processing (NLP) has made significant progress for well-resourced languages such as English but lagged behind for low-resource languages like Setswana.
The Uncertainty-based Retrieval Framework for Ancient Chinese CWS and POS
Automatic analysis for modern Chinese has greatly improved the accuracy of text mining in related fields, but the study of ancient Chinese is still relatively rare.
RedPenNet for Grammatical Error Correction: Outputs to Tokens, Attentions to Spans
The text editing tasks, including sentence fusion, sentence splitting and rephrasing, text simplification, and Grammatical Error Correction (GEC), share a common trait of dealing with highly similar input and output sequences.
DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research
The exponential growth in scholarly publications necessitates advanced tools for efficient article retrieval, especially in interdisciplinary fields where diverse terminologies are used to describe similar research.
FonMTL: Towards Multitask Learning for the Fon Language
Multitask learning is a learning paradigm that aims to improve the generalization capacity of a model by sharing knowledge across different but related tasks: this could be prevalent in very data-scarce scenarios.
Advancing Hungarian Text Processing with HuSpaCy: Efficient and Accurate NLP Pipelines
This paper presents a set of industrial-grade text processing models for Hungarian that achieve near state-of-the-art performance while balancing resource efficiency and accuracy.