Part-Of-Speech Tagging
214 papers with code • 15 benchmarks • 26 datasets
Part-of-speech tagging (POS tagging) is the task of labeling each word in a text with its part of speech. A part of speech is a category of words with similar grammatical properties. Common English parts of speech include noun, verb, adjective, adverb, pronoun, preposition, and conjunction.
Example:
Vinken | , | 61 | years | old |
---|---|---|---|---|
NNP | , | CD | NNS | JJ |
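To make the input and output of the task concrete, here is a minimal sketch of a lookup-based tagger applied to the example sentence above. It is only an illustration: the `TOY_LEXICON` dictionary, the `tag_tokens` function, and the default `NN` fallback are hypothetical names and choices made for this example, not part of any library or benchmark listed on this page. Real taggers learn tag assignments from annotated corpora and use context to resolve ambiguous words.

```python
# Toy lexicon mapping tokens to Penn Treebank-style tags.
# Hypothetical and hand-written for illustration only.
TOY_LEXICON = {
    "Vinken": "NNP",  # proper noun, singular
    ",": ",",         # punctuation is tagged as itself
    "years": "NNS",   # noun, plural
    "old": "JJ",      # adjective
}


def tag_tokens(tokens, lexicon=TOY_LEXICON, default_tag="NN"):
    """Return a list of (token, tag) pairs using dictionary lookup.

    Numeric tokens not found in the lexicon are tagged CD (cardinal number);
    all other unknown tokens fall back to `default_tag` (an assumption).
    """
    tagged = []
    for token in tokens:
        if token in lexicon:
            tag = lexicon[token]
        elif token.replace(".", "", 1).isdigit():
            tag = "CD"
        else:
            tag = default_tag
        tagged.append((token, tag))
    return tagged


tokens = ["Vinken", ",", "61", "years", "old"]
print(tag_tokens(tokens))
```

A lookup table like this ignores context, so it cannot handle words whose tag depends on usage (for example, "book" as a noun or a verb). That limitation is the main reason statistical and neural sequence models are used for this task.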
Latest papers with no code
Punctuation Restoration Improves Structure Understanding without Supervision
Unsupervised learning objectives like language modeling and de-noising constitute a significant part in producing pre-trained models that perform various downstream applications from natural language understanding to conversational tasks.
A Comprehensive View of the Biases of Toxicity and Sentiment Analysis Methods Towards Utterances with African American English Expressions
One explanation for this bias is that AI models are trained on limited datasets, and using such a term in training data is more likely to appear in a toxic utterance.
Zero Resource Cross-Lingual Part Of Speech Tagging
Our conclusion is that projected alignment data in zero-resource language can be beneficial to predict POS tags.
Part-of-Speech Tagger for Bodo Language using Deep Learning approach
We cover several language models in the experiment to see how well they work in POS tagging tasks.
Make BERT-based Chinese Spelling Check Model Enhanced by Layerwise Attention and Gaussian Mixture Model
Meanwhile, to incorporate implicit hierarchical linguistic knowledge within the encoder, we propose a novel form of n-gram-based layerwise self-attention to generate a multilayer representation.
Identifying Planetary Names in Astronomy Papers: A Multi-Step Approach
The automatic identification of planetary feature names in astronomy publications presents numerous challenges.
Augmenty: A Python Library for Structured Text Augmentation
Augmenty is a Python library for structured text augmentation.
Bit Cipher -- A Simple yet Powerful Word Representation System that Integrates Efficiently with Language Models
While Large Language Models (LLMs) become ever more dominant, classic pre-trained word embeddings sustain their relevance through computational efficiency and nuanced linguistic interpretation.
Colloquial Persian POS (CPPOS) Corpus: A Novel Corpus for Colloquial Persian Part of Speech Tagging
A comparison with another well-known Persian POS corpus named "Bijankhan" and the Persian Hazm POS tool trained on Bijankhan revealed that our model trained on CPPOS outperforms them.
Unsupervised Domain Adaptation using Lexical Transformations and Label Injection for Twitter Data
A large body of literature tries to solve this problem by adapting models trained on the source domain to the target domain.