Language Identification
123 papers with code • 6 benchmarks • 19 datasets
Language identification is the task of determining the language of a text.
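As a rough illustration of the task, a minimal sketch follows. It assumes a character-trigram profile per language and picks the profile with the highest cosine similarity to the input. The sample sentences, the `identify` function, and the three language codes are hypothetical and far too small for real use; this is not the method of any paper listed here.

```python
from collections import Counter
import math

# Hypothetical toy training text per language (illustration only).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and this is the language of the text",
    "de": "der hund und die katze sind in dem haus und das ist die sprache des textes",
    "fr": "le renard brun rapide saute par dessus le chien paresseux et ceci est la langue du texte",
}

def char_ngrams(text, n=3):
    """Count overlapping character n-grams, padding with spaces at both ends."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# One trigram profile per language, built from the toy samples.
PROFILES = {lang: char_ngrams(text) for lang, text in SAMPLES.items()}

def identify(text):
    """Return the language code whose profile is most similar to the input."""
    query = char_ngrams(text)
    return max(PROFILES, key=lambda lang: cosine(query, PROFILES[lang]))

if __name__ == "__main__":
    for sentence in ["the dog is lazy", "das ist der hund", "le chien est paresseux"]:
        print(sentence, "->", identify(sentence))
```

With realistic training corpora the same structure applies, though short or mixed-language inputs and closely related languages generally need stronger features than raw trigram counts.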
Most implemented papers
HeLI, a Word-Based Backoff Method for Language Identification
The shared task comprised a total of 8 tracks, of which we participated in 7.
LanideNN: Multilingual Language Identification on Character Window
In language identification, a common first step in natural language processing, we want to automatically determine the language of some input text.
Discriminating between Similar Languages using Weighted Subword Features
The present contribution revolves around a contrastive subword n-gram model which has been tested in the Discriminating between Similar Languages shared task.
Language Identification Using Deep Convolutional Recurrent Neural Networks
Language Identification (LID) systems are used to classify the spoken language from a given audio sample and are typically the first step for many spoken language processing tasks, such as Automatic Speech Recognition (ASR) systems.
A study of N-gram and Embedding Representations for Native Language Identification
We report on our experiments with N-gram and embedding-based feature representations for Native Language Identification (NLI) as part of the NLI Shared Task 2017 (team name: NLI-ISU).
Improved Text Language Identification for the South African Languages
Virtual assistants and text chatbots have recently been gaining popularity.
Automatic Language Identification in Texts: A Survey
Language identification (LI) is the problem of determining the natural language that a document or part thereof is written in.
What's in a Domain? Learning Domain-Robust Text Representations using Adversarial Training
Most real-world language problems require learning from heterogeneous corpora, raising the problem of learning robust models that generalise well to instances both similar (in-domain) and dissimilar (out-of-domain) to those seen in training.