Transliteration

45 papers with code • 0 benchmarks • 5 datasets

Transliteration converts a word from a source (foreign) language into a target language, and often adopts approaches from machine translation. In machine translation, the objective is to preserve the semantic meaning of the utterance as much as possible while following the syntactic structure of the target language. In transliteration, the objective is to preserve the original pronunciation of the source word as much as possible while following the phonological structures of the target language.

For example, the city name “Manchester” has become well known to speakers of languages other than English. Such new words are often named entities that are important in cross-lingual information retrieval, information extraction, and machine translation, and they often present out-of-vocabulary challenges to spoken language technologies such as automatic speech recognition, spoken keyword search, and text-to-speech.

Source: Phonology-Augmented Statistical Framework for Machine Transliteration using Limited Linguistic Resources
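
As a rough illustration of the idea, the sketch below transliterates “Manchester” into Cyrillic with a greedy longest-match rule table. The `RULES` table and the `transliterate` function are hypothetical toy constructs, not taken from the cited paper, and the table maps spelling rather than pronunciation; practical systems model phonology and are usually learned from data.

```python
# Hypothetical toy rule table: Latin graphemes -> Cyrillic letters.
# It covers only the letters needed for the example word.
RULES = {
    "ch": "ч",
    "sh": "ш",
    "a": "а", "e": "е", "m": "м", "n": "н",
    "r": "р", "s": "с", "t": "т",
}
MAX_KEY_LEN = max(len(key) for key in RULES)


def transliterate(word: str) -> str:
    """Greedy longest-match transliteration using RULES."""
    text = word.lower()
    out = []
    i = 0
    while i < len(text):
        # Try the longest grapheme first so "ch" wins over "c" + "h".
        for size in range(MAX_KEY_LEN, 0, -1):
            chunk = text[i:i + size]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            # No rule matched: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transliterate("Manchester"))  # манчестер
```

Trying multi-letter graphemes before single letters is what lets “ch” map to one target letter instead of two.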

Benchmarks

These leaderboards are used to track progress in Transliteration

No evaluation results yet.

Datasets

Most implemented papers

Creating a Translation Matrix of the Bible's Names Across 591 Languages

wswu/trabina • LREC 2018

Creating Large-Scale Multilingual Cognate Tables

wswu/coglust • LREC 2018

Neural Machine Translation Techniques for Named Entity Transliteration

snukky/news-translit-nmt • WS 2018

Transliterating named entities from one language into another can be approached as a neural machine translation (NMT) problem, for which we use deep attentional RNN encoder-decoder models.
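
One common way to frame a name pair for a sequence-to-sequence model is to treat each character as a token. The sketch below shows that preprocessing step only; the helper names and the `<sp>` space marker are assumptions made for illustration and do not describe this paper's actual pipeline.

```python
from typing import List, Tuple

SPACE_TOKEN = "<sp>"  # hypothetical marker so spaces survive tokenization


def to_char_tokens(name: str) -> List[str]:
    """Split a name into character tokens, marking spaces explicitly."""
    return [SPACE_TOKEN if ch == " " else ch for ch in name.strip()]


def make_example(source: str, target: str) -> Tuple[str, str]:
    """Format a (source, target) name pair as space-separated tokens."""
    return " ".join(to_char_tokens(source)), " ".join(to_char_tokens(target))


def detokenize(line: str) -> str:
    """Invert the tokenization applied by make_example."""
    return "".join(" " if tok == SPACE_TOKEN else tok for tok in line.split())


src_line, tgt_line = make_example("New York", "Нью-Йорк")
print(src_line)  # N e w <sp> Y o r k
print(tgt_line)
```

Lines in this format can be fed to a generic encoder-decoder toolkit as if they were ordinary sentences, with characters playing the role of words.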

Design Challenges in Named Entity Transliteration

steveash/NETransliteration-COLING2018 • COLING 2018

We analyze some of the fundamental design challenges that impact the development of a multilingual state-of-the-art named entity transliteration system, including curating bilingual named entity datasets and evaluation of multiple transliteration methods.

Bootstrapping Transliteration with Constrained Discovery for Low-Resource Languages

shyamupa/hma-translit • EMNLP 2018

Generating the English transliteration of a name written in a foreign script is an important and challenging step in multilingual knowledge acquisition and information extraction.

Efficient Sequence Labeling with Actor-Critic Training

SaeedNajafi/ac-tagger • 30 Sep 2018

We set out to establish RNNs as an attractive alternative to CRFs for sequence labeling.

A Rule-based Kurdish Text Transliteration System

sinaahmadi/wergor • 26 Nov 2018

In this article, we present a rule-based approach for transliterating between the two most widely used orthographies of Sorani Kurdish.

Event detection in Twitter: A keyword volume approach

vsatyav007/repo-eventdetection • 3 Jan 2019

In this paper, we propose an efficient method to select the keywords frequently used in Twitter that are mostly associated with events of interest such as protests.

ANETAC: Arabic Named Entity Transliteration and Classification Dataset

MohamedHadjAmeur/ANETAC • 6 Jul 2019

The ANETAC dataset is aimed mainly at researchers working on Arabic named entity transliteration, but it can also be used for named entity classification.

A Multi-cascaded Deep Model for Bilingual SMS Classification

haroonshakeel/bilingual_sms_classification • 29 Nov 2019

Our model achieves high accuracy for classification on this dataset and outperforms the previous model for multilingual text classification, highlighting the language independence of McM.
