Language modeling is the task of predicting the next word or character in a document.
(Image credit: Exploring the Limits of Language Modeling)
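To make the task concrete, here is a minimal sketch of next-word prediction using a bigram count model over a toy corpus; the corpus is invented for illustration, and real language models are neural networks trained on far larger data:

```python
# Minimal sketch: language modeling as next-word prediction with a
# bigram count model (illustrative only; not a practical model).
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each preceding word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent continuation of the given word."""
    return bigrams[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat" (the most frequent continuation)
```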
On the WMT'16 En-Ro low-resource dataset, DeLighT delivers similar performance with 2.8 times fewer parameters than baseline transformers.
Ranked #1 on Machine Translation on WMT2016 English-Romanian
It has been found that software, like natural language texts, exhibits "naturalness", which can be captured by statistical language models.
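As a toy illustration of how a statistical language model can score "naturalness", the sketch below computes per-token cross-entropy under a smoothed unigram model; this is a deliberately simplified stand-in (work in this area typically uses n-gram or neural models), and the training tokens are invented:

```python
# Illustrative sketch: "naturalness" as average negative log-probability
# of a token sequence under a Laplace-smoothed unigram model.
import math
from collections import Counter

train_tokens = "for i in range ( n ) : total += i".split()
counts = Counter(train_tokens)
total = sum(counts.values())

def cross_entropy(tokens, alpha=1.0):
    """Bits per token; lower means the sequence looks more 'natural'."""
    vocab = len(counts)
    return -sum(
        math.log2((counts[t] + alpha) / (total + alpha * vocab)) for t in tokens
    ) / len(tokens)

print(cross_entropy("for i in range ( n )".split()))  # familiar code scores low
```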
Existing tools for Question Answering (QA) have challenges that limit their use in practice.
Overall, our analyses and experiments show that: (i) BERT has knowledge stored in its parameters about the content of books, movies and music; (ii) it has more content-based knowledge than collaborative-based knowledge; and (iii) it fails on conversational recommendation when faced with adversarial data.
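One common way to probe this kind of parametric knowledge is a cloze-style fill-mask query; the sketch below uses the Hugging Face fill-mask pipeline with an illustrative prompt, not the paper's exact probes:

```python
# Hedged sketch: probing BERT's stored knowledge with a cloze prompt.
from transformers import pipeline

probe = pipeline("fill-mask", model="bert-base-uncased")
# The prompt wording is an illustrative assumption, not the paper's probe.
for pred in probe("The Lord of the Rings is a [MASK] book."):
    print(f"{pred['token_str']:>12}  {pred['score']:.3f}")
```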
Experiments show that for low values of k and p in top-k and top-p sampling, perplexity drops significantly with generated text length, which is also correlated with excessive repetitions in the text (the boredom trap).
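For reference, here is a minimal sketch of the two sampling schemes mentioned above; `probs` stands in for a model's next-token distribution and is invented for this example:

```python
# Sketch of top-k and top-p (nucleus) sampling over a toy distribution.
import numpy as np

def top_k_sample(probs: np.ndarray, k: int, rng=np.random.default_rng()):
    """Sample only among the k most probable tokens."""
    top = np.argsort(probs)[-k:]               # indices of the k largest probs
    renorm = probs[top] / probs[top].sum()     # renormalize over the top-k set
    return rng.choice(top, p=renorm)

def top_p_sample(probs: np.ndarray, p: float, rng=np.random.default_rng()):
    """Sample from the smallest token set whose total mass reaches p."""
    order = np.argsort(probs)[::-1]            # tokens sorted by probability
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1       # size of the smallest nucleus
    nucleus = order[:cutoff]
    renorm = probs[nucleus] / probs[nucleus].sum()
    return rng.choice(nucleus, p=renorm)

probs = np.array([0.5, 0.2, 0.15, 0.1, 0.05])  # toy 5-token vocabulary
print(top_k_sample(probs, k=2), top_p_sample(probs, p=0.9))
```

Low k or p restricts sampling to a few high-probability tokens, which is one intuition for the repetition ("boredom trap") behavior described above.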
In this paper, we present our approach for sentiment classification on Spanish-English code-mixed social media data in the SemEval-2020 Task 9.
For a range of downstream tasks, we indeed find matching subnetworks at 40% to 90% sparsity.
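A rough sketch of how such a sparse subnetwork can be extracted by global magnitude pruning, one common lottery-ticket-style procedure; the paper's exact pruning schedule may differ, and the weights below are random placeholders:

```python
# Sketch: extract a sparse mask by global magnitude pruning.
import numpy as np

def magnitude_mask(weights: dict, sparsity: float) -> dict:
    """Zero out the globally smallest-magnitude weights at the given sparsity."""
    flat = np.concatenate([np.abs(w).ravel() for w in weights.values()])
    threshold = np.quantile(flat, sparsity)    # e.g. 0.7 keeps the top 30%
    return {name: (np.abs(w) >= threshold) for name, w in weights.items()}

rng = np.random.default_rng(0)
weights = {"layer1": rng.normal(size=(4, 4)), "layer2": rng.normal(size=(4, 2))}
masks = magnitude_mask(weights, sparsity=0.7)          # 70% of weights removed
pruned = {name: w * masks[name] for name, w in weights.items()}
```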
This paper describes our submissions to SemEval 2020 Task 11: Detection of Propaganda Techniques in News Articles for each of the two subtasks of Span Identification and Technique Classification.
In this work, we formulate cross-lingual language model pre-training as maximizing mutual information between multilingual multi-granularity texts.
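Mutual information between paired texts is commonly maximized through a contrastive InfoNCE-style lower bound; the sketch below illustrates that objective under the assumption that aligned sentence pairs are available, with random tensors standing in for encoder outputs:

```python
# Sketch of an InfoNCE-style contrastive objective, a standard lower
# bound on mutual information between paired representations.
import torch
import torch.nn.functional as F

def info_nce(src_emb: torch.Tensor, tgt_emb: torch.Tensor, temperature: float = 0.1):
    """src_emb[i] and tgt_emb[i] embed aligned texts (positives); all
    other pairs in the batch serve as in-batch negatives."""
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.T / temperature          # pairwise cosine similarities
    labels = torch.arange(src.size(0))          # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

# Random placeholders for a batch of 8 aligned embedding pairs.
loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```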
Reinforcement learning (RL) algorithms typically start tabula rasa, with no prior knowledge of the environment and no prior skills.