Search Results for author: Todor Mihaylov

Found 23 papers, 10 papers with code

Understanding In-Context Learning via Supportive Pretraining Data

no code implementations · 26 Jun 2023 · Xiaochuang Han, Daniel Simig, Todor Mihaylov, Yulia Tsvetkov, Asli Celikyilmaz, Tianlu Wang

We observe that continued pretraining on this small subset significantly improves the model's ICL ability, by up to 18%.
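As a rough illustration of the continued-pretraining step (a minimal sketch; the tiny model and the `supportive_texts` list are placeholders, and the paper's actual contribution, selecting that subset, is not shown):

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

# Placeholder tiny LM and subset; the paper continues pretraining a real
# pretrained model on the supportive examples it identifies.
model = GPT2LMHeadModel(GPT2Config(n_layer=2, n_head=2, n_embd=64))
tok = GPT2Tokenizer.from_pretrained("gpt2")
supportive_texts = ["a selected pretraining document ...",
                    "another supportive document ..."]

opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
for text in supportive_texts:          # continued pretraining = extra LM steps
    ids = tok(text, return_tensors="pt").input_ids
    loss = model(input_ids=ids, labels=ids).loss   # standard causal-LM loss
    loss.backward(); opt.step(); opt.zero_grad()
print(f"final loss: {loss.item():.3f}")
```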

In-Context Learning

bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark

2 code implementations · 4 Jun 2023 · Momchil Hardalov, Pepa Atanasova, Todor Mihaylov, Galia Angelova, Kiril Simov, Petya Osenova, Ves Stoyanov, Ivan Koychev, Preslav Nakov, Dragomir Radev

We run the first systematic evaluation of pre-trained language models for Bulgarian, comparing and contrasting results across the nine tasks in the benchmark.

Fact Checking · Named Entity Recognition +5

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

1 code implementation · 22 Dec 2022 · Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov

To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalization: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.
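The three generalization settings can be pictured as nested held-out splits over a task-to-category mapping. A minimal sketch (the task names and mapping below are illustrative, not the OPT-IML Bench taxonomy):

```python
# Illustrative task -> category mapping (not the real OPT-IML taxonomy).
TASKS = {
    "squad": "extractive_qa", "triviaqa": "extractive_qa",
    "sst2": "sentiment",      "imdb": "sentiment",
    "xsum": "summarization",  "cnn_dailymail": "summarization",
}

def make_splits(tasks, heldout_category, heldout_task):
    """Return the three evaluation settings described above."""
    # 1) Fully held-out category: no task from it is seen in training.
    heldout_cat_tasks = [t for t, c in tasks.items() if c == heldout_category]
    # 2) Held-out task from a seen category: its sibling tasks are trained on.
    assert tasks[heldout_task] != heldout_category
    # 3) The remaining tasks are trained on; held-out *instances* are then
    #    split off within each of them at the example level (not shown here).
    train_tasks = [t for t, c in tasks.items()
                   if c != heldout_category and t != heldout_task]
    return heldout_cat_tasks, [heldout_task], train_tasks

cats, task, train = make_splits(TASKS, "summarization", "triviaqa")
print(cats, task, train)
```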

Language Modelling · Meta-Learning +2

Efficient Large Scale Language Modeling with Mixtures of Experts

no code implementations · 20 Dec 2021 · Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov

This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning.
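For readers unfamiliar with the architecture, a sparse MoE layer replaces the dense feed-forward block with several expert FFNs plus a learned router that sends each token to its best-scoring expert. A toy PyTorch sketch of top-1 routing (hyperparameters are illustrative; this is not the paper's implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    """Toy sparse mixture-of-experts feed-forward layer with top-1 routing."""
    def __init__(self, d_model=64, d_ff=256, n_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                           # x: (tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)   # routing probabilities
        top_gate, top_idx = gates.max(dim=-1)       # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():                          # run only chosen tokens
                out[mask] = top_gate[mask, None] * expert(x[mask])
        return out

moe = Top1MoE()
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```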

Language Modelling

EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering

2 code implementations · EMNLP 2020 · Momchil Hardalov, Todor Mihaylov, Dimitrina Zlatkova, Yoan Dinkov, Ivan Koychev, Preslav Nakov

We perform various experiments with existing top-performing multilingual pre-trained models and show that EXAMS offers multiple challenges that require multilingual knowledge and reasoning in multiple domains.

Question Answering · Transfer Learning

Discourse-Aware Semantic Self-Attention for Narrative Reading Comprehension

1 code implementation · IJCNLP 2019 · Todor Mihaylov, Anette Frank

In this work, we propose to use linguistic annotations as a basis for a Discourse-Aware Semantic Self-Attention encoder that we employ for reading comprehension on long narrative texts.
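The core idea can be pictured as self-attention heads whose attention is restricted by annotation-derived masks. A toy single-head sketch (illustrative only; the paper's encoder combines several annotation types and is more elaborate):

```python
import torch
import torch.nn.functional as F

def label_masked_attention(q, k, v, labels):
    """Single attention head where token i may only attend to tokens j
    that carry the same annotation label (e.g. a discourse relation)."""
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5   # (T, T)
    same_label = labels[:, None] == labels[None, :]        # (T, T) bool
    scores = scores.masked_fill(~same_label, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

T, d = 6, 8
q = k = v = torch.randn(T, d)
labels = torch.tensor([0, 0, 1, 1, 2, 2])   # toy annotation labels
print(label_masked_attention(q, k, v, labels).shape)  # torch.Size([6, 8])
```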

Reading Comprehension · Sentence

Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering

1 code implementation · EMNLP 2018 · Todor Mihaylov, Peter Clark, Tushar Khot, Ashish Sabharwal

Our oracle experiments, designed to circumvent the knowledge retrieval bottleneck, demonstrate the value of both the open book and the additional facts.

Question Answering · Retrieval

Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge

no code implementations · ACL 2018 · Todor Mihaylov, Anette Frank

We introduce a neural reading comprehension model that integrates external commonsense knowledge, encoded as a key-value memory, in a cloze-style setting.
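The key-value memory read itself is compact: the reader's context vector attends over encoded fact keys and retrieves a weighted sum of the corresponding values. A minimal sketch, with random tensors standing in for encoded knowledge:

```python
import torch
import torch.nn.functional as F

def kv_memory_read(query, keys, values):
    """Attend over a key-value memory of encoded knowledge facts.
    query: (d,) context vector; keys/values: (n_facts, d)."""
    scores = keys @ query                  # similarity to each fact key
    weights = F.softmax(scores, dim=0)     # soft retrieval distribution
    return weights @ values                # blended knowledge vector

d, n_facts = 16, 5
query = torch.randn(d)
keys, values = torch.randn(n_facts, d), torch.randn(n_facts, d)
print(kv_memory_read(query, keys, values).shape)  # torch.Size([16])
```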

Reading Comprehension

Large-Scale Goodness Polarity Lexicons for Community Question Answering

no code implementations · 20 Jul 2017 · Todor Mihaylov, Daniel Belchev, Yasen Kiprov, Ivan Koychev, Preslav Nakov

This leads us to the idea of building a good/bad polarity lexicon by analogy with the positive/negative sentiment polarity lexicons commonly used in sentiment analysis.
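One simple way to bootstrap such a lexicon (a hedged sketch; not necessarily the authors' exact method) is to score each word by its smoothed log-odds of occurring in good versus bad answers:

```python
import math
from collections import Counter

# Toy labeled answers; a real lexicon would be built from large CQA archives.
good_answers = ["install the driver then reboot", "use the official package"]
bad_answers = ["no idea sorry", "google it yourself"]

def goodness_lexicon(good, bad, smoothing=1.0):
    g, b = Counter(), Counter()
    for a in good: g.update(a.split())
    for a in bad:  b.update(a.split())
    vocab = set(g) | set(b)
    n_g, n_b = sum(g.values()), sum(b.values())
    # Smoothed log-odds: positive = "good" polarity, negative = "bad".
    return {w: math.log((g[w] + smoothing) / (n_g + smoothing * len(vocab)))
             - math.log((b[w] + smoothing) / (n_b + smoothing * len(vocab)))
            for w in vocab}

lex = goodness_lexicon(good_answers, bad_answers)
print(sorted(lex.items(), key=lambda kv: -kv[1])[:3])  # most "good" words
```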

Community Question Answering · Sentiment Analysis

Story Cloze Ending Selection Baselines and Data Examination

no code implementations · WS 2017 · Todor Mihaylov, Anette Frank

This paper describes two supervised baseline systems for the Story Cloze Test Shared Task (Mostafazadeh et al., 2016a).
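The paper's systems are supervised, feature-based classifiers; as a much simpler unsupervised flavor of the same task, one can pick the ending that is more similar to the story context (illustrative only, not the paper's method):

```python
def overlap(a, b):
    """Crude similarity: Jaccard overlap of word types."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def choose_ending(story, endings):
    # Select the candidate ending most lexically similar to the story.
    return max(endings, key=lambda e: overlap(story, e))

story = "Anna trained for months. On race day she felt ready."
endings = ["She finished the marathon with a smile.",
           "She decided to buy a new car instead."]
print(choose_ending(story, endings))
```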

Cloze Test · Semantic Similarity +2
