Search Results for author: Todor Mihaylov

Found 23 papers, 10 papers with code

Understanding In-Context Learning via Supportive Pretraining Data

no code implementations · 26 Jun 2023 · Xiaochuang Han, Daniel Simig, Todor Mihaylov, Yulia Tsvetkov, Asli Celikyilmaz, Tianlu Wang

We observe that continued pretraining on this small subset significantly improves the model's ICL ability, by up to 18%.
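As a rough illustration of the continued-pretraining step (a minimal sketch; the tiny model and the `supportive_texts` list are placeholders, and the paper's actual contribution, selecting that subset, is not shown):

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

# Placeholder tiny LM and subset; the paper continues pretraining a real
# pretrained model on the supportive examples it identifies.
model = GPT2LMHeadModel(GPT2Config(n_layer=2, n_head=2, n_embd=64))
tok = GPT2Tokenizer.from_pretrained("gpt2")
supportive_texts = ["a selected pretraining document ...",
                    "another supportive document ..."]

opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
for text in supportive_texts:          # continued pretraining = extra LM steps
    ids = tok(text, return_tensors="pt").input_ids
    loss = model(input_ids=ids, labels=ids).loss   # standard causal-LM loss
    loss.backward(); opt.step(); opt.zero_grad()
print(f"final loss: {loss.item():.3f}")
```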

In-Context Learning

bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark

2 code implementations · 4 Jun 2023 · Momchil Hardalov, Pepa Atanasova, Todor Mihaylov, Galia Angelova, Kiril Simov, Petya Osenova, Ves Stoyanov, Ivan Koychev, Preslav Nakov, Dragomir Radev

We run the first systematic evaluation of pre-trained language models for Bulgarian, comparing and contrasting results across the nine tasks in the benchmark.

Fact Checking · Named Entity Recognition +5

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

1 code implementation · 22 Dec 2022 · Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov

To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalization: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.
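The three generalization settings can be pictured as nested held-out splits over a task-to-category mapping. A minimal sketch (the task names and mapping below are illustrative, not the OPT-IML Bench taxonomy):

```python
# Illustrative task -> category mapping (not the real OPT-IML taxonomy).
TASKS = {
    "squad": "extractive_qa", "triviaqa": "extractive_qa",
    "sst2": "sentiment",      "imdb": "sentiment",
    "xsum": "summarization",  "cnn_dailymail": "summarization",
}

def make_splits(tasks, heldout_category, heldout_task):
    """Return the three evaluation settings described above."""
    # 1) Fully held-out category: no task from it is seen in training.
    heldout_cat_tasks = [t for t, c in tasks.items() if c == heldout_category]
    # 2) Held-out task from a seen category: its sibling tasks are trained on.
    assert tasks[heldout_task] != heldout_category
    # 3) The remaining tasks are trained on; held-out *instances* are then
    #    split off within each of them at the example level (not shown here).
    train_tasks = [t for t, c in tasks.items()
                   if c != heldout_category and t != heldout_task]
    return heldout_cat_tasks, [heldout_task], train_tasks

cats, task, train = make_splits(TASKS, "summarization", "triviaqa")
print(cats, task, train)
```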

Language Modelling · Meta-Learning +2

Efficient Large Scale Language Modeling with Mixtures of Experts

no code implementations · 20 Dec 2021 · Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov

This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning.
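For readers unfamiliar with the architecture, a sparse MoE layer replaces the dense feed-forward block with several expert FFNs plus a learned router that sends each token to its best-scoring expert. A toy PyTorch sketch of top-1 routing (hyperparameters are illustrative; this is not the paper's implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    """Toy sparse mixture-of-experts feed-forward layer with top-1 routing."""
    def __init__(self, d_model=64, d_ff=256, n_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                           # x: (tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)   # routing probabilities
        top_gate, top_idx = gates.max(dim=-1)       # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():                          # run only chosen tokens
                out[mask] = top_gate[mask, None] * expert(x[mask])
        return out

moe = Top1MoE()
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```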

Language Modelling

EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering

2 code implementations · EMNLP 2020 · Momchil Hardalov, Todor Mihaylov, Dimitrina Zlatkova, Yoan Dinkov, Ivan Koychev, Preslav Nakov

We perform various experiments with existing top-performing multilingual pre-trained models and show that EXAMS offers multiple challenges that require multilingual knowledge and reasoning in multiple domains.

Question Answering · Transfer Learning

Discourse-Aware Semantic Self-Attention for Narrative Reading Comprehension

1 code implementation · IJCNLP 2019 · Todor Mihaylov, Anette Frank

In this work, we propose to use linguistic annotations as a basis for a Discourse-Aware Semantic Self-Attention encoder that we employ for reading comprehension on long narrative texts.
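The core idea can be pictured as self-attention heads whose attention is restricted by annotation-derived masks. A toy single-head sketch (illustrative only; the paper's encoder combines several annotation types and is more elaborate):

```python
import torch
import torch.nn.functional as F

def label_masked_attention(q, k, v, labels):
    """Single attention head where token i may only attend to tokens j
    that carry the same annotation label (e.g. a discourse relation)."""
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5   # (T, T)
    same_label = labels[:, None] == labels[None, :]        # (T, T) bool
    scores = scores.masked_fill(~same_label, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

T, d = 6, 8
q = k = v = torch.randn(T, d)
labels = torch.tensor([0, 0, 1, 1, 2, 2])   # toy annotation labels
print(label_masked_attention(q, k, v, labels).shape)  # torch.Size([6, 8])
```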

Reading Comprehension · Sentence

Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering

1 code implementation · EMNLP 2018 · Todor Mihaylov, Peter Clark, Tushar Khot, Ashish Sabharwal

Our oracle experiments, designed to circumvent the knowledge retrieval bottleneck, demonstrate the value of both the open book and the additional facts.

Question Answering · Retrieval

Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge

no code implementations · ACL 2018 · Todor Mihaylov, Anette Frank

We introduce a neural reading comprehension model that integrates external commonsense knowledge, encoded as a key-value memory, in a cloze-style setting.
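The key-value memory read itself is compact: the reader's context vector attends over encoded fact keys and retrieves a weighted sum of the corresponding values. A minimal sketch, with random tensors standing in for encoded knowledge:

```python
import torch
import torch.nn.functional as F

def kv_memory_read(query, keys, values):
    """Attend over a key-value memory of encoded knowledge facts.
    query: (d,) context vector; keys/values: (n_facts, d)."""
    scores = keys @ query                  # similarity to each fact key
    weights = F.softmax(scores, dim=0)     # soft retrieval distribution
    return weights @ values                # blended knowledge vector

d, n_facts = 16, 5
query = torch.randn(d)
keys, values = torch.randn(n_facts, d), torch.randn(n_facts, d)
print(kv_memory_read(query, keys, values).shape)  # torch.Size([16])
```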

Reading Comprehension

Large-Scale Goodness Polarity Lexicons for Community Question Answering

no code implementations · 20 Jul 2017 · Todor Mihaylov, Daniel Belchev, Yasen Kiprov, Ivan Koychev, Preslav Nakov

This leads us to the idea of building a good/bad polarity lexicon by analogy with the positive/negative sentiment polarity lexicons commonly used in sentiment analysis.
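One simple way to bootstrap such a lexicon (a hedged sketch; not necessarily the authors' exact method) is to score each word by its smoothed log-odds of occurring in good versus bad answers:

```python
import math
from collections import Counter

# Toy labeled answers; a real lexicon would be built from large CQA archives.
good_answers = ["install the driver then reboot", "use the official package"]
bad_answers = ["no idea sorry", "google it yourself"]

def goodness_lexicon(good, bad, smoothing=1.0):
    g, b = Counter(), Counter()
    for a in good: g.update(a.split())
    for a in bad:  b.update(a.split())
    vocab = set(g) | set(b)
    n_g, n_b = sum(g.values()), sum(b.values())
    # Smoothed log-odds: positive = "good" polarity, negative = "bad".
    return {w: math.log((g[w] + smoothing) / (n_g + smoothing * len(vocab)))
             - math.log((b[w] + smoothing) / (n_b + smoothing * len(vocab)))
            for w in vocab}

lex = goodness_lexicon(good_answers, bad_answers)
print(sorted(lex.items(), key=lambda kv: -kv[1])[:3])  # most "good" words
```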

Community Question Answering · Sentiment Analysis

Story Cloze Ending Selection Baselines and Data Examination

no code implementations · WS 2017 · Todor Mihaylov, Anette Frank

This paper describes two supervised baseline systems for the Story Cloze Test Shared Task (Mostafazadeh et al., 2016a).
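The paper's systems are supervised, feature-based classifiers; as a much simpler unsupervised flavor of the same task, one can pick the ending that is more similar to the story context (illustrative only, not the paper's method):

```python
def overlap(a, b):
    """Crude similarity: Jaccard overlap of word types."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def choose_ending(story, endings):
    # Select the candidate ending most lexically similar to the story.
    return max(endings, key=lambda e: overlap(story, e))

story = "Anna trained for months. On race day she felt ready."
endings = ["She finished the marathon with a smile.",
           "She decided to buy a new car instead."]
print(choose_ending(story, endings))
```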

Cloze Test · Semantic Similarity +2
