Search Results for author: Amjad Almahairi

Found 19 papers, 10 papers with code

Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model

no code implementations • 19 Dec 2023 • Shraman Pramanick, Guangxing Han, Rui Hou, Sayan Nag, Ser-Nam Lim, Nicolas Ballas, Qifan Wang, Rama Chellappa, Amjad Almahairi

In this work, we introduce VistaLLM, a powerful visual system that addresses coarse- and fine-grained VL tasks over single and multiple input images using a unified framework.

Attribute Language Modelling +1

Paper
Add Code

Llama 2: Open Foundation and Fine-Tuned Chat Models

14 code implementations • 18 Jul 2023 • Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

Ranked #2 on Question Answering on PubChemQA

Arithmetic Reasoning +5

52,476

Paper
Code

Learning Easily Updated General Purpose Text Representations with Adaptable Task-Specific Prefixes

no code implementations • 22 May 2023 • Kuan-Hao Huang, Liang Tan, Rui Hou, Sinong Wang, Amjad Almahairi, Ruty Rinott

Fine-tuning a large pre-trained language model for each downstream task causes computational burdens in the inference time due to several times of forward passes.

Language Modelling

Paper
Add Code

Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization

1 code implementation • 6 May 2023 • Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Jimmy Ba, Amjad Almahairi

In this work, we introduce Residual Prompt Tuning - a simple and efficient method that significantly improves the performance and stability of prompt tuning.

Paper
Code

Progressive Prompts: Continual Learning for Language Models

2 code implementations • 29 Jan 2023 • Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Amjad Almahairi

We introduce Progressive Prompts - a simple and efficient approach for continual learning in language models.

Continual Learning

Paper
Code

Uniform Masking Prevails in Vision-Language Pretraining

no code implementations • 10 Dec 2022 • Siddharth Verma, Yuchen Lu, Rui Hou, Hanchao Yu, Nicolas Ballas, Madian Khabsa, Amjad Almahairi

Masked Language Modeling (MLM) has proven to be an essential component of Vision-Language (VL) pretraining.

Image-text matching Language Modelling +2

Paper
Add Code

Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI

no code implementations • 25 May 2022 • Suzanna Sia, Anton Belyy, Amjad Almahairi, Madian Khabsa, Luke Zettlemoyer, Lambert Mathias

Evaluating an explanation's faithfulness is desired for many reasons such as trust, interpretability and diagnosing the sources of model's errors.

counterfactual

Paper
Add Code

UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning

1 code implementation • ACL 2022 • Yuning Mao, Lambert Mathias, Rui Hou, Amjad Almahairi, Hao Ma, Jiawei Han, Wen-tau Yih, Madian Khabsa

Recent parameter-efficient language model tuning (PELT) methods manage to match the performance of fine-tuning with much fewer trainable parameters and perform especially well when training data is limited.

Language Modelling Model Selection

Paper
Code

Unsupervised Learning of Dense Visual Representations

no code implementations • NeurIPS 2020 • Pedro O. Pinheiro, Amjad Almahairi, Ryan Y. Benmalek, Florian Golemo, Aaron Courville

VADeR provides a natural representation for dense prediction tasks and transfers well to downstream tasks.

Contrastive Learning Data Augmentation +2

Paper
Add Code

The Impact of Preprocessing on Arabic-English Statistical and Neural Machine Translation

no code implementations • WS 2019 • Mai Oudah, Amjad Almahairi, Nizar Habash

Neural networks have become the state-of-the-art approach for machine translation (MT) in many languages.

Machine Translation Translation

Paper
Add Code

Adversarial Computation of Optimal Transport Maps

1 code implementation • 24 Jun 2019 • Jacob Leygonie, Jennifer She, Amjad Almahairi, Sai Rajeswar, Aaron Courville

We show that during training, our generator follows the $W_2$-geodesic between the initial and the target distributions.

Paper
Code

A Closer Look at the Optimization Landscapes of Generative Adversarial Networks

1 code implementation • ICLR 2020 • Hugo Berard, Gauthier Gidel, Amjad Almahairi, Pascal Vincent, Simon Lacoste-Julien

Generative adversarial networks have been very successful in generative modeling, however they remain relatively challenging to train compared to standard deep neural networks.

Paper
Code

W2GAN: RECOVERING AN OPTIMAL TRANSPORT MAP WITH A GAN

no code implementations • 27 Sep 2018 • Leygonie Jacob*, Jennifer She*, Amjad Almahairi, Sai Rajeswar, Aaron Courville

In this work we address the converse question: is it possible to recover an optimal map in a GAN fashion?

Paper
Add Code

Learning Distributed Representations from Reviews for Collaborative Filtering

no code implementations • 18 Jun 2018 • Amjad Almahairi, Kyle Kastner, Kyunghyun Cho, Aaron Courville

However, interestingly, the greater modeling power offered by the recurrent neural network appears to undermine the model's ability to act as a regularizer of the product representations.

Collaborative Filtering Recommendation Systems

Paper
Add Code

Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data

3 code implementations • ICML 2018 • Amjad Almahairi, Sai Rajeswar, Alessandro Sordoni, Philip Bachman, Aaron Courville

Learning inter-domain mappings from unpaired data can improve performance in structured prediction tasks, such as image segmentation, by reducing the need for paired data.

Image Segmentation Semantic Segmentation +1

131

Paper
Code

Calibrating Energy-based Generative Adversarial Networks

1 code implementation • 6 Feb 2017 • Zihang Dai, Amjad Almahairi, Philip Bachman, Eduard Hovy, Aaron Courville

In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimal.

Ranked #17 on Conditional Image Generation on CIFAR-10 (Inception score metric)

Image Generation

Paper
Code

First Result on Arabic Neural Machine Translation

no code implementations • 8 Jun 2016 • Amjad Almahairi, Kyunghyun Cho, Nizar Habash, Aaron Courville

Neural machine translation has become a major alternative to widely used phrase-based statistical machine translation.

Machine Translation Translation

Paper
Add Code

Theano: A Python framework for fast computation of mathematical expressions

1 code implementation • 9 May 2016 • The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements.

BIG-bench Machine Learning Clustering +2

9,853

Paper
Code

Dynamic Capacity Networks

2 code implementations • 24 Nov 2015 • Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, Aaron Courville

The low-capacity sub-networks are applied across most of the input, but also provide a guide to select a few portions of the input on which to apply the high-capacity sub-networks.

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.