Search Results for author: Rami Al-Rfou

Found 35 papers, 14 papers with code

Scaling Motion Forecasting Models with Ensemble Distillation

no code implementations 5 Apr 2024 Scott Ettinger, Kratarth Goel, Avikalp Srivastava, Rami Al-Rfou

These experiments demonstrate distillation from ensembles as an effective method for improving accuracy of predictive models for robotic systems with limited compute budgets.

Motion Forecasting

Let Your Graph Do the Talking: Encoding Structured Data for LLMs

no code implementations 8 Feb 2024 Bryan Perozzi, Bahare Fatemi, Dustin Zelle, Anton Tsitsulin, Mehran Kazemi, Rami Al-Rfou, Jonathan Halcrow

How can we best encode structured data into sequential form for use in large language models (LLMs)?

MotionLM: Multi-Agent Motion Forecasting as Language Modeling

no code implementations ICCV 2023 Ari Seff, Brian Cera, Dian Chen, Mason Ng, Aurick Zhou, Nigamaa Nayakanti, Khaled S. Refaat, Rami Al-Rfou, Benjamin Sapp

Here, we represent continuous trajectories as sequences of discrete motion tokens and cast multi-agent motion prediction as a language modeling task over this domain.

Autonomous Vehicles Language Modelling +2
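The abstract above describes casting continuous trajectories to discrete motion tokens. As a rough illustration of that idea only (the uniform binning scheme, bin count, and delta range below are invented for the sketch, not taken from the paper), per-step displacements can be quantized into a small token vocabulary:

```python
import numpy as np

def tokenize_trajectory(xy, num_bins=128, max_delta=4.0):
    """Quantize per-step (dx, dy) displacements of a continuous
    trajectory into discrete motion tokens, using hypothetical
    uniform bins over [-max_delta, max_delta] on each axis."""
    deltas = np.diff(xy, axis=0)                 # (T-1, 2) per-step displacements
    clipped = np.clip(deltas, -max_delta, max_delta)
    # Map each axis value to an integer bin in [0, num_bins - 1].
    bins = ((clipped + max_delta) / (2 * max_delta) * (num_bins - 1)).round().astype(int)
    # Fuse the two per-axis bins into a single vocabulary index.
    return bins[:, 0] * num_bins + bins[:, 1]

traj = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
tokens = tokenize_trajectory(traj)  # one token per trajectory step
```

A language model can then be trained autoregressively over such token sequences, exactly as over word tokens.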

Wayformer: Motion Forecasting via Simple & Efficient Attention Networks

2 code implementations 12 Jul 2022 Nigamaa Nayakanti, Rami Al-Rfou, Aurick Zhou, Kratarth Goel, Khaled S. Refaat, Benjamin Sapp

In this paper, we present Wayformer, a family of attention based architectures for motion forecasting that are simple and homogeneous.

Motion Forecasting Philosophy

Narrowing the Coordinate-frame Gap in Behavior Prediction Models: Distillation for Efficient and Accurate Scene-centric Motion Forecasting

no code implementations 8 Jun 2022 DiJia Su, Bertrand Douillard, Rami Al-Rfou, Cheolho Park, Benjamin Sapp

These models are intrinsically invariant to translation and rotation between scene elements and are the best performers on public leaderboards, but they scale quadratically with the number of agents and scene elements.

Knowledge Distillation Motion Forecasting +2

VN-Transformer: Rotation-Equivariant Attention for Vector Neurons

no code implementations 8 Jun 2022 Serge Assaad, Carlton Downey, Rami Al-Rfou, Nigamaa Nayakanti, Ben Sapp

Rotation equivariance is a desirable property in many practical applications such as motion forecasting and 3D perception, where it can offer benefits like sample efficiency, better generalization, and robustness to input perturbations.

3D Shape Classification Motion Forecasting
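The rotation equivariance mentioned above has a simple numerical check for a vector-neuron-style linear layer: because the layer mixes channels with a shared weight matrix and never mixes the 3-D coordinates, rotating the input and then applying the layer gives the same result as applying the layer first. A minimal sketch (the shapes and rotation angle are arbitrary choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Vector-neuron-style linear layer: 8 input channels -> 4 output channels,
# each channel a 3-D vector; the weights act only on the channel axis.
W = rng.normal(size=(4, 8))
X = rng.normal(size=(8, 3))

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])  # rotation about the z-axis

# Equivariance: W (X R^T) == (W X) R^T, by associativity of matmul.
rotate_then_layer = W @ (X @ R.T)
layer_then_rotate = (W @ X) @ R.T
```

The equality holds exactly (up to floating point) for any rotation, which is what makes such layers attractive for 3-D perception inputs.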

SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer

no code implementations ACL 2022 Tu Vu, Brian Lester, Noah Constant, Rami Al-Rfou, Daniel Cer

Finally, we propose an efficient retrieval approach that interprets task prompts as task embeddings to identify similar tasks and predict the most transferable source tasks for a novel target task.

Language Modelling Retrieval +1
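The retrieval approach described above reduces to nearest-neighbor search over task embeddings. A minimal sketch, assuming the embeddings have already been computed from task prompts (the task names and vectors below are made up for illustration):

```python
import numpy as np

def rank_source_tasks(target_emb, source_embs):
    """Rank candidate source tasks by cosine similarity between their
    task embeddings; the top-ranked tasks are predicted to transfer best."""
    def unit(v):
        return v / np.linalg.norm(v)
    scores = {name: float(unit(target_emb) @ unit(emb))
              for name, emb in source_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)

target = np.array([1.0, 0.0])
sources = {"mnli": np.array([0.9, 0.1]),
           "squad": np.array([0.1, 0.9])}
ranking = rank_source_tasks(target, sources)  # most similar task first
```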

nmT5 - Is parallel data still relevant for pre-training massively multilingual language models?

no code implementations ACL 2021 Mihir Kale, Aditya Siddhant, Rami Al-Rfou, Linting Xue, Noah Constant, Melvin Johnson

Recently, mT5 - a massively multilingual version of T5 - leveraged a unified text-to-text format to attain state-of-the-art results on a wide variety of multilingual NLP tasks.

Language Modelling Machine Translation +2

nmT5 -- Is parallel data still relevant for pre-training massively multilingual language models?

no code implementations 3 Jun 2021 Mihir Kale, Aditya Siddhant, Noah Constant, Melvin Johnson, Rami Al-Rfou, Linting Xue

Recently, mT5 - a massively multilingual version of T5 - leveraged a unified text-to-text format to attain state-of-the-art results on a wide variety of multilingual NLP tasks.

Language Modelling Machine Translation +2

The Power of Scale for Parameter-Efficient Prompt Tuning

10 code implementations EMNLP 2021 Brian Lester, Rami Al-Rfou, Noah Constant

More remarkably, through ablations on model size using T5, we show that prompt tuning becomes more competitive with scale: as models exceed billions of parameters, our method "closes the gap" and matches the strong performance of model tuning (where all model weights are tuned).

Few-Shot Learning
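The core mechanism of prompt tuning is that only a small matrix of soft prompt embeddings receives gradients, while every model weight stays frozen. A minimal sketch of that setup, in which a frozen embedding table and linear head stand in for the frozen T5 model (all sizes here are toy values):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, prompt_len, vocab = 16, 5, 100

# Frozen stand-in for the pre-trained model.
token_emb = nn.Embedding(vocab, d_model)
head = nn.Linear(d_model, vocab)
for p in list(token_emb.parameters()) + list(head.parameters()):
    p.requires_grad_(False)

# The only trainable parameters: a soft prompt prepended to every input.
soft_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.01)

input_ids = torch.tensor([3, 7, 9])
x = torch.cat([soft_prompt, token_emb(input_ids)], dim=0)  # (prompt_len + T, d_model)
loss = head(x).sum()   # dummy loss just to drive a backward pass
loss.backward()
# Gradients reach soft_prompt only; the frozen weights are untouched.
```

At serving time this means one frozen model can be shared across tasks, with a per-task prompt of only `prompt_len * d_model` parameters.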

Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training

1 code implementation NAACL 2021 Oshin Agarwal, Heming Ge, Siamak Shakeri, Rami Al-Rfou

Prior work on Data-To-Text Generation, the task of converting knowledge graph (KG) triples into natural text, focused on domain-specific benchmark datasets.

Data-to-Text Generation Language Modelling +1

mT5: A massively multilingual pre-trained text-to-text transformer

7 code implementations NAACL 2021 Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel

The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks.

Common Sense Reasoning Natural Language Inference +3
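The unified text-to-text format means every task is expressed as a string-in, string-out problem, typically with a task prefix. A toy sketch of that casting step (the templates below are illustrative; the exact prefixes used by T5/mT5 for each task may differ):

```python
def to_text_to_text(task, example):
    """Render an example as a single input string with a task prefix,
    so one sequence-to-sequence model can handle every task."""
    if task == "translate":
        return f"translate English to {example['target_lang']}: {example['text']}"
    if task == "summarize":
        return "summarize: " + example["text"]
    raise ValueError(f"unknown task: {task}")

summ = to_text_to_text("summarize", {"text": "Wikipedia is a free encyclopedia."})
trans = to_text_to_text("translate", {"text": "Hello", "target_lang": "German"})
```

The target side is likewise plain text, so pre-training and all downstream tasks share one loss and one decoding procedure.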

Wiki-40B: Multilingual Language Model Dataset

no code implementations LREC 2020 Mandy Guo, Zihang Dai, Denny Vrandečić, Rami Al-Rfou

We released the cleaned-up text of 40+ Wikipedia language editions, the corresponding trained monolingual language models, and several multilingual language models with different fixed vocabulary sizes.

Causal Language Modeling Language Modelling

Bridging the Gap for Tokenizer-Free Language Models

no code implementations 27 Aug 2019 Dokook Choe, Rami Al-Rfou, Mandy Guo, Heeyoung Lee, Noah Constant

Purely character-based language models (LMs) have been lagging in quality on large scale datasets, and current state-of-the-art LMs rely on word tokenization.

DDGK: Learning Graph Representations for Deep Divergence Graph Kernels

1 code implementation 21 Apr 2019 Rami Al-Rfou, Dustin Zelle, Bryan Perozzi

Second, for each pair of graphs, we train a cross-graph attention network which uses the node representations of an anchor graph to reconstruct another graph.

Feature Engineering Graph Attention +2

A Tutorial on Network Embeddings

2 code implementations 8 Aug 2018 Haochen Chen, Bryan Perozzi, Rami Al-Rfou, Steven Skiena

We further demonstrate the applications of network embeddings, and conclude the survey with future work in this area.

Social and Information Networks

Watch Your Step: Learning Node Embeddings via Graph Attention

2 code implementations NeurIPS 2018 Sami Abu-El-Haija, Bryan Perozzi, Rami Al-Rfou, Alex Alemi

Graph embedding methods represent nodes in a continuous vector space, preserving information from the graph (e.g. by sampling random walks).

Graph Attention Graph Embedding +2

Learning Edge Representations via Low-Rank Asymmetric Projections

1 code implementation 16 May 2017 Sami Abu-El-Haija, Bryan Perozzi, Rami Al-Rfou

Individually, both of these contributions improve the learned representations, especially when there are memory constraints on the total size of the embeddings.

Link Prediction

Efficient Natural Language Response Suggestion for Smart Reply

no code implementations 1 May 2017 Matthew Henderson, Rami Al-Rfou, Brian Strope, Yun-Hsuan Sung, Laszlo Lukacs, Ruiqi Guo, Sanjiv Kumar, Balint Miklos, Ray Kurzweil

This paper presents a computationally efficient machine-learned method for natural language response suggestion.
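The efficiency in such response-suggestion systems typically comes from precomputing embeddings for a fixed set of candidate responses and scoring them against the incoming message with a dot product. A minimal sketch of that retrieval step (the responses and embedding vectors below are made up; real systems learn them with a neural encoder):

```python
import numpy as np

def suggest(message_emb, response_embs, responses, k=2):
    """Score precomputed response embeddings against a message
    embedding by dot product and return the top-k suggestions."""
    scores = response_embs @ message_emb       # one score per candidate
    top = np.argsort(-scores)[:k]              # indices of highest scores
    return [responses[i] for i in top]

responses = ["Sounds good!", "No, thanks.", "See you then."]
response_embs = np.array([[0.9, 0.1],
                          [0.0, 1.0],
                          [0.8, 0.3]])
msg = np.array([1.0, 0.0])
picks = suggest(msg, response_embs, responses)
```

Because only the message needs encoding at request time, the per-request cost is one encoder pass plus a matrix-vector product over the candidate set.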

Visualizing Linguistic Shift

no code implementations 20 Nov 2016 Salman Mahmood, Rami Al-Rfou, Klaus Mueller

Neural network based models are a powerful tool for creating word embeddings; the objective of these models is to group similar words together.

Document Classification Language Modelling +5

A Growing Long-term Episodic & Semantic Memory

no code implementations 20 Oct 2016 Marc Pickett, Rami Al-Rfou, Louis Shao, Chris Tar

The long-term memory of most connectionist systems lies entirely in the weights of the system.

Transfer Learning

Conversational Contextual Cues: The Case of Personalization and History for Response Ranking

no code implementations 1 Jun 2016 Rami Al-Rfou, Marc Pickett, Javier Snaider, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

Unlike previous efforts, which focused on modeling messages and responses, we extend the modeling to long context and participant's history.

Theano: A Python framework for fast computation of mathematical expressions

1 code implementation 9 May 2016 The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements.

BIG-bench Machine Learning Clustering +2

Statistically Significant Detection of Linguistic Change

no code implementations 12 Nov 2014 Vivek Kulkarni, Rami Al-Rfou, Bryan Perozzi, Steven Skiena

We propose a new computational approach for tracking and detecting statistically significant linguistic shifts in the meaning and usage of words.

Change Point Detection Time Series +1

POLYGLOT-NER: Massive Multilingual Named Entity Recognition

no code implementations 14 Oct 2014 Rami Al-Rfou, Vivek Kulkarni, Bryan Perozzi, Steven Skiena

We describe a system that builds Named Entity Recognition (NER) annotators for 40 major languages using Wikipedia and Freebase.

Information Retrieval Machine Translation +7

DeepWalk: Online Learning of Social Representations

14 code implementations 26 Mar 2014 Bryan Perozzi, Rami Al-Rfou, Steven Skiena

We present DeepWalk, a novel approach for learning latent representations of vertices in a network.

Anomaly Detection Language Modelling +1
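DeepWalk's key move is to treat truncated random walks over the graph as "sentences" and feed them to a skip-gram model. A minimal sketch of the walk-sampling stage only (the skip-gram training step is omitted; the adjacency list and walk lengths are toy values):

```python
import random

def random_walks(adj, walk_len=5, walks_per_node=2, seed=0):
    """Sample truncated random walks from every node; the resulting
    node sequences form the corpus fed to a skip-gram model."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_len:
                nbrs = adj[walk[-1]]
                if not nbrs:          # dead end: truncate the walk
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: []}
walks = random_walks(adj)
```

Because nearby nodes co-occur in walks the way nearby words co-occur in sentences, off-the-shelf word-embedding machinery then yields node embeddings.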

Inducing Language Networks from Continuous Space Word Representations

no code implementations 6 Mar 2014 Bryan Perozzi, Rami Al-Rfou, Vivek Kulkarni, Steven Skiena

Recent advancements in unsupervised feature learning have developed powerful latent representations of words.

Polyglot: Distributed Word Representations for Multilingual NLP

no code implementations WS 2013 Rami Al-Rfou, Bryan Perozzi, Steven Skiena

We quantitatively demonstrate the utility of our word embeddings by using them as the sole features for training a part of speech tagger for a subset of these languages.

Language Modelling Multilingual NLP +1

The Expressive Power of Word Embeddings

no code implementations 15 Jan 2013 Yanqing Chen, Bryan Perozzi, Rami Al-Rfou, Steven Skiena

We seek to better understand the difference in quality of the several publicly released embeddings.

Benchmarking Sentence +1
