Search Results for author: Manzil Zaheer

Found 88 papers, 34 papers with code

Incremental Extractive Opinion Summarization Using Cover Trees

1 code implementation 16 Jan 2024 Somnath Basu Roy Chowdhury, Nicholas Monath, Avinava Dubey, Manzil Zaheer, Andrew McCallum, Amr Ahmed, Snigdha Chaturvedi

In this work, we study the task of extractive opinion summarization in an incremental setting, where the underlying review set evolves over time.

Extractive Summarization Opinion Summarization

Functional Interpolation for Relative Positions Improves Long Context Transformers

no code implementations 6 Oct 2023 Shanda Li, Chong You, Guru Guruganesh, Joshua Ainslie, Santiago Ontanon, Manzil Zaheer, Sumit Sanghai, Yiming Yang, Sanjiv Kumar, Srinadh Bhojanapalli

Preventing the performance decay of Transformers on inputs longer than those used for training has been an important challenge in extending the context length of these models.

Language Modelling Position

ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis

no code implementations 26 Jul 2023 Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton

When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks.

Program Synthesis

Machine Reading Comprehension using Case-based Reasoning

no code implementations 24 May 2023 Dung Thai, Dhruv Agarwal, Mudit Chaudhary, Wenlong Zhao, Rajarshi Das, Manzil Zaheer, Jay-Yoon Lee, Hannaneh Hajishirzi, Andrew McCallum

Given a test question, CBR-MRC first retrieves a set of similar cases from a nonparametric memory and then predicts an answer by selecting the span in the test context that is most similar to the contextualized representations of answers in the retrieved cases.

Attribute Machine Reading Comprehension

Efficient k-NN Search with Cross-Encoders using Adaptive Multi-Round CUR Decomposition

1 code implementation 4 May 2023 Nishant Yadav, Nicholas Monath, Manzil Zaheer, Andrew McCallum

While ANNCUR's one-time selection of anchors tends to approximate the cross-encoder distances on average, doing so forfeits the capacity to accurately estimate distances to items near the query, leading to regret in the crucial end-task: recall of top-k items.

Retrieval

Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining

no code implementations 27 Mar 2023 Nicholas Monath, Manzil Zaheer, Kelsey Allen, Andrew McCallum

First, we introduce an algorithm that uses a tree structure to approximate the softmax with provable bounds and that dynamically maintains the tree.

Retrieval

Multi-Task Off-Policy Learning from Bandit Feedback

no code implementations 9 Dec 2022 Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh

We prove per-task bounds on the suboptimality of the learned policies, which show a clear improvement over not using the hierarchical model.

Learning-To-Rank Recommendation Systems

Differentially Private Adaptive Optimization with Delayed Preconditioners

1 code implementation 1 Dec 2022 Tian Li, Manzil Zaheer, Ken Ziyu Liu, Sashank J. Reddi, H. Brendan McMahan, Virginia Smith

Privacy noise may negate the benefits of using adaptive optimizers in differentially private model training.

Large Language Models with Controllable Working Memory

no code implementations 9 Nov 2022 Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar

By contrast, when the context is irrelevant to the task, the model should ignore it and fall back on its internal knowledge.

counterfactual World Knowledge

Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization

1 code implementation 23 Oct 2022 Nishant Yadav, Nicholas Monath, Rico Angell, Manzil Zaheer, Andrew McCallum

When the similarity is measured by dot-product between dual-encoder vectors or $\ell_2$-distance, there already exist many scalable and efficient search methods.

Retrieval

Generalization Properties of Retrieval-based Models

no code implementations 6 Oct 2022 Soumya Basu, Ankit Singh Rawat, Manzil Zaheer

The second class of retrieval-based approaches we explore learns a global model using kernel methods to directly map an input instance and retrieved examples to a prediction, without explicitly solving a local learning task.

Protein Folding Retrieval

A Fourier Approach to Mixture Learning

no code implementations 5 Oct 2022 Mingda Qiao, Guru Guruganesh, Ankit Singh Rawat, Avinava Dubey, Manzil Zaheer

Regev and Vijayaraghavan (2017) showed that with $\Delta = \Omega(\sqrt{\log k})$ separation, the means can be learned using $\mathrm{poly}(k, d)$ samples, whereas super-polynomially many samples are required if $\Delta = o(\sqrt{\log k})$ and $d = \Omega(\log k)$.
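In the uniform-weight spherical-Gaussian setting typically assumed for this kind of result (a sketch of the setup, not a statement taken from the paper), samples are drawn as

$x \sim \frac{1}{k} \sum_{j=1}^{k} \mathcal{N}(\mu_j, I_d), \qquad \min_{i \neq j} \|\mu_i - \mu_j\| \geq \Delta,$

and the goal is to recover the means $\mu_j$ up to small error.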

Teacher Guided Training: An Efficient Framework for Knowledge Transfer

no code implementations 14 Aug 2022 Manzil Zaheer, Ankit Singh Rawat, Seungyeon Kim, Chong You, Himanshu Jain, Andreas Veit, Rob Fergus, Sanjiv Kumar

In this paper, we propose the teacher-guided training (TGT) framework for training a high-quality compact model that leverages the knowledge acquired by pretrained generative models, while obviating the need to go through a large volume of data.

Generalization Bounds Image Classification +4

Questions Are All You Need to Train a Dense Passage Retriever

1 code implementation 21 Jun 2022 Devendra Singh Sachan, Mike Lewis, Dani Yogatama, Luke Zettlemoyer, Joelle Pineau, Manzil Zaheer

We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.

Denoising Language Modelling +1

Compositional Generalization and Decomposition in Neural Program Synthesis

no code implementations 7 Apr 2022 Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton

We first characterize several different axes along which program synthesis methods would be desired to generalize, e.g., length generalization, or the ability to combine known subroutines in new ways that do not occur in the training data.

Program Synthesis

Knowledge Base Question Answering by Case-based Reasoning over Subgraphs

1 code implementation 22 Feb 2022 Rajarshi Das, Ameya Godbole, Ankita Naik, Elliot Tower, Robin Jia, Manzil Zaheer, Hannaneh Hajishirzi, Andrew McCallum

Question answering (QA) over knowledge bases (KBs) is challenging because of the diverse, essentially unbounded, types of reasoning patterns needed.

Knowledge Base Question Answering

Private Adaptive Optimization with Side Information

1 code implementation 12 Feb 2022 Tian Li, Manzil Zaheer, Sashank J. Reddi, Virginia Smith

Adaptive optimization methods have become the default solvers for many machine learning tasks.

Deep Hierarchy in Bandits

no code implementations 3 Feb 2022 Joey Hong, Branislav Kveton, Sumeet Katariya, Manzil Zaheer, Mohammad Ghavamzadeh

We use this exact posterior to analyze the Bayes regret of HierTS in Gaussian bandits.

Thompson Sampling

Robust Training of Neural Networks Using Scale Invariant Architectures

no code implementations 2 Feb 2022 Zhiyuan Li, Srinadh Bhojanapalli, Manzil Zaheer, Sashank J. Reddi, Sanjiv Kumar

In contrast to SGD, adaptive gradient methods like Adam allow robust training of modern deep networks, especially large language models.

A Context-Integrated Transformer-Based Neural Network for Auction Design

1 code implementation 29 Jan 2022 Zhijian Duan, Jingwu Tang, Yutong Yin, Zhe Feng, Xiang Yan, Manzil Zaheer, Xiaotie Deng

One of the central problems in auction design is developing an incentive-compatible mechanism that maximizes the auctioneer's expected revenue.

Hierarchical Bayesian Bandits

no code implementations 12 Nov 2021 Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh

We provide a unified view of all these problems, as learning to act in a hierarchical Bayesian bandit.

Federated Learning Thompson Sampling

When in Doubt, Summon the Titans: Efficient Inference with Large Models

no code implementations 19 Oct 2021 Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Amr Ahmed, Sanjiv Kumar

In a nutshell, we use the large teacher models to guide the lightweight student models to only make correct predictions on a subset of "easy" examples; for the "hard" examples, we fall back to the teacher.

Image Classification
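A minimal sketch of the easy/hard cascading idea described in the entry above, assuming that "hard" examples are flagged by low student confidence; the threshold, model interfaces, and routing rule here are illustrative rather than the paper's exact procedure.

    import numpy as np

    def cascaded_predict(student_probs, teacher_predict, inputs, threshold=0.9):
        """Keep the student's answer on confident ("easy") examples and fall back
        to the expensive teacher model on the rest ("hard" examples)."""
        student_preds = student_probs.argmax(axis=1)
        confident = student_probs.max(axis=1) >= threshold   # "easy" examples
        preds = student_preds.copy()
        hard_idx = np.where(~confident)[0]                   # "hard" examples
        if hard_idx.size > 0:
            # inputs is assumed to be indexable by an integer array
            preds[hard_idx] = teacher_predict(inputs[hard_idx])
        return preds

The paper's actual criterion for deciding which examples are "easy" may differ; confidence thresholding is just one common choice.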

When in Doubt, Summon the Titans: A Framework for Efficient Inference with Large Models

no code implementations 29 Sep 2021 Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Amr Ahmed, Sanjiv Kumar

In a nutshell, we use the large teacher models to guide the lightweight student models to only make correct predictions on a subset of "easy" examples; for the "hard" examples, we fall back to the teacher.

Image Classification

No Regrets for Learning the Prior in Bandits

no code implementations NeurIPS 2021 Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvári

We propose AdaTS, a Thompson sampling algorithm that adapts sequentially to bandit tasks that it interacts with.

Thompson Sampling

Thompson Sampling with a Mixture Prior

no code implementations 10 Jun 2021 Joey Hong, Branislav Kveton, Manzil Zaheer, Mohammad Ghavamzadeh, Craig Boutilier

We study Thompson sampling (TS) in online decision making, where the uncertain environment is sampled from a mixture distribution.

Decision Making Multi-Task Learning +3
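For reference, a minimal sketch of standard Thompson sampling for Bernoulli bandits with independent Beta priors; the mixture-prior variant studied in the entry above replaces this single conjugate prior with a mixture, which is not shown here.

    import numpy as np

    def thompson_sampling_bernoulli(true_means, horizon, seed=0):
        """Standard Thompson sampling with Beta(1, 1) priors on each arm's mean."""
        rng = np.random.default_rng(seed)
        k = len(true_means)
        alpha, beta = np.ones(k), np.ones(k)      # Beta posterior parameters
        total_reward = 0.0
        for _ in range(horizon):
            theta = rng.beta(alpha, beta)         # sample a plausible mean for each arm
            arm = int(np.argmax(theta))           # act greedily on the sampled means
            reward = float(rng.random() < true_means[arm])
            alpha[arm] += reward                  # conjugate posterior update
            beta[arm] += 1.0 - reward
            total_reward += reward
        return total_reward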

Differentiable Meta-Learning of Bandit Policies

no code implementations NeurIPS 2020 Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

Exploration policies in Bayesian bandits maximize the average reward over problem instances drawn from some distribution P. In this work, we learn such policies for an unknown distribution P using samples from P. Our approach is a form of meta-learning and exploits properties of P without making strong assumptions about its form.

Meta-Learning

Non-Stationary Latent Bandits

no code implementations 1 Dec 2020 Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Mohammad Ghavamzadeh, Craig Boutilier

The key idea is to frame this problem as a latent bandit, where the prototypical models of user behavior are learned offline and the latent state of the user is inferred online from its interactions with the models.

Recommendation Systems Thompson Sampling

Latent Programmer: Discrete Latent Codes for Program Synthesis

no code implementations 1 Dec 2020 Joey Hong, David Dohan, Rishabh Singh, Charles Sutton, Manzil Zaheer

The latent codes are learned using a self-supervised learning principle, in which first a discrete autoencoder is trained on the output sequences, and then the resulting latent codes are used as intermediate targets for the end-to-end sequence prediction task.

Document Summarization Program Synthesis +1

Modifying Memories in Transformer Models

no code implementations 1 Dec 2020 Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar

In this paper, we propose a new task of explicitly modifying specific factual knowledge in Transformer models while ensuring the model performance does not degrade on the unmodified facts.

Memorization

PLLay: Efficient Topological Layer based on Persistent Landscapes

1 code implementation NeurIPS 2020 Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Kim, Frederic Chazal, Larry Wasserman

We propose PLLay, a novel topological layer for general deep learning models based on persistence landscapes, in which we can efficiently exploit the underlying topological features of the input data structure.

Federated Composite Optimization

1 code implementation 17 Nov 2020 Honglin Yuan, Manzil Zaheer, Sashank Reddi

We first show that straightforward extensions of primal algorithms such as FedAvg are not well-suited for FCO since they suffer from the "curse of primal averaging," resulting in poor convergence.

Federated Learning
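Here, "composite" refers to the federated objective with an additional, possibly non-smooth regularizer $\psi$ (for example, an $\ell_1$ penalty for sparsity):

$\min_{w} \; F(w) = \frac{1}{N} \sum_{k=1}^{N} f_k(w) + \psi(w),$

where $f_k$ is the local loss on client $k$. Roughly, the "curse of primal averaging" arises because averaging the clients' local iterates destroys the structure (such as sparsity) that $\psi$ is meant to induce.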

Differentiable Open-Ended Commonsense Reasoning

no code implementations NAACL 2021 Bill Yuchen Lin, Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Xiang Ren, William W. Cohen

As a step towards making commonsense reasoning research more realistic, we propose to study open-ended commonsense reasoning (OpenCSR), the task of answering a commonsense question without any pre-defined choices, using as a resource only a corpus of commonsense facts written in natural language.

Multiple-choice

Unsupervised Abstractive Dialogue Summarization for Tete-a-Tetes

no code implementations 15 Sep 2020 Xinyuan Zhang, Ruiyi Zhang, Manzil Zaheer, Amr Ahmed

High-quality dialogue-summary paired data is expensive to produce and domain-sensitive, making abstractive dialogue summarization a challenging task.

Abstractive Dialogue Summarization dialogue summary +2

A Simple Approach to Case-Based Reasoning in Knowledge Bases

1 code implementation AKBC 2020 Rajarshi Das, Ameya Godbole, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum

We present a surprisingly simple yet accurate approach to reasoning in knowledge graphs (KGs) that requires no training, and is reminiscent of case-based reasoning in classical artificial intelligence (AI).

Knowledge Graphs Meta-Learning +1

Non-Stationary Off-Policy Optimization

no code implementations 15 Jun 2020 Joey Hong, Branislav Kveton, Manzil Zaheer, Yin-Lam Chow, Amr Ahmed

This approach is practical and analyzable, and we provide guarantees on both the quality of off-policy optimization and the regret during online deployment.

Multi-Armed Bandits

Latent Bandits Revisited

no code implementations NeurIPS 2020 Joey Hong, Branislav Kveton, Manzil Zaheer, Yin-Lam Chow, Amr Ahmed, Craig Boutilier

A latent bandit problem is one in which the learning agent knows the arm reward distributions conditioned on an unknown discrete latent state.

Recommendation Systems Thompson Sampling

Meta-Learning Bandit Policies by Gradient Ascent

no code implementations 9 Jun 2020 Branislav Kveton, Martin Mladenov, Chih-Wei Hsu, Manzil Zaheer, Csaba Szepesvari, Craig Boutilier

Most bandit policies are designed to either minimize regret in any problem instance, making very few assumptions about the underlying environment, or in a Bayesian sense, assuming a prior distribution over environment parameters.

Meta-Learning Multi-Armed Bandits

Robust Large-Margin Learning in Hyperbolic Space

no code implementations NeurIPS 2020 Melanie Weber, Manzil Zaheer, Ankit Singh Rawat, Aditya Menon, Sanjiv Kumar

In this paper, we present, to our knowledge, the first theoretical guarantees for learning a classifier in hyperbolic rather than Euclidean space.

Representation Learning

Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies

no code implementations ICLR 2021 Paul Pu Liang, Manzil Zaheer, Yu-An Wang, Amr Ahmed

In this paper, we design a simple and efficient embedding algorithm that learns a small set of anchor embeddings and a sparse transformation matrix.

Language Modelling Movie Recommendation +2
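A minimal sketch of the factorization described above, where each vocabulary embedding is a sparse combination of a small set of anchor embeddings; the sizes and the sparsification rule are illustrative, not the paper's training procedure.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, num_anchors, dim = 10000, 64, 128
    A = rng.standard_normal((num_anchors, dim))    # anchor embeddings (small, dense)
    # sparse, non-negative mixing weights (thresholded noise as a stand-in)
    T = np.maximum(rng.standard_normal((vocab_size, num_anchors)) - 1.5, 0.0)

    def embed(token_ids):
        """Reconstruct token embeddings as sparse combinations of anchors: E = T @ A."""
        return T[token_ids] @ A

    print(embed(np.array([1, 42, 9999])).shape)    # (3, 128)

Storing A plus a sparse T can be much smaller than a dense vocab_size x dim embedding table, which is the point of the factorization.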

Adaptive Federated Optimization

5 code implementations ICLR 2021 Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan

Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data.

Federated Learning
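A minimal sketch of the server-side adaptive update in the spirit of this paper's FedAdam variant, assuming each selected client returns a model delta (local weights minus the broadcast global weights); client selection, local training, and learning-rate schedules are omitted, and hyperparameters are illustrative.

    import numpy as np

    def server_round(global_w, client_deltas, state, eta=0.1,
                     beta1=0.9, beta2=0.99, tau=1e-3):
        """Treat the average client delta as a pseudo-gradient and apply an
        Adam-style update to the global model on the server."""
        delta = np.mean(client_deltas, axis=0)
        state["m"] = beta1 * state["m"] + (1 - beta1) * delta
        state["v"] = beta2 * state["v"] + (1 - beta2) * delta ** 2
        new_w = global_w + eta * state["m"] / (np.sqrt(state["v"]) + tau)
        return new_w, state

    # state is initialized once as {"m": np.zeros_like(w), "v": np.zeros_like(w)}.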

Towards Modular Algorithm Induction

no code implementations 27 Feb 2020 Daniel A. Abolafia, Rishabh Singh, Manzil Zaheer, Charles Sutton

Main, the proposed architecture, consists of a neural controller that interacts with a variable-length input tape and learns to compose modules together with their corresponding argument choices.

Reinforcement Learning (RL)

Differentiable Reasoning over a Virtual Knowledge Base

1 code implementation ICLR 2020 Bhuwan Dhingra, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Ruslan Salakhutdinov, William W. Cohen

In particular, we describe a neural module, DrKIT, that traverses textual data like a KB, softly following paths of relations between mentions of entities in the corpus.

Re-Ranking

Differentiable Bandit Exploration

no code implementations NeurIPS 2020 Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

In this work, we learn such policies for an unknown distribution $\mathcal{P}$ using samples from $\mathcal{P}$.

Meta-Learning

PLLay: Efficient Topological Layer based on Persistence Landscapes

2 code implementations NeurIPS 2020 Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Sik Kim, Frederic Chazal, Larry Wasserman

We propose PLLay, a novel topological layer for general deep learning models based on persistence landscapes, in which we can efficiently exploit the underlying topological features of the input data structure.

Chains-of-Reasoning at TextGraphs 2019 Shared Task: Reasoning over Chains of Facts for Explainable Multi-hop Inference

no code implementations WS 2019 Rajarshi Das, Ameya Godbole, Manzil Zaheer, Shehzaad Dhuliawala, Andrew McCallum

This paper describes our submission to the shared task on "Multi-hop Inference Explanation Regeneration" in the TextGraphs workshop at EMNLP 2019 (Jansen and Ustalov, 2019).

Anchor & Transform: Learning Sparse Representations of Discrete Objects

no code implementations 25 Sep 2019 Paul Pu Liang, Manzil Zaheer, YuAn Wang, Amr Ahmed

Learning continuous representations of discrete objects such as text, users, and items lies at the heart of many applications including text and user modeling.

Language Modelling text-classification +1

Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering

no code implementations WS 2019 Ameya Godbole, Dilip Kavarthapu, Rajarshi Das, Zhiyu Gong, Abhishek Singhal, Hamed Zamani, Mo Yu, Tian Gao, Xiaoxiao Guo, Manzil Zaheer, Andrew McCallum

Multi-hop question answering (QA) requires an information retrieval (IR) system that can find the multiple pieces of supporting evidence needed to answer the question, making the retrieval process very challenging.

Information Retrieval Multi-hop Question Answering +2

Developing Creative AI to Generate Sculptural Objects

no code implementations 20 Aug 2019 Songwei Ge, Austin Dill, Eunsu Kang, Chun-Liang Li, Lingyao Zhang, Manzil Zaheer, Barnabas Poczos

We explore the intersection of human and machine creativity by generating sculptural objects through machine learning.

Clustering Generating 3D Point Clouds

The Myths of Our Time: Fake News

1 code implementation 5 Aug 2019 Vít Růžička, Eunsu Kang, David Gordon, Ankita Patel, Jacqui Fashimpaur, Manzil Zaheer

While the purpose of most fake news is misinformation and political propaganda, our team sees it as a new type of myth that is created by people in the age of internet identities and artificial intelligence.

BIG-bench Machine Learning Misinformation +1

Randomized Exploration in Generalized Linear Bandits

no code implementations 21 Jun 2019 Branislav Kveton, Manzil Zaheer, Csaba Szepesvari, Lihong Li, Mohammad Ghavamzadeh, Craig Boutilier

The first, GLM-TSL, samples a generalized linear model (GLM) from the Laplace approximation to the posterior distribution.

Exchangeable Generative Models with Flow Scans

1 code implementation 5 Feb 2019 Christopher Bender, Kevin O'Connor, Yang Li, Juan Jose Garcia, Manzil Zaheer, Junier Oliva

In this work, we develop a new approach to generative density estimation for exchangeable, non-i.i.d. data.

Density Estimation

Federated Optimization in Heterogeneous Networks

19 code implementations 14 Dec 2018 Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith

Theoretically, we provide convergence guarantees for our framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work (systems heterogeneity).

Distributed Optimization Federated Learning
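A minimal sketch of the proximal local objective used by this framework (FedProx): each device minimizes its own loss plus a term that keeps its local model close to the current global model. The task loss and the value of mu are placeholders.

    import numpy as np

    def fedprox_local_objective(w, global_w, task_loss, mu=0.01):
        """Local objective on a device: F_k(w) + (mu / 2) * ||w - w_global||^2."""
        return task_loss(w) + 0.5 * mu * np.sum((w - global_w) ** 2)

Setting mu = 0 recovers the plain FedAvg local objective; a larger mu limits how far a device with skewed data or extra local work can drift from the global model.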

Adaptive Methods for Nonconvex Optimization

1 code implementation NeurIPS 2018 Manzil Zaheer, Sashank Reddi, Devendra Sachan, Satyen Kale, Sanjiv Kumar

In this work, we provide a new analysis of such methods applied to nonconvex stochastic optimization problems, characterizing the effect of increasing minibatch size.

Stochastic Optimization
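This paper also proposes the Yogi optimizer; a minimal sketch of its update is below, with illustrative hyperparameters. Compared with Adam, the second moment v moves additively toward grad**2 via a sign term rather than by pure exponential averaging.

    import numpy as np

    def yogi_step(w, grad, m, v, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-3):
        """One Yogi update on parameters w given gradient grad."""
        m = beta1 * m + (1 - beta1) * grad
        v = v - (1 - beta2) * np.sign(v - grad ** 2) * grad ** 2   # controlled change of v
        w = w - lr * m / (np.sqrt(v) + eps)
        return w, m, v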

Point Cloud GAN

1 code implementation 13 Oct 2018 Chun-Liang Li, Manzil Zaheer, Yang Zhang, Barnabas Poczos, Ruslan Salakhutdinov

In this paper, we first show that a straightforward extension of existing GAN algorithms is not applicable to point clouds, because the constraint required for discriminators is undefined for set data.

Object Recognition

Towards Gradient Free and Projection Free Stochastic Optimization

no code implementations 8 Oct 2018 Anit Kumar Sahu, Manzil Zaheer, Soummya Kar

This paper focuses on the problem of constrained stochastic optimization.

Stochastic Optimization

Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text

2 code implementations EMNLP 2018 Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Kathryn Mazaitis, Ruslan Salakhutdinov, William W. Cohen

In this paper we look at a more practical setting, namely QA over the combination of a KB and entity-linked text, which is appropriate when an incomplete KB is available with a large text corpus.

Graph Representation Learning Open-Domain Question Answering

Nonparametric Density Estimation under Adversarial Losses

no code implementations NeurIPS 2018 Shashank Singh, Ananya Uppal, Boyue Li, Chun-Liang Li, Manzil Zaheer, Barnabás Póczos

We study minimax convergence rates of nonparametric density estimation under a large class of loss functions called "adversarial losses", which, besides classical $\mathcal{L}^p$ losses, includes maximum mean discrepancy (MMD), Wasserstein distance, and total variation distance.

Density Estimation
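The "adversarial losses" referred to above are integral probability metrics: for a class $\mathcal{F}$ of critic functions,

$d_{\mathcal{F}}(p, q) = \sup_{f \in \mathcal{F}} \left| \mathbb{E}_{X \sim p}[f(X)] - \mathbb{E}_{Y \sim q}[f(Y)] \right|,$

so the unit ball of an RKHS gives MMD, the 1-Lipschitz functions give the Wasserstein-1 distance, and functions bounded by 1 give total variation distance up to a constant factor.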

Transformation Autoregressive Networks

no code implementations ICML 2018 Junier B. Oliva, Avinava Dubey, Manzil Zaheer, Barnabás Póczos, Ruslan Salakhutdinov, Eric P. Xing, Jeff Schneider

Further, through a comprehensive study over both real-world and synthetic data, we show that jointly leveraging transformations of variables and autoregressive conditional models results in a considerable improvement in performance.

Density Estimation Outlier Detection

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

7 code implementations ICLR 2018 Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum

Knowledge bases (KBs), both automatically and manually constructed, are often incomplete; many valid facts can be inferred from the KB by synthesizing existing information.

Navigate Relation +1

A Generic Approach for Escaping Saddle points

no code implementations 5 Sep 2017 Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, Ruslan Salakhutdinov, Alexander J. Smola

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points.

Second-order methods

Latent LSTM Allocation: Joint clustering and non-linear dynamic modeling of sequence data

no code implementations ICML 2017 Manzil Zaheer, Amr Ahmed, Alexander J. Smola

Recurrent neural networks, such as long short-term memory (LSTM) networks, are powerful tools for modeling sequential data like user browsing history (Tan et al., 2016; Korpusik et al., 2016) or natural language text (Mikolov et al., 2010).

Clustering

Canopy: Fast Sampling with Cover Trees

no code implementations ICML 2017 Manzil Zaheer, Satwik Kottur, Amr Ahmed, José Moura, Alex Smola

In this work, we propose Canopy, a sampler based on Cover Trees that is exact, has guaranteed runtime logarithmic in the number of atoms, and is provably polynomial in the inherent dimensionality of the underlying parameter space.

Spectral Methods for Nonparametric Models

no code implementations 31 Mar 2017 Hsiao-Yu Fish Tung, Chao-yuan Wu, Manzil Zaheer, Alexander J. Smola

Nonparametric models are a versatile, albeit computationally expensive, tool for modeling mixture models.

Deep Sets

5 code implementations NeurIPS 2017 Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhutdinov, Alexander Smola

Our main theorem characterizes the permutation invariant functions and provides a family of functions to which any permutation invariant objective function must belong.

Anomaly Detection Outlier Detection +1
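The permutation-invariant family characterized by the main theorem has the sum-decomposition form $f(X) = \rho\big(\sum_{x \in X} \phi(x)\big)$; below is a minimal sketch with placeholder one-layer maps for $\phi$ and $\rho$.

    import numpy as np

    rng = np.random.default_rng(0)
    W_phi = rng.standard_normal((3, 16))    # phi: per-element encoder (placeholder weights)
    W_rho = rng.standard_normal((16, 1))    # rho: decoder applied to the pooled representation

    def deep_set(X):
        """Permutation-invariant set function f(X) = rho(sum_x phi(x))."""
        phi_x = np.tanh(X @ W_phi)          # encode each element independently
        pooled = phi_x.sum(axis=0)          # sum pooling removes any dependence on order
        return np.tanh(pooled @ W_rho)

    X = rng.standard_normal((5, 3))         # a set of 5 elements in R^3
    assert np.allclose(deep_set(X), deep_set(X[::-1]))   # same output for any ordering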
