Search Results for author: Felix Yu

Found 24 papers, 3 papers with code

SpecTr: Fast Speculative Decoding via Optimal Transport

no code implementations · NeurIPS 2023 · Ziteng Sun, Ananda Theertha Suresh, Jae Hun Ro, Ahmad Beirami, Himanshu Jain, Felix Yu

We show that the optimal draft selection algorithm (transport plan) can be computed via linear programming, whose best-known runtime is exponential in $k$.

Language Modelling · Large Language Model
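
The snippet above mentions computing the optimal transport plan via linear programming. Below is a minimal, generic discrete optimal-transport LP in Python (using scipy.optimize.linprog), purely illustrative: it couples a toy two-token draft distribution with a target distribution, and is not the paper's draft-selection formulation, whose variables and cost structure differ.

```python
import numpy as np
from scipy.optimize import linprog

# Generic discrete OT as an LP (toy setup, not SpecTr's formulation):
# find a coupling P >= 0 minimizing <C, P> with row sums p and column sums q.
def ot_plan_lp(p, q, C):
    n, m = C.shape
    A_rows = np.kron(np.eye(n), np.ones(m))   # sum_j P[i, j] = p[i]
    A_cols = np.kron(np.ones(n), np.eye(m))   # sum_i P[i, j] = q[j]
    A_eq = np.vstack([A_rows, A_cols])
    b_eq = np.concatenate([p, q])
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, method="highs")
    return res.x.reshape(n, m)

p = np.array([0.6, 0.4])                 # draft-model token distribution
q = np.array([0.5, 0.5])                 # target-model token distribution
C = np.array([[0.0, 1.0], [1.0, 0.0]])   # cost of mapping token i to token j
print(ot_plan_lp(p, q, C))
```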

Large Language Models with Controllable Working Memory

no code implementations · 9 Nov 2022 · Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar

By contrast, when the context is irrelevant to the task, the model should ignore it and fall back on its internal knowledge.

Counterfactual · World Knowledge

The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers

no code implementations · 12 Oct 2022 · Zonglin Li, Chong You, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank J. Reddi, Ke Ye, Felix Chern, Felix Yu, Ruiqi Guo, Sanjiv Kumar

This paper studies the curious phenomenon for machine learning models with Transformer architectures that their activation maps are sparse.
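
For intuition, here is a toy numpy sketch (shapes and weights made up) of how one might measure activation sparsity in a single transformer MLP block: the fraction of post-ReLU entries that are exactly zero. Note that random weights give roughly 50% zeros; the paper's observation is that trained Transformers exhibit far higher sparsity.

```python
import numpy as np

# Toy measurement of activation sparsity in one MLP block (assumed shapes).
rng = np.random.default_rng(0)
tokens, d_model, d_ff = 128, 64, 256
x = rng.standard_normal((tokens, d_model))
W1 = rng.standard_normal((d_model, d_ff)) / np.sqrt(d_model)
h = np.maximum(x @ W1, 0.0)          # post-ReLU hidden activations
sparsity = np.mean(h == 0.0)         # fraction of inactive ("lazy") neurons
print(f"activation sparsity: {sparsity:.2%}")  # ~50% for random weights
```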

FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning

1 code implementation · CVPR 2023 · Yuanhao Xiong, Ruochen Wang, Minhao Cheng, Felix Yu, Cho-Jui Hsieh

Federated learning (FL) has recently attracted increasing attention from academia and industry, with the ultimate goal of achieving collaborative training under privacy and communication constraints.

Federated Learning · Image Classification

Correlated quantization for distributed mean estimation and optimization

no code implementations · 9 Mar 2022 · Ananda Theertha Suresh, Ziteng Sun, Jae Hun Ro, Felix Yu

We show that applying the proposed protocol as sub-routine in distributed optimization algorithms leads to better convergence rates.

Distributed Optimization · Quantization
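
To illustrate the general idea (with assumed details, not the paper's exact protocol), the sketch below compares one-bit stochastic rounding for mean estimation with independent dithers versus correlated, stratified dithers drawn from shared randomness; the latter's rounding errors partially cancel across clients.

```python
import numpy as np

# Illustrative sketch: n clients each hold a scalar x_i in [0, 1] and send
# one bit q_i = 1{u_i < x_i}, which is unbiased for x_i when u_i ~ U(0, 1).
rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(0.2, 0.8, size=n)

u_iid = rng.uniform(size=n)                        # independent dithers
u_corr = (rng.permutation(n) + rng.uniform()) / n  # stratified, shared shift

est_iid = np.mean(u_iid < x)    # noisier estimate of the mean
est_corr = np.mean(u_corr < x)  # stratification cancels rounding errors
print(f"true {x.mean():.4f}  iid {est_iid:.4f}  correlated {est_corr:.4f}")
```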

InFillmore: Frame-Guided Language Generation with Bidirectional Context

no code implementations · Joint Conference on Lexical and Computational Semantics 2021 · Jiefu Ou, Nathaniel Weir, Anton Belyy, Felix Yu, Benjamin Van Durme

We propose a structured extension to bidirectional-context conditional language generation, or "infilling," inspired by Frame Semantic theory (Fillmore, 1976).

Text Infilling

Modifying Memories in Transformer Models

no code implementations · 1 Dec 2020 · Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar

In this paper, we propose a new task of explicitly modifying specific factual knowledge in Transformer models while ensuring the model performance does not degrade on the unmodified facts.

Memorization

Semantic Label Smoothing for Sequence to Sequence Problems

no code implementations · EMNLP 2020 · Michal Lukasik, Himanshu Jain, Aditya Krishna Menon, Seungyeon Kim, Srinadh Bhojanapalli, Felix Yu, Sanjiv Kumar

Label smoothing has been shown to be an effective regularization strategy for classification that prevents overfitting and helps with label de-noising.

Machine Translation · Translation
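
For reference, this is the vanilla uniform label-smoothing loss the paper builds on; the semantic variant replaces the uniform smoothing distribution with one weighted by similarity to the target sequence, which this sketch does not implement.

```python
import numpy as np

# Standard label smoothing: mix the one-hot target with a uniform
# distribution over the remaining classes, then take cross-entropy.
def smoothed_cross_entropy(logits, target, eps=0.1):
    k = logits.shape[-1]
    z = logits - logits.max()
    log_probs = z - np.log(np.sum(np.exp(z)))  # numerically stable log-softmax
    soft = np.full(k, eps / (k - 1))           # eps spread over non-targets
    soft[target] = 1.0 - eps                   # smoothed one-hot target
    return -np.dot(soft, log_probs)

logits = np.array([2.0, 0.5, -1.0])
print(smoothed_cross_entropy(logits, target=0))
```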

Learning discrete distributions: user vs item-level privacy

no code implementations · NeurIPS 2020 · Yuhan Liu, Ananda Theertha Suresh, Felix Yu, Sanjiv Kumar, Michael Riley

If each user has $m$ samples, we show that straightforward applications of Laplace or Gaussian mechanisms require the number of users to be $\mathcal{O}(k/(m\alpha^2) + k/(\epsilon\alpha))$ to achieve an $\ell_1$ distance of $\alpha$ between the true and estimated distributions, with the privacy-induced penalty $k/(\epsilon\alpha)$ independent of the number of samples per user $m$.

Federated Learning
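
The baseline the snippet refers to is the textbook Laplace mechanism for histogram estimation. Below is a minimal sketch assuming item-level add/remove privacy with $\ell_1$ sensitivity 1; under user-level privacy with $m$ samples per user, the sensitivity, and hence the noise scale, grows with $m$.

```python
import numpy as np

# Textbook Laplace mechanism for a k-bin histogram (item-level sketch):
# adding/removing one sample changes one count by 1, so scale = 1/epsilon.
def private_histogram(samples, k, epsilon, rng):
    counts = np.bincount(samples, minlength=k).astype(float)
    counts += rng.laplace(scale=1.0 / epsilon, size=k)  # calibrated noise
    counts = np.clip(counts, 0, None)                   # project to valid counts
    return counts / counts.sum()

rng = np.random.default_rng(0)
samples = rng.integers(0, 5, size=10_000)
print(private_histogram(samples, k=5, epsilon=1.0, rng=rng))
```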

Self-supervised Learning for Large-scale Item Recommendations

1 code implementation · 25 Jul 2020 · Tiansheng Yao, Xinyang Yi, Derek Zhiyuan Cheng, Felix Yu, Ting Chen, Aditya Menon, Lichan Hong, Ed H. Chi, Steve Tjoa, Jieqi Kang, Evan Ettinger

Our online results also verify the hypothesis that the framework improves model performance even more on slices that lack supervision.

Data Augmentation · Natural Language Understanding +3

Doubly-stochastic mining for heterogeneous retrieval

no code implementations · 23 Apr 2020 · Ankit Singh Rawat, Aditya Krishna Menon, Andreas Veit, Felix Yu, Sashank J. Reddi, Sanjiv Kumar

Modern retrieval problems are characterised by training sets with potentially billions of labels, and heterogeneous data distributions across subpopulations (e.g., users of a retrieval system may be from different countries), each of which poses a challenge.

Retrieval · Stochastic Optimization

Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation

no code implementations · 31 Mar 2020 · Felix Yu, Zhiwei Deng, Karthik Narasimhan, Olga Russakovsky

In the Vision-and-Language Navigation (VLN) task, an agent with egocentric vision navigates to a destination given natural language instructions.

Vision and Language Navigation

Sampled Softmax with Random Fourier Features

no code implementations · NeurIPS 2019 · Ankit Singh Rawat, Jiecao Chen, Felix Yu, Ananda Theertha Suresh, Sanjiv Kumar

For the settings where a large number of classes are involved, a common method to speed up training is to sample a subset of classes and utilize an estimate of the loss gradient based on these classes, known as the sampled softmax method.
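
Here is a minimal sampled-softmax sketch with a uniform proposal (the paper's contribution is a Random Fourier Features sampler, not shown here): score the true class plus a few sampled negatives and apply the standard log-proposal correction before the softmax loss.

```python
import numpy as np

# Sampled softmax with a uniform proposal over negatives (assumed setup).
# Corrected logits subtract log q(class), the inclusion probability under
# the sampler, so the subset loss approximates the full softmax loss.
def sampled_softmax_logits(logits, target, num_sampled, rng):
    k = logits.shape[0]
    neg = rng.choice(np.delete(np.arange(k), target), size=num_sampled,
                     replace=False)
    idx = np.concatenate([[target], neg])
    q = np.full(idx.shape, num_sampled / (k - 1))  # uniform inclusion prob
    q[0] = 1.0                                     # target is always kept
    return logits[idx] - np.log(q), idx

rng = np.random.default_rng(0)
logits = rng.standard_normal(50_000)               # scores over a large vocab
corrected, idx = sampled_softmax_logits(logits, target=7, num_sampled=64,
                                        rng=rng)
loss = -corrected[0] + np.log(np.sum(np.exp(corrected)))  # softmax CE on subset
print(loss)
```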

Hyperparameter Optimization in Black-box Image Processing using Differentiable Proxies

1 code implementation · SIGGRAPH 2019 · Ethan Tseng, Felix Yu, Yuting Yang, Fahim Mannan, Karl St. Arnaud, Derek Nowrouzezahrai, Jean-François Lalonde, Felix Heide

We present a fully automatic system to optimize the parameters of black-box hardware and software image processing pipelines according to any arbitrary (i.e., application-specific) metric.

Hyperparameter Optimization · Image Denoising +3
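
A toy 1-D version of the proxy idea, with a hypothetical black-box metric standing in for a full imaging pipeline: fit a differentiable surrogate (here just a polynomial, where the paper trains a CNN proxy) and run gradient descent on the surrogate to tune the parameter.

```python
import numpy as np

# Hypothetical non-differentiable black-box metric (stand-in for a pipeline).
def blackbox_metric(p):
    return (p - 0.3) ** 2 + 0.01 * np.sin(40 * p)

rng = np.random.default_rng(0)
ps = rng.uniform(0, 1, size=64)              # sample parameter settings
ys = blackbox_metric(ps)                     # evaluate the black box
coeffs = np.polyfit(ps, ys, deg=4)           # differentiable polynomial proxy
dcoeffs = np.polyder(coeffs)                 # proxy gradient

p = 0.9                                      # initial parameter guess
for _ in range(200):                         # gradient descent on the proxy
    p -= 0.05 * np.polyval(dcoeffs, p)
print(f"optimized parameter: {p:.3f}")       # should land near 0.3
```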

Stochastic Negative Mining for Learning with Large Output Spaces

no code implementations · 16 Oct 2018 · Sashank J. Reddi, Satyen Kale, Felix Yu, Dan Holtmann-Rice, Jiecao Chen, Sanjiv Kumar

Furthermore, we identify a particularly intuitive class of loss functions in the aforementioned family and show that they are amenable to practical implementation in the large output space setting (i.e., computation is possible without evaluating scores of all labels) by developing a technique called Stochastic Negative Mining.

Retrieval
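
A sketch of the mining step as described, with assumed details: sample a random subset of negative labels, keep only the highest-scoring ("hardest") ones, and compute a hinge-style loss against that small set, so no full pass over the output space is needed.

```python
import numpy as np

# Sample candidate negatives, then keep the top-k hardest by current score.
def mine_hard_negatives(scores, positive, sample_size, top_k, rng):
    k = scores.shape[0]
    candidates = rng.choice(np.delete(np.arange(k), positive),
                            size=sample_size, replace=False)
    hardest = candidates[np.argsort(scores[candidates])[-top_k:]]
    return hardest

rng = np.random.default_rng(0)
scores = rng.standard_normal(100_000)        # scores over a huge label set
negs = mine_hard_negatives(scores, positive=42, sample_size=1024, top_k=20,
                           rng=rng)
# Hinge-style loss on the positive vs. mined hard negatives only:
loss = np.sum(np.maximum(0.0, 1.0 + scores[negs] - scores[42]))
print(loss)
```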

Loss Decomposition for Fast Learning in Large Output Spaces

no code implementations · ICML 2018 · Ian En-Hsu Yen, Satyen Kale, Felix Yu, Daniel Holtmann-Rice, Sanjiv Kumar, Pradeep Ravikumar

For problems with large output spaces, evaluation of the loss function and its gradient are expensive, typically taking linear time in the size of the output space.

Word Embeddings

Lattice Rescoring Strategies for Long Short Term Memory Language Models in Speech Recognition

no code implementations · 15 Nov 2017 · Shankar Kumar, Michael Nirschl, Daniel Holtmann-Rice, Hank Liao, Ananda Theertha Suresh, Felix Yu

Recurrent neural network (RNN) language models (LMs) and Long Short Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional N-gram LMs on speech recognition tasks.

Speech Recognition
