Search Results for author: Daiyaan Arfeen

Found 4 papers, 3 papers with code

SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification

3 code implementations • 16 May 2023 • Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia

Our evaluation shows that SpecInfer outperforms existing LLM serving systems by 1.5-2.8x for distributed LLM inference and by 2.6-3.5x for offloading-based LLM inference, while preserving the same generative performance.

Language Modelling • Large Language Model

1,514

HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks

2 code implementations • NeurIPS 2020 • Zhen Dong, Zhewei Yao, Yaohui Cai, Daiyaan Arfeen, Amir Gholami, Michael W. Mahoney, Kurt Keutzer

However, the search space for a mixed-precision quantization is exponential in the number of layers.

Object Detection +1

627

Unsupervised Projection Networks for Generative Adversarial Networks

no code implementations • 30 Sep 2019 • Daiyaan Arfeen, Jesse Zhang

We propose the use of unsupervised learning to train projection networks that project onto the latent space of an already trained generator.

Clustering • Image Super-Resolution

Large batch size training of neural networks with adversarial training and second-order information

1 code implementation • ICLR 2019 • Zhewei Yao, Amir Gholami, Daiyaan Arfeen, Richard Liaw, Joseph Gonzalez, Kurt Keutzer, Michael Mahoney

Our method exceeds the performance of existing solutions in terms of both accuracy and the number of SGD iterations (up to 1% and 5×, respectively).

Second-order methods

