Search Results for author: Qizheng Zhang

Found 4 papers, 1 paper with code

CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving

1 code implementation • 11 Oct 2023 • YuHan Liu, Hanchen Li, Yihua Cheng, Siddhant Ray, YuYang Huang, Qizheng Zhang, Kuntai Du, Jiayi Yao, Shan Lu, Ganesh Ananthanarayanan, Michael Maire, Henry Hoffmann, Ari Holtzman, Junchen Jiang

Compared to the recent systems that reuse the KV cache, CacheGen reduces the KV cache size by 3.7-4.3x and the total delay in fetching and processing contexts by 2.7-3.2x while having negligible impact on the LLM response quality in accuracy or perplexity.

Language Modelling Quantization

OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation

no code implementations • 3 Oct 2023 • Kuntai Du, YuHan Liu, Yitian Hao, Qizheng Zhang, Haodong Wang, YuYang Huang, Ganesh Ananthanarayanan, Junchen Jiang

While the high demand for network bandwidth and GPU resources could be substantially reduced by optimally adapting the configuration knobs, such as video resolution and frame rate, current adaptation techniques fail to meet three requirements simultaneously: adapting configurations (i) with minimal extra GPU or bandwidth overhead, (ii) to reach near-optimal decisions based on how the data affects the final DNN's accuracy, and (iii) across a range of configuration knobs.

Object Detection

GRACE: Loss-Resilient Real-Time Video through Neural Codecs

no code implementations • 21 May 2023 • Yihua Cheng, Ziyi Zhang, Hanchen Li, Anton Arapin, Yue Zhang, Qizheng Zhang, YuHan Liu, Xu Zhang, Francis Y. Yan, Amrita Mazumdar, Nick Feamster, Junchen Jiang

In real-time video communication, retransmitting lost packets over high-latency networks is not viable due to strict latency requirements.

AccMPEG: Optimizing Video Encoding for Video Analytics

no code implementations • 26 Apr 2022 • Kuntai Du, Qizheng Zhang, Anton Arapin, Haodong Wang, Zhengxu Xia, Junchen Jiang

This paper presents AccMPEG, a new video encoding and streaming system that meets all three requirements.

Object Detection +1
