Search Results for author: Zihao Ye

Found 12 papers, 8 papers with code

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

1 code implementation29 Oct 2023 Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci

To maximize LLMs' serving throughput, we introduce Atom, a low-bit quantization method that achieves high throughput improvements with negligible accuracy loss.

Quantization Sentiment Analysis

Punica: Multi-Tenant LoRA Serving

1 code implementation28 Oct 2023 Lequn Chen, Zihao Ye, Yongji Wu, Danyang Zhuo, Luis Ceze, Arvind Krishnamurthy

Our scheduler consolidates multi-tenant LoRA serving workloads in a shared GPU cluster.

SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning

2 code implementations11 Jul 2022 Zihao Ye, Ruihang Lai, Junru Shao, Tianqi Chen, Luis Ceze

We propose SparseTIR, a sparse tensor compilation abstraction that offers composable formats and composable transformations for deep learning workloads.

TensorIR: An Abstraction for Automatic Tensorized Program Optimization

2 code implementations9 Jul 2022 Siyuan Feng, Bohan Hou, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, Tianqi Chen

Finally, we build an end-to-end framework on top of our abstraction to automatically optimize deep learning models for given tensor computation primitives.

BIG-bench Machine Learning

Kokoyi: Executable LaTeX for End-to-end Deep Learning

no code implementations29 Sep 2021 Minjie Wang, Haoming Lu, Yu Gai, Lesheng Jin, Zihao Ye, Zheng Zhang

Despite substantial efforts from the deep learning system community to relieve researchers and practitioners from the burden of implementing models with ever-growing complexity, a considerable lingual gap remains between developing models in the language of mathematics and implementing them in the languages of computer.

Math Translation

FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems

no code implementations26 Aug 2020 Yuwei Hu, Zihao Ye, Minjie Wang, Jiali Yu, Da Zheng, Mu Li, Zheng Zhang, Zhiru Zhang, Yida Wang

FeatGraph provides a flexible programming interface to express diverse GNN models by composing coarse-grained sparse templates with fine-grained user-defined functions (UDFs) on each vertex/edge.

DGL-KE: Training Knowledge Graph Embeddings at Scale

1 code implementation18 Apr 2020 Da Zheng, Xiang Song, Chao Ma, Zeyuan Tan, Zihao Ye, Jin Dong, Hao Xiong, Zheng Zhang, George Karypis

Experiments on knowledge graphs consisting of over 86M nodes and 338M edges show that DGL-KE can compute embeddings in 100 minutes on an EC2 instance with 8 GPUs and 30 minutes on an EC2 cluster with 4 machines with 48 cores/machine.

Distributed, Parallel, and Cluster Computing

Transformer on a Diet

1 code implementation14 Feb 2020 Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alexander J. Smola

Transformer has been widely used thanks to its ability to capture sequence information in an efficient way.

Language Modelling

On-line Dialogue Policy Learning with Companion Teaching

no code implementations EACL 2017 Lu Chen, Runzhe Yang, Cheng Chang, Zihao Ye, Xiang Zhou, Kai Yu

On-line dialogue policy learning is the key for building evolvable conversational agent in real world scenarios.

Dialogue Management

Cannot find the paper you are looking for? You can Submit a new open access paper.