Search Results for author: Wei-Lin Chiang

Found 9 papers, 8 papers with code

Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

1 code implementation • 22 Apr 2024 • Tyler Griggs, Xiaoxuan Liu, Jiaxiang Yu, Doyoung Kim, Wei-Lin Chiang, Alvin Cheung, Ion Stoica

Within this space, we show that there is not a linear relationship between GPU cost and performance, and identify three key LLM service characteristics that significantly affect which GPU type is the most cost effective: model request size, request rate, and latency service-level objective (SLO).

Language Modelling, Large Language Model

LLM-Assisted Code Cleaning For Training Accurate Code Generators

no code implementations • 25 Nov 2023 • Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica

In this work, we investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system.

Code Generation

Rethinking Benchmark and Contamination for Language Models with Rephrased Samples

1 code implementation • 8 Nov 2023 • Shuo Yang, Wei-Lin Chiang, Lianmin Zheng, Joseph E. Gonzalez, Ion Stoica

Many have raised concerns about the trustworthiness of public benchmarks due to potential contamination in pre-training or fine-tuning datasets.

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

1 code implementation • 21 Sep 2023 • Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Hao Zhang

Studying how people interact with large language models (LLMs) in real-world scenarios is increasingly important due to their widespread use in various applications.

Chatbot, Instruction Following

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

5 code implementations • NeurIPS 2023 • Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, Ion Stoica

Evaluating large language model (LLM) based chat assistants is challenging due to their broad capabilities and the inadequacy of existing benchmarks in measuring human preferences.

Chatbot, Language Modelling (+2 more)

Balsa: Learning a Query Optimizer Without Expert Demonstrations

1 code implementation • 5 Jan 2022 • Zongheng Yang, Wei-Lin Chiang, Sifei Luan, Gautam Mittal, Michael Luo, Ion Stoica

Query optimizers are a performance-critical component in every database system.

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

6 code implementations • KDD 2019 • Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, Cho-Jui Hsieh

Furthermore, Cluster-GCN allows us to train much deeper GCN without much time and memory overhead, which leads to improved prediction accuracy: using a 5-layer Cluster-GCN, we achieve state-of-the-art test F1 score 99.36 on the PPI dataset, while the previous best result was 98.71 by [16].

Clustering, Computational Efficiency (+4 more)
