Search Results for author: Zebin Yang

Found 10 papers, 4 papers with code

FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference

no code implementations • 25 May 2024 • Chenqi Lin, Tianshi Xu, Zebin Yang, Runsheng Wang, Ru Huang, Meng Li

We observe the overhead mainly comes from the neglect of 1) the one-hot nature of user queries and 2) the robustness of the embedding table to low bit-width quantization noise.

Quantization

ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding

no code implementations • 21 Feb 2024 • Shuzhang Zhong, Zebin Yang, Meng Li, Ruihao Gong, Runsheng Wang, Ru Huang

Additionally, it introduces a dynamic token tree generation algorithm that balances the computation and parallelism of the verification phase in real time and maximizes overall efficiency across different batch sizes, sequence lengths, and tasks.

AttentionLego: An Open-Source Building Block For Spatially-Scalable Large Language Model Accelerator With Processing-In-Memory Technology

no code implementations • 21 Jan 2024 • Rongqing Cong, Wenyang He, Mingxuan Li, Bangning Luo, Zebin Yang, Yuchao Yang, Ru Huang, Bonan Yan

Large language models (LLMs) with Transformer architectures have become phenomenal in natural language processing, multimodal generative artificial intelligence, and agent-oriented artificial intelligence.

Language Modelling, Large Language Model

PiML Toolbox for Interpretable Machine Learning Model Development and Diagnostics

1 code implementation • 7 May 2023 • Agus Sudjianto, Aijun Zhang, Zebin Yang, Yu Su, Ningzhou Zeng

PiML (read $\pi$-ML, /ˈpaɪ ˈɛm ˈɛl/) is an integrated and open-access Python toolbox for interpretable machine learning model development and model diagnostics.

Fairness, Interpretable Machine Learning

Explainable Recommendation Systems by Generalized Additive Models with Manifest and Latent Interactions

no code implementations • 15 Dec 2020 • Yifeng Guo, Yu Su, Zebin Yang, Aijun Zhang

In this paper, we propose an explainable recommendation system based on a generalized additive model with manifest and latent interactions (GAMMLI).

Additive models, Collaborative Filtering, and 2 more

Unwrapping The Black Box of Deep ReLU Networks: Interpretability, Diagnostics, and Simplification

1 code implementation • 8 Nov 2020 • Agus Sudjianto, William Knauth, Rahul Singh, Zebin Yang, Aijun Zhang

We propose the local linear profile plot and other visualization methods for interpretation and diagnostics, and an effective merging strategy for network simplification.

Hyperparameter Optimization via Sequential Uniform Designs

2 code implementations • 8 Sep 2020 • Zebin Yang, Aijun Zhang

Hyperparameter optimization (HPO) plays a central role in automated machine learning (AutoML).

Hyperparameter Optimization

An Effective and Efficient Initialization Scheme for Training Multi-layer Feedforward Neural Networks

no code implementations • 16 May 2020 • Zebin Yang, Hengtao Zhang, Agus Sudjianto, Aijun Zhang

Network initialization is the first and critical step for training neural networks.

GAMI-Net: An Explainable Neural Network based on Generalized Additive Models with Structured Interactions

2 code implementations • 16 Mar 2020 • Zebin Yang, Aijun Zhang, Agus Sudjianto

The lack of interpretability is an inevitable problem when using neural network models in real applications.

Additive models

Enhancing Explainability of Neural Networks through Architecture Constraints

no code implementations • 12 Jan 2019 • Zebin Yang, Aijun Zhang, Agus Sudjianto

It leads to an explainable neural network (xNN) with a superior balance between prediction performance and model interpretability.
