Search Results for author: Jianfei Chen

Found 41 papers, 20 papers with code

SparseDM: Toward Sparse Efficient Diffusion Models

no code implementations • 16 Apr 2024 • Kafeng Wang, Jianfei Chen, He Li, Zhenpeng Mi, Jun Zhu

Diffusion models have been extensively used in data generation tasks and are recognized as one of the best generative models.

Accelerating Transformer Pre-Training with 2:4 Sparsity

no code implementations • 2 Apr 2024 • Yuezhou Hu, Kang Zhao, Weiyu Huang, Jianfei Chen, Jun Zhu

Training large Transformers is slow, but recent innovations in GPU architecture give us an advantage.

Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization

no code implementations • 19 Mar 2024 • Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen, Jun Zhu

Moreover, for a standard transformer block, our method offers an end-to-end training speedup of 1.42x and a 1.49x memory reduction compared to the FP16 baseline.

Quantization

Efficient Backpropagation with Variance-Controlled Adaptive Sampling

1 code implementation • 27 Feb 2024 • Ziteng Wang, Jianfei Chen, Jun Zhu

On all the tasks, VCAS can preserve the original training loss trajectory and validation accuracy with an up to 73.87% FLOPs reduction of BP and 49.58% FLOPs reduction of the whole training process.

C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory

no code implementations • 26 Feb 2024 • Tianjiao Luo, Tim Pearce, Huayu Chen, Jianfei Chen, Jun Zhu

Generative Adversarial Imitation Learning (GAIL) trains a generative policy to mimic a demonstrator.

Imitation Learning, Reinforcement Learning (RL)

DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics

1 code implementation • NeurIPS 2023 • Kaiwen Zheng, Cheng Lu, Jianfei Chen, Jun Zhu

In this work, we propose a novel formulation towards the optimal parameterization during sampling that minimizes the first-order discretization error of the ODE solution.

Image Generation

Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting

no code implementations • 18 Oct 2023 • Guande He, Peng Cui, Jianfei Chen, WenBo Hu, Jun Zhu

Despite the significant progress made in practical applications of aligned language models (LMs), they tend to be overconfident in output answers compared to the corresponding pre-trained LMs.

Multiple-choice

Training Transformers with 4-bit Integers

1 code implementation • NeurIPS 2023 • Haocheng Xi, Changhao Li, Jianfei Chen, Jun Zhu

To achieve this, we carefully analyze the specific structures of activation and gradients in transformers to propose dedicated quantizers for them.

Image Classification, Machine Translation +1

124

Stabilizing GANs' Training with Brownian Motion Controller

no code implementations • 18 Jun 2023 • Tianjiao Luo, Ziyu Zhu, Jianfei Chen, Jun Zhu

We theoretically prove that the training process of DiracGANs-BMC is globally exponentially stable and derive bounds on the rate of convergence.

Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models

1 code implementation • 30 May 2023 • Guande He, Jianfei Chen, Jun Zhu

In light of these observations, we evaluate the calibration of several methods that preserve pre-trained features and show that preserving pre-trained features can improve the calibration of fine-tuned language models.

Language Modelling, Masked Language Modeling +1

Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs

1 code implementation • 6 May 2023 • Kaiwen Zheng, Cheng Lu, Jianfei Chen, Jun Zhu

The probability flow ordinary differential equation (ODE) of diffusion models (i.e., diffusion ODEs) is a particular case of continuous normalizing flows (CNFs), which enables deterministic inference and exact likelihood evaluation.

Ranked #1 on Image Generation on ImageNet 32x32 (bpd metric)

Image Generation

Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning

3 code implementations • 25 Apr 2023 • Cheng Lu, Huayu Chen, Jianfei Chen, Hang Su, Chongxuan Li, Jun Zhu

The main challenge for this setting is that the intermediate guidance during the diffusion sampling procedure, which is jointly defined by the sampling distribution and the energy function, is unknown and is hard to estimate.

D4RL, Image Generation +1

2,548

DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models

1 code implementation • 2 Nov 2022 • Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu

The commonly-used fast sampler for guided sampling is DDIM, a first-order diffusion ODE solver that generally needs 100 to 250 steps for high-quality samples.

Text-to-Image Generation

1,385

GACT: Activation Compressed Training for Generic Network Architectures

1 code implementation • 22 Jun 2022 • Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael Mahoney, Alvin Cheung

Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reduce training memory footprint.

Fast Lossless Neural Compression with Integer-Only Discrete Flows

1 code implementation • 17 Jun 2022 • Siyu Wang, Jianfei Chen, Chongxuan Li, Jun Zhu, Bo Zhang

In this work, we propose Integer-only Discrete Flows (IODF), an efficient neural compressor with integer-only arithmetic.

Quantization

Maximum Likelihood Training for Score-Based Diffusion ODEs by High-Order Denoising Score Matching

1 code implementation • 16 Jun 2022 • Cheng Lu, Kaiwen Zheng, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu

To fill this gap, we show that the negative likelihood of the ODE can be bounded by controlling the first, second, and third-order score matching errors; and we further present a novel high-order denoising score matching method to enable maximum likelihood training of score-based diffusion ODEs.

Denoising

DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps

2 code implementations • 2 Jun 2022 • Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu

In this work, we propose an exact formulation of the solution of diffusion ODEs.

1,385

Deep Ensemble as a Gaussian Process Approximate Posterior

no code implementations • 30 Apr 2022 • Zhijie Deng, Feng Zhou, Jianfei Chen, Guoqiang Wu, Jun Zhu

In this way, we relate DE to Bayesian inference to enjoy reliable Bayesian uncertainty.

Bayesian Inference, Uncertainty Quantification

Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

1 code implementation • 14 Mar 2022 • Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, Maosong Sun

This necessitates a new branch of research focusing on the parameter-efficient adaptation of PLMs, dubbed delta tuning in this paper.

Text Classification

938

Deep Ensemble as a Gaussian Process Posterior

no code implementations • 29 Sep 2021 • Zhijie Deng, Feng Zhou, Jianfei Chen, Guoqiang Wu, Jun Zhu

Deep Ensemble (DE) is a flexible, feasible, and effective alternative to Bayesian neural networks (BNNs) for uncertainty estimation in deep learning.

Variational Inference

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

4 code implementations • 29 Apr 2021 • Jianfei Chen, Lianmin Zheng, Zhewei Yao, Dequan Wang, Ion Stoica, Michael W. Mahoney, Joseph E. Gonzalez

On all these tasks, ActNN compresses the activation to 2 bits on average, with negligible accuracy loss.

Quantization

194

Implicit Normalizing Flows

1 code implementation • ICLR 2021 • Cheng Lu, Jianfei Chen, Chongxuan Li, Qiuhao Wang, Jun Zhu

Through theoretical analysis, we show that the function space of ImpFlow is strictly richer than that of ResFlows.

BARS: Joint Search of Cell Topology and Layout for Accurate and Efficient Binary ARchitectures

no code implementations • 21 Nov 2020 • Tianchen Zhao, Xuefei Ning, Xiangsheng Shi, Songyi Yang, Shuang Liang, Peng Lei, Jianfei Chen, Huazhong Yang, Yu Wang

We also design the micro-level search space to strengthen the information flow for BNN.

Neural Architecture Search

A Statistical Framework for Low-bitwidth Training of Deep Neural Networks

1 code implementation • NeurIPS 2020 • Jianfei Chen, Yu Gai, Zhewei Yao, Michael W. Mahoney, Joseph E. Gonzalez

We show that the FQT gradient is an unbiased estimator of the QAT gradient, and we discuss the impact of gradient quantization on its variance.

Ranked #9 on Semantic Textual Similarity on STS Benchmark

Linguistic Acceptability, Natural Language Inference +3

BEV-Seg: Bird's Eye View Semantic Segmentation Using Geometry and Semantic Point Cloud

no code implementations • 19 Jun 2020 • Mong H. Ng, Kaahan Radia, Jianfei Chen, Dequan Wang, Ionel Gog, Joseph E. Gonzalez

Bird's-eye-view (BEV) is a powerful and widely adopted representation for road scenes that captures surrounding objects and their spatial locations, along with overall context in the scene.

Bird's-Eye View Semantic Segmentation, Transfer Learning

VFlow: More Expressive Generative Flows with Variational Data Augmentation

1 code implementation • ICML 2020 • Jianfei Chen, Cheng Lu, Biqi Chenli, Jun Zhu, Tian Tian

Generative flows are promising tractable models for density modeling that define probabilistic distributions with invertible transformations.

Ranked #30 on Image Generation on CIFAR-10 (bits/dimension metric)

Density Estimation, Image Generation +2

Stochastic Expectation Maximization with Variance Reduction

1 code implementation • NeurIPS 2018 • Jianfei Chen, Jun Zhu, Yee Whye Teh, Tong Zhang

However, sEM has a slower asymptotic convergence rate than batch EM, and requires a decreasing sequence of step sizes, which is difficult to tune.

Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems

no code implementations • 10 Apr 2018 • Zihao Xiao, Jianfei Chen, Jun Zhu

We also propose an extension to train pLSI and a method to prune the network to obey the limited fan-in of some NMSs.

Stochastic Optimization, Topic Models

Stochastic Training of Graph Convolutional Networks

no code implementations • ICLR 2018 • Jianfei Chen, Jun Zhu

Previous attempts to reduce the receptive field size by subsampling neighbors do not have any convergence guarantee, and their receptive field size per node is still on the order of hundreds.

Population Matching Discrepancy and Applications in Deep Learning

no code implementations • NeurIPS 2017 • Jianfei Chen, Chongxuan Li, Yizhong Ru, Jun Zhu

In this paper, we propose population matching discrepancy (PMD) for estimating the distribution distance based on samples, as well as an algorithm to learn the parameters of the distributions using PMD as an objective.

Domain Adaptation

Stochastic Training of Graph Convolutional Networks with Variance Reduction

2 code implementations • ICML 2018 • Jianfei Chen, Jun Zhu, Le Song

Previous attempts to reduce the receptive field size by subsampling neighbors do not have a convergence guarantee, and their receptive field size per node is still on the order of hundreds.

10,560

ZhuSuan: A Library for Bayesian Deep Learning

1 code implementation • 18 Sep 2017 • Jiaxin Shi, Jianfei Chen, Jun Zhu, Shengyang Sun, Yucen Luo, Yihong Gu, Yuhao Zhou

In this paper we introduce ZhuSuan, a Python probabilistic programming library for Bayesian deep learning, which conjoins the complementary advantages of Bayesian methods and deep learning.

Probabilistic Programming, regression

2,198

Scalable Inference for Nested Chinese Restaurant Process Topic Models

no code implementations • 23 Feb 2017 • Jianfei Chen, Jun Zhu, Jie Lu, Shixia Liu

Finally, we propose an efficient distributed implementation of PCGS through vectorization, pre-processing, and a careful design of the concurrent data structures and communication strategy.

Topic Models, Variational Inference

SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs

no code implementations • 8 Oct 2016 • Kaiwei Li, Jianfei Chen, WenGuang Chen, Jun Zhu

Latent Dirichlet Allocation (LDA) is a popular tool for analyzing discrete count data such as text and images.

Topic Models

Scaling up Dynamic Topic Models

1 code implementation • 19 Feb 2016 • Arnab Bhadury, Jianfei Chen, Jun Zhu, Shixia Liu

Dynamic topic models (DTMs) are very effective in discovering topics and capturing their evolution trends in time series data.

Time Series, Time Series Analysis +1

Streaming Gibbs Sampling for LDA Model

no code implementations • 6 Jan 2016 • Yang Gao, Jianfei Chen, Jun Zhu

Streaming variational Bayes (SVB) is successful in learning LDA models in an online manner.

WarpLDA: a Cache Efficient O(1) Algorithm for Latent Dirichlet Allocation

no code implementations • 29 Oct 2015 • Jianfei Chen, Kaiwei Li, Jun Zhu, WenGuang Chen

We then develop WarpLDA, an LDA sampler which achieves both the best O(1) time complexity per token and the best O(K) scope of random access.

Dropout Training for SVMs with Data Augmentation

no code implementations • 10 Aug 2015 • Ning Chen, Jun Zhu, Jianfei Chen, Ting Chen

Empirical results on several real datasets demonstrate the effectiveness of dropout training in significantly boosting the classification accuracy of both linear and nonlinear SVMs.

Data Augmentation, Representation Learning

Big Learning with Bayesian Methods

no code implementations • 24 Nov 2014 • Jun Zhu, Jianfei Chen, Wen-Bo Hu, Bo Zhang

Explosive growth in data and availability of cheap computing resources have sparked increasing interest in Big learning, an emerging subfield that studies scalable machine learning algorithms, systems, and applications with Big Data.

Bayesian Inference, BIG-bench Machine Learning +1

Dropout Training for Support Vector Machines

no code implementations • 16 Apr 2014 • Ning Chen, Jun Zhu, Jianfei Chen, Bo Zhang

To deal with the intractable expectation of the non-smooth hinge loss under corrupting distributions, we develop an iteratively re-weighted least squares (IRLS) algorithm by exploring data augmentation techniques.

Data Augmentation

Scalable Inference for Logistic-Normal Topic Models

no code implementations • NeurIPS 2013 • Jianfei Chen, Jun Zhu, Zi Wang, Xun Zheng, Bo Zhang

Logistic-normal topic models can effectively discover correlation structures among latent topics.

Data Augmentation, Topic Models
