no code implementations • 22 Apr 2024 • Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang
This paper presents a comprehensive survey of the existing literature on efficient LLM inference.
1 code implementation • 2 Apr 2024 • Enshu Liu, Junyi Zhu, Zinan Lin, Xuefei Ning, Matthew B. Blaschko, Sergey Yekhanin, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang
For example, LCSC achieves better performance using one function evaluation (NFE) than the base model does with two NFEs on consistency distillation, and decreases the NFE of diffusion models from 15 to 9 while maintaining generation quality on CIFAR-10.
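At its core, LCSC combines saved training checkpoints linearly to obtain a stronger model. The sketch below illustrates only that core idea; the toy checkpoints and fixed coefficients are placeholders, and the paper searches the coefficients (e.g., with an evolutionary method) rather than fixing them by hand.

```python
# Minimal sketch of linearly combining saved checkpoints (the core idea
# behind LCSC). Checkpoints and coefficients here are illustrative only.
import torch

def combine_checkpoints(state_dicts, coeffs):
    """Return sum_i coeffs[i] * state_dicts[i], key by key."""
    return {k: sum(c * sd[k] for c, sd in zip(coeffs, state_dicts))
            for k in state_dicts[0]}

ckpt_a = {"w": torch.ones(2, 2)}   # pretend checkpoint from one step
ckpt_b = {"w": torch.zeros(2, 2)}  # pretend checkpoint from another step
merged = combine_checkpoints([ckpt_a, ckpt_b], coeffs=[0.7, 0.3])
print(merged["w"])  # every entry is 0.7
```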
no code implementations • 25 Mar 2024 • Lin Zhao, Tianchen Zhao, Zinan Lin, Xuefei Ning, Guohao Dai, Huazhong Yang, Yu Wang
In recent years, there has been significant progress in the development of text-to-image generative models.
1 code implementation • 28 Feb 2024 • Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang
Post-training quantization (PTQ) has emerged as a promising technique to reduce the cost of large language models (LLMs).
1 code implementation • 6 Feb 2024 • Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, Yu Wang
In contrast, the average context lengths of mainstream benchmarks are insufficient (5k-21k tokens), and they suffer from potential knowledge leakage and inaccurate metrics, resulting in biased evaluation.
no code implementations • 8 Jan 2024 • Shulin Zeng, Jun Liu, Guohao Dai, Xinhao Yang, Tianyu Fu, Hongyi Wang, Wenheng Ma, Hanbo Sun, Shiyao Li, Zixiao Huang, Yadong Dai, Jintao Li, Zehao Wang, Ruoyu Zhang, Kairui Wen, Xuefei Ning, Yu Wang
However, existing GPU and transformer-based accelerators cannot efficiently process compressed LLMs, due to the following unresolved challenges: low computational efficiency, underutilized memory bandwidth, and large compilation overheads.
no code implementations • 12 Dec 2023 • Enshu Liu, Xuefei Ning, Huazhong Yang, Yu Wang
In this paper, we propose a unified sampling framework (USF) to systematically study the optional strategies available to solvers.
no code implementations • 28 Nov 2023 • Lin Zhao, Hongxuan Li, Xuefei Ning, Xinru Jiang
Cross-modal steganography is the practice of unobtrusively concealing secret signals in publicly available cover signals of a different modality than the secret signals.
1 code implementation • 28 Jul 2023 • Xuefei Ning, Zinan Lin, Zixuan Zhou, Zifu Wang, Huazhong Yang, Yu Wang
This work aims at decreasing the end-to-end generation latency of large language models (LLMs).
no code implementations • ICCV 2023 • Tianchen Zhao, Xuefei Ning, Ke Hong, Zhongyuan Qiu, Pu Lu, Yali Zhao, Linfeng Zhang, Lipu Zhou, Guohao Dai, Huazhong Yang, Yu Wang
One reason for this high resource consumption is the presence of a large number of redundant background points in LiDAR point clouds, resulting in spatial redundancy in both 3D voxel and dense BEV map representations.
1 code implementation • 15 Jun 2023 • Enshu Liu, Xuefei Ning, Zinan Lin, Huazhong Yang, Yu Wang
Diffusion probabilistic models (DPMs) are a new class of generative models that have achieved state-of-the-art generation quality in various domains.
2 code implementations • NeurIPS 2023 • Zifu Wang, Xuefei Ning, Matthew B. Blaschko
To address this, we introduce Jaccard Metric Losses (JMLs), which are identical to the soft Jaccard loss in standard settings with hard labels but are fully compatible with soft labels.
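For reference, the soft Jaccard loss that JMLs coincide with under hard labels can be written compactly. The sketch below shows one common soft relaxation; the specific JML variants that remain well-behaved under soft labels are defined in the paper itself.

```python
import torch

def soft_jaccard_loss(pred, target, eps=1e-6):
    """One common soft relaxation of the Jaccard index (IoU) as a loss.
    pred: predicted probabilities in [0, 1]; target: hard (0/1) or soft labels."""
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum() - intersection
    return 1.0 - (intersection + eps) / (union + eps)

pred = torch.tensor([0.9, 0.2, 0.7])
hard = torch.tensor([1.0, 0.0, 1.0])
soft = torch.tensor([0.95, 0.05, 0.80])  # e.g., labels after smoothing
print(soft_jaccard_loss(pred, hard).item())
print(soft_jaccard_loss(pred, soft).item())
```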
1 code implementation • 2 Feb 2023 • Junbo Zhao, Xuefei Ning, Enshu Liu, Binxin Ru, Zixuan Zhou, Tianchen Zhao, Chen Chen, Jiajin Zhang, Qingmin Liao, Yu Wang
In the first step, we train different sub-predictors on different types of available low-fidelity information to extract beneficial knowledge as low-fidelity experts.
1 code implementation • 16 Jul 2022 • Zixuan Zhou, Xuefei Ning, Yi Cai, Jiashu Han, Yiping Deng, Yuhan Dong, Huazhong Yang, Yu Wang
Specifically, we train the supernet with a large sharing extent (an easier curriculum) at the beginning and gradually decrease the sharing extent of the supernet (a harder curriculum).
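Such a sharing-extent curriculum can be pictured as a schedule that starts with one shared copy of the parameters and gradually splits them into more groups. The milestones and group counts below are purely illustrative assumptions, not the paper's settings.

```python
# Hypothetical sharing-extent schedule for a supernet: few parameter
# groups (heavy sharing, easy curriculum) early, more groups (less
# sharing, harder curriculum) later. All numbers are illustrative.
def num_parameter_groups(epoch, milestones=(0, 100, 200, 300),
                         groups=(1, 2, 4, 8)):
    """More groups means a smaller sharing extent."""
    n = groups[0]
    for m, g in zip(milestones, groups):
        if epoch >= m:
            n = g
    return n

for e in (0, 150, 350):
    print(e, num_parameter_groups(e))  # -> 1, 2, 8 groups
```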
no code implementations • 5 Apr 2022 • Cheng Liu, Zhen Gao, Siting Liu, Xuefei Ning, Huawei Li, Xiaowei Li
With the rapid advancements of deep learning over the past decade, it is foreseeable that deep learning will be deployed in ever more safety-critical applications such as autonomous driving and robotics.
no code implementations • CVPR 2022 • Tianchen Zhao, Niansong Zhang, Xuefei Ning, He Wang, Li Yi, Yu Wang
We propose CodedVTR (Codebook-based Voxel TRansformer), which improves data efficiency and generalization ability for 3D sparse voxel transformers.
no code implementations • 12 Dec 2021 • Weilin Liu, Ye Mu, Chao Yu, Xuefei Ning, Zhong Cao, Yi Wu, Shuang Liang, Huazhong Yang, Yu Wang
These scenarios correspond to vulnerabilities of the driving policies under test, and are therefore meaningful for further improving those policies.
no code implementations • 29 Sep 2021 • Nianhui Guo, Joseph Bethge, Haojin Yang, Kai Zhong, Xuefei Ning, Christoph Meinel, Yu Wang
Recent works on Binary Neural Networks (BNNs) have made promising progress in narrowing the accuracy gap between BNNs and their 32-bit counterparts, often by relying on specialized model designs with additional 32-bit components.
1 code implementation • 13 Jun 2021 • Nianhui Guo, Joseph Bethge, Haojin Yang, Kai Zhong, Xuefei Ning, Christoph Meinel, Yu Wang
Recent works on Binary Neural Networks (BNNs) have made promising progress in narrowing the accuracy gap between BNNs and their 32-bit counterparts.
no code implementations • AAAI Workshop AdvML 2022 • Yi Cai, Xuefei Ning, Huazhong Yang, Yu Wang
It provides high scalability because the number of paths within an EIO network grows exponentially with the network depth.
no code implementations • CVPR 2022 • Minxue Tang, Xuefei Ning, Yitu Wang, Jingwei Sun, Yu Wang, Hai Li, Yiran Chen
In this work, we propose FedCor -- an FL framework built on a correlation-based client selection strategy, to boost the convergence rate of FL.
1 code implementation • 10 Jan 2021 • Guyue Huang, Jingbo Hu, Yifan He, Jialong Liu, Mingyuan Ma, Zhaoyang Shen, Juejian Wu, Yuanfan Xu, Hengrui Zhang, Kai Zhong, Xuefei Ning, Yuzhe ma, HaoYu Yang, Bei Yu, Huazhong Yang, Yu Wang
With the down-scaling of CMOS technology, the design complexity of very large-scale integration (VLSI) is increasing.
no code implementations • 1 Jan 2021 • Kai Zhong, Xuefei Ning, Tianchen Zhao, Zhenhua Zhu, Shulin Zeng, Guohao Dai, Yu Wang, Huazhong Yang
Through this dynamic precision framework, we can reduce the bit-width of convolution, which accounts for most of the computational cost, while keeping the training process close to full-precision floating-point training.
1 code implementation • 22 Dec 2020 • Xuefei Ning, Junbo Zhao, Wenshuo Li, Tianchen Zhao, Yin Zheng, Huazhong Yang, Yu Wang
In this paper, considering scenarios with capacity budgets, we aim to discover adversarially robust architectures at targeted capacities.
1 code implementation • 25 Nov 2020 • Xuefei Ning, Changcheng Tang, Wenshuo Li, Songyi Yang, Tianchen Zhao, Niansong Zhang, Tianyi Lu, Shuang Liang, Huazhong Yang, Yu Wang
Neural Architecture Search (NAS) has received extensive attention due to its capability to discover neural network architectures in an automated manner.
no code implementations • 21 Nov 2020 • Tianchen Zhao, Xuefei Ning, Xiangsheng Shi, Songyi Yang, Shuang Liang, Peng Lei, Jianfei Chen, Huazhong Yang, Yu Wang
We also design the micro-level search space to strengthen the information flow in BNNs.
no code implementations • 28 Sep 2020 • Xuefei Ning, Wenshuo Li, Zixuan Zhou, Tianchen Zhao, Shuang Liang, Yin Zheng, Huazhong Yang, Yu Wang
A major challenge in NAS is to conduct a fast and accurate evaluation of neural architectures.
1 code implementation • NeurIPS 2021 • Xuefei Ning, Changcheng Tang, Wenshuo Li, Zixuan Zhou, Shuang Liang, Huazhong Yang, Yu Wang
Conducting efficient performance estimations of neural architectures is a major challenge in neural architecture search (NAS).
no code implementations • 31 Jul 2020 • Tong Wu, Xuefei Ning, Wenshuo Li, Ranran Huang, Huazhong Yang, Yu Wang
In this paper, we tackle the issue of physical adversarial examples for object detectors in the wild.
no code implementations • 4 Jun 2020 • Kai Zhong, Xuefei Ning, Guohao Dai, Zhenhua Zhu, Tianchen Zhao, Shulin Zeng, Yu Wang, Huazhong Yang
For training a variety of models on CIFAR-10, using a 1-bit mantissa and a 2-bit exponent is adequate to keep the accuracy loss within 1%.
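To make the format concrete, the sketch below simulates rounding a tensor to a tiny floating-point representation with configurable mantissa and exponent widths; the exponent range, bias, and rounding mode are simplifying assumptions rather than the paper's exact scheme.

```python
import torch

def quantize_tiny_float(x, mant_bits=1, exp_bits=2):
    """Simulate a low-bit float with `mant_bits` mantissa bits and
    `exp_bits` exponent bits (sign kept separately). Exponent bias and
    rounding are simplifying assumptions, not the paper's exact format."""
    sign = torch.sign(x)
    mag = x.abs().clamp(min=1e-12)
    exp = torch.floor(torch.log2(mag))
    exp = exp.clamp(-(2 ** (exp_bits - 1)), 2 ** (exp_bits - 1) - 1)
    mant = mag / torch.pow(2.0, exp)        # mantissa in [1, 2)
    step = 2.0 ** mant_bits
    mant = torch.round(mant * step) / step  # round mantissa to mant_bits
    return sign * mant * torch.pow(2.0, exp)

x = torch.tensor([0.3, -1.7, 2.9])
print(quantize_tiny_float(x))  # tensor([ 0.2500, -1.5000,  3.0000])
```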
1 code implementation • ECCV 2020 • Xuefei Ning, Tianchen Zhao, Wenshuo Li, Peng Lei, Yu Wang, Huazhong Yang
In budgeted pruning, how to distribute the resources across layers (i.e., sparsity allocation) is the key problem.
1 code implementation • ECCV 2020 • Xuefei Ning, Yin Zheng, Tianchen Zhao, Yu Wang, Huazhong Yang
Experimental results on various search spaces confirm GATES's effectiveness in improving the performance predictor.
no code implementations • 20 Mar 2020 • Xuefei Ning, Guangjun Ge, Wenshuo Li, Zhenhua Zhu, Yin Zheng, Xiaoming Chen, Zhen Gao, Yu Wang, Huazhong Yang
By inspecting the discovered architectures, we find that the operation primitives, the weight quantization range, the model capacity, and the connection pattern all influence the fault resilience of NN models.
no code implementations • 18 Jun 2018 • Xuefei Ning, Yin Zheng, Zhuxi Jiang, Yu Wang, Huazhong Yang, Junzhou Huang
Moreover, we also propose HiTM-VAE, where the document-specific topic distributions are generated in a hierarchical manner.
no code implementations • 14 May 2018 • Wenshuo Li, Jincheng Yu, Xuefei Ning, Pengjun Wang, Qi Wei, Yu Wang, Huazhong Yang
In this paper, we propose a hardware-software collaborative attack framework that injects hidden neural network Trojans; it acts as a back-door without requiring manipulation of input images and is flexible across different scenarios.
no code implementations • ICLR 2018 • Xuefei Ning, Yin Zheng, Zhuxi Jiang, Yu Wang, Huazhong Yang, Junzhou Huang
On the other hand, unlike other BNP topic models, the inference of iTM-VAE is modeled by neural networks, which have rich representation capacity and can be computed in a simple feed-forward manner.