Search Results for author: Yuhao Zhu

Found 32 papers, 17 papers with code

Characterizing Soft-Error Resiliency in Arm's Ethos-U55 Embedded Machine Learning Accelerator

no code implementations • 14 Apr 2024 • Abhishek Tyagi, Reiley Jeyapaul, Chuteng Zhu, Paul Whatmough, Yuhao Zhu

As Neural Processing Units (NPUs) or accelerators are increasingly deployed in a variety of applications, including safety-critical applications such as autonomous vehicles and medical imaging, it is critical to understand the fault-tolerance nature of the NPUs.

Autonomous Vehicles, Navigate

Multiscale Dynamic Graph Representation for Biometric Recognition with Occlusions

1 code implementation • 27 Jul 2023 • Min Ren, Yunlong Wang, Yuhao Zhu, Kunbo Zhang, Zhenan Sun

Occlusion is a common problem with biometric recognition in the wild.

Graph Matching

Autonomy 2.0: The Quest for Economies of Scale

no code implementations • 8 Jul 2023 • Shuang Wu, Bo Yu, Shaoshan Liu, Yuhao Zhu

With the advancement of robotics and AI technologies in the past decade, we have now entered the age of autonomous machines.

Autonomous Vehicles

Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators

no code implementations • 5 Dec 2022 • Abhishek Tyagi, Yiming Gan, Shaoshan Liu, Bo Yu, Paul Whatmough, Yuhao Zhu

As Deep Neural Networks (DNNs) are increasingly deployed in safety-critical and privacy-sensitive applications such as autonomous driving and biometric authentication, it is critical to understand the fault-tolerance nature of DNNs.

Autonomous Driving

ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization

1 code implementation • 30 Aug 2022 • Cong Guo, Chen Zhang, Jingwen Leng, Zihan Liu, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu

In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads.

Quantization

Perturbation Inactivation Based Adversarial Defense for Face Recognition

1 code implementation • 13 Jul 2022 • Min Ren, Yuhao Zhu, Yunlong Wang, Zhenan Sun

A straightforward approach is to inactivate the adversarial perturbations so that they can be easily handled as general perturbations.

Adversarial Attack, Adversarial Defense, +1

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

1 code implementation • ICLR 2022 • Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo

This paper proposes an on-the-fly DFQ framework with sub-second quantization time, called SQuant, which can quantize networks on inference-only devices with low computation and memory requirements.

Data Free Quantization

154

Block-Skim: Efficient Question Answering for Transformer

1 code implementation • 16 Dec 2021 • Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo, Yuhao Zhu

We further prune the hidden states corresponding to the unnecessary positions early in lower layers, achieving significant inference-time speedup.

Extractive Question-Answering, Question Answering

Dataflow Accelerator Architecture for Autonomous Machine Computing

no code implementations • 15 Sep 2021 • Shaoshan Liu, Yuhao Zhu, Bo Yu, Jean-Luc Gaudiot, Guang R. Gao

Commercial autonomous machines are a thriving sector, one that is likely the next ubiquitous computing platform after personal computers (PCs), cloud computing, and mobile computing.

Cloud Computing

S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration

no code implementations • 16 Jul 2021 • Zhi-Gang Liu, Paul N. Whatmough, Yuhao Zhu, Matthew Mattina

We propose to exploit structured sparsity, more specifically, Density Bound Block (DBB) sparsity for both weights and activations.

One Shot Face Swapping on Megapixels

1 code implementation • CVPR 2021 • Yuhao Zhu, Qi Li, Jian Wang, Chengzhong Xu, Zhenan Sun

Extensive experiments demonstrate the superiority of MegaFS, and the first megapixel-level face swapping database is released for research on DeepFake detection and face image editing in the public domain.

Ranked #8 on Face Swapping on FaceForensics++

DeepFake Detection, Disentanglement, +2

306

Fast and Accurate: Video Enhancement using Sparse Depth

no code implementations • 15 Mar 2021 • Yu Feng, Patrick Hansen, Paul N. Whatmough, Guoyu Lu, Yuhao Zhu

This paper presents a general framework to build fast and accurate algorithms for video enhancement tasks such as super-resolution, deblurring, and denoising.

Deblurring, Denoising, +4

Block Skim Transformer for Efficient Question Answering

no code implementations • 1 Jan 2021 • Yue Guan, Jingwen Leng, Yuhao Zhu, Minyi Guo

Following this idea, we propose Block Skim Transformer (BST) to improve and accelerate the processing of transformer QA models.

Language Modelling, Model Compression, +1

Eudoxus: Characterizing and Accelerating Localization in Autonomous Machines

no code implementations • 2 Dec 2020 • Yiming Gan, Yu Bo, Boyuan Tian, Leimeng Xu, Wei Hu, Shaoshan Liu, Qiang Liu, Yanjun Zhang, Jie Tang, Yuhao Zhu

We develop and commercialize autonomous machines, such as logistic robots and self-driving cars, around the globe.

Self-Driving Cars, Hardware Architecture

End-to-End Framework for Efficient Deep Learning Using Metasurfaces Optics

1 code implementation • 23 Nov 2020 • Carlos Mauricio Villegas Burgos, Tianqi Yang, Nick Vamivakas, Yuhao Zhu

Deep learning using Convolutional Neural Networks (CNNs) has been shown to significantly outperform many conventional vision algorithms.

A Survey of FPGA-Based Robotic Computing

no code implementations • 13 Sep 2020 • Zishen Wan, Bo Yu, Thomas Yuang Li, Jie Tang, Yuhao Zhu, Yu Wang, Arijit Raychowdhury, Shaoshan Liu

On the other hand, FPGA-based robotic accelerators are becoming increasingly competitive alternatives, especially in latency-critical and power-limited scenarios.

Autonomous Vehicles

Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity

1 code implementation • 29 Aug 2020 • Cong Guo, Bo Yang Hsueh, Jingwen Leng, Yuxian Qiu, Yue Guan, Zehuan Wang, Xiaoying Jia, Xipeng Li, Minyi Guo, Yuhao Zhu

Network pruning can reduce the high computation cost of deep neural network (DNN) models.

Network Pruning

138

Mesorasi: Architecture Support for Point Cloud Analytics via Delayed-Aggregation

1 code implementation • 16 Aug 2020 • Yu Feng, Boyuan Tian, Tiancheng Xu, Paul Whatmough, Yuhao Zhu

Point cloud analytics is poised to become a key workload on battery-powered embedded and mobile platforms in a wide range of emerging application domains, such as autonomous driving, robotics, and augmented reality, where efficiency is paramount.

Autonomous Driving

Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration

no code implementations • 18 Feb 2020 • Cong Guo, Yangjie Zhou, Jingwen Leng, Yuhao Zhu, Zidong Du, Quan Chen, Chao Li, Bin Yao, Minyi Guo

We propose Simultaneous Multi-mode Architecture (SMA), a novel architecture design and execution model that offers general-purpose programmability on DNN accelerators in order to accelerate end-to-end applications.

Tigris: Architecture and Algorithms for 3D Perception in Point Clouds

1 code implementation • 16 Nov 2019 • Tiancheng Xu, Boyuan Tian, Yuhao Zhu

While KD-tree search is inherently sequential, we propose an acceleration-amenable data structure and search algorithm that exposes different forms of parallelism of KD-tree search in the context of point cloud registration.

3D Reconstruction, Point Cloud Registration, +1

ASV: Accelerated Stereo Vision System

2 code implementations • 15 Nov 2019 • Yu Feng, Paul Whatmough, Yuhao Zhu

The key to ASV is to exploit unique characteristics inherent to stereo vision, and apply stereo-specific optimizations, both algorithmically and computationally.

Stereo Matching

Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-based Approach

1 code implementation • CVPR 2020 • Haichuan Yang, Shupeng Gui, Yuhao Zhu, Ji Liu

A key parameter that all existing compression techniques are sensitive to is the compression ratio (e.g., pruning sparsity, quantization bitwidth) of each layer.

Neural Network Compression, Quantization

Adversarial Defense Through Network Profiling Based Path Extraction

no code implementations • CVPR 2019 • Yuxian Qiu, Jingwen Leng, Cong Guo, Quan Chen, Chao Li, Minyi Guo, Yuhao Zhu

Recently, researchers have started decomposing deep neural network models according to their semantics or functions.

Adversarial Defense

Joint Iris Segmentation and Localization Using Deep Multi-task Learning Framework

1 code implementation • 31 Jan 2019 • Caiyong Wang, Yuhao Zhu, Yunfan Liu, Ran He, Zhenan Sun

In this paper, we propose a deep multi-task learning framework, named IrisParseNet, to exploit the inherent correlations between the pupil, iris, and sclera to boost the performance of iris segmentation and localization in a unified model.

Ranked #1 on Iris Segmentation on CASIA

Iris Segmentation, Multi-Task Learning, +1

ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model

2 code implementations • CVPR 2019 • Haichuan Yang, Yuhao Zhu, Ji Liu

The energy estimate model allows us to formulate DNN compression as a constrained optimization that minimizes the DNN loss function over the energy constraint.

Neural Network Compression, regression

Recognizing Partial Biometric Patterns

1 code implementation • 17 Oct 2018 • Lingxiao He, Zhenan Sun, Yuhao Zhu, Yunbo Wang

Biometric recognition on partially captured targets is challenging, since only a few partial observations of objects are available for matching.

Dictionary Learning, Face Recognition, +1

169

SCALE-Sim: Systolic CNN Accelerator

8 code implementations • 16 Oct 2018 • Ananda Samajdar, Yuhao Zhu, Paul Whatmough, Matthew Mattina, Tushar Krishna

Systolic Arrays are one of the most popular compute substrates within Deep Learning accelerators today, as they provide extremely high efficiency for running dense matrix multiplications.

Distributed, Parallel, and Cluster Computing Hardware Architecture

314

Effective Path: Know the Unknowns of Neural Network

no code implementations • 27 Sep 2018 • Yuxian Qiu, Jingwen Leng, Yuhao Zhu, Quan Chen, Chao Li, Minyi Guo

Despite their enormous success, there is still no solid understanding of the working mechanism of deep neural networks.

Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking

1 code implementation • ICLR 2019 • Haichuan Yang, Yuhao Zhu, Ji Liu

Deep Neural Networks (DNNs) are increasingly deployed in highly energy-constrained environments such as autonomous drones and wearable devices, while at the same time they must operate in real time.

Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision

no code implementations • 29 Mar 2018 • Yuhao Zhu, Anand Samajdar, Matthew Mattina, Paul Whatmough

Specifically, we propose to expose the motion data that is naturally generated by the Image Signal Processor (ISP) early in the vision pipeline to the CNN engine.

Cloud No Longer a Silver Bullet, Edge to the Rescue

no code implementations • 15 Feb 2018 • Yuhao Zhu, Gu-Yeon Wei, David Brooks

This paper takes the position that, while cognitive computing today relies heavily on the cloud, we will soon see a paradigm shift where cognitive computing primarily happens on network edges.

Position

Mobile Machine Learning Hardware at ARM: A Systems-on-Chip (SoC) Perspective

no code implementations • 19 Jan 2018 • Yuhao Zhu, Matthew Mattina, Paul Whatmough

Machine learning is playing an increasingly significant role in emerging mobile application domains such as AR/VR, ADAS, etc.

BIG-bench Machine Learning
