1 code implementation • 28 Mar 2024 • Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Song Han
By squeezing deep learning models into billions of IoT devices and microcontrollers (MCUs), we expand the scope of AI applications and enable ubiquitous intelligence.
2 code implementations • 12 Dec 2023 • Ji Lin, Hongxu Yin, Wei Ping, Yao Lu, Pavlo Molchanov, Andrew Tao, Huizi Mao, Jan Kautz, Mohammad Shoeybi, Song Han
Visual language models (VLMs) have progressed rapidly with the recent success of large language models.
Ranked #23 on Visual Question Answering on MM-Vet
no code implementations • 1 Dec 2023 • Qisong Li, Ji Lin, Sijia Wei, Neng Liu
Recent studies focus on embedding learning over knowledge graphs, which map entities and relations in knowledge graphs into low-dimensional vector spaces.
no code implementations • 26 Oct 2023 • Ligeng Zhu, Lanxiang Hu, Ji Lin, Wei-Chen Wang, Wei-Ming Chen, Chuang Gan, Song Han
On-device learning and efficient fine-tuning enable continuous and privacy-preserving customization (e.g., locally fine-tuning large language models on personalized data).
5 code implementations • 1 Jun 2023 • Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han
We then propose to search for the optimal per-channel scaling that protects the salient weights by observing the activation, not weights.
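The scaling idea in the sentence above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the function name `awq_scale_sketch`, the mean-magnitude salience proxy, and the fixed exponent `alpha` are hypothetical stand-ins for the paper's actual search over per-channel scales. The key invariant is that folding the inverse scale into the activations leaves the layer output unchanged:

```python
import numpy as np

def awq_scale_sketch(W, X, alpha=0.5):
    """W: (out, in) weight matrix; X: (n, in) calibration activations."""
    # Per-input-channel activation magnitude as a salience proxy.
    act_mag = np.abs(X).mean(axis=0)            # shape (in,)
    s = np.maximum(act_mag, 1e-8) ** alpha      # per-channel scaling factor
    # Scale salient weight channels up; fold the inverse scale into activations.
    return W * s, X / s, s
```

Because the product is mathematically unchanged (`(X/s) @ (W*s).T == X @ W.T`), only the quantization error distribution shifts: scaled-up salient channels lose less relative precision when the weights are later rounded to a low bit-width.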
1 code implementation • 9 Feb 2023 • Guangxuan Xiao, Ji Lin, Song Han
In this paper, we propose Offsite-Tuning, a privacy-preserving and efficient transfer learning framework that can adapt billion-parameter foundation models to downstream data without access to the full model.
4 code implementations • 18 Nov 2022 • Guangxuan Xiao, Ji Lin, Mickael Seznec, Hao Wu, Julien Demouth, Song Han
We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs.
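The per-channel smoothing factor $s_j = \max|X_j|^{\alpha} / \max|W_j|^{1-\alpha}$ can be sketched as follows. The symmetric fake-quantizer and the toy outlier data are illustrative assumptions, not the paper's INT8 kernels; the point is that dividing activations by $s$ and multiplying weights by $s$ migrates the quantization difficulty from activation outliers into the weights:

```python
import numpy as np

def quantize_sym(t, bits=8):
    # Symmetric per-tensor fake quantization: round to the grid, then dequantize.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(t).max() / qmax
    return np.round(t / scale) * scale

def smoothquant_matmul(X, W, alpha=0.5):
    # Smoothing factor s_j = max|X_j|^alpha / max|W_j|^(1-alpha), per input channel.
    ax = np.maximum(np.abs(X).max(axis=0), 1e-8)
    aw = np.maximum(np.abs(W).max(axis=0), 1e-8)   # W: (out, in)
    s = ax ** alpha / aw ** (1 - alpha)
    # Mathematically equivalent matmul, but both operands are now easier to quantize.
    return quantize_sym(X / s) @ quantize_sym(W * s).T
```

On activations with a single outlier channel, the smoothed W8A8 product tracks the full-precision result far more closely than naively quantizing both operands.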
1 code implementation • 3 Nov 2022 • Muyang Li, Ji Lin, Chenlin Meng, Stefano Ermon, Song Han, Jun-Yan Zhu
With about $1\%$-area edits, SIGE accelerates DDPM by $3.0\times$ on NVIDIA RTX 3090 and $4.6\times$ on Apple M1 Pro GPU, Stable Diffusion by $7.2\times$ on 3090, and GauGAN by $5.6\times$ on 3090 and $5.2\times$ on M1 Pro GPU.
1 code implementation • 30 Jun 2022 • Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Chuang Gan, Song Han
To reduce the memory footprint, we propose Sparse Update to skip the gradient computation of less important layers and sub-tensors.
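The gradient-skipping idea can be illustrated with a toy two-layer linear model. The model, the manual backward pass, and the hard-coded freezing below are hypothetical simplifications; the paper selects which layers and sub-tensors to update automatically. The memory saving comes from never computing (or storing intermediates for) the frozen layer's gradient:

```python
import numpy as np

def train_step(W1, W2, x, y, lr=0.1, update_first=False):
    # Forward pass through a toy two-layer linear model.
    h = x @ W1
    err = h @ W2 - y                     # gradient of 0.5 * ||pred - y||^2
    gW2 = h.T @ err                      # last-layer gradient (always computed)
    if update_first:                     # skipped under sparse update
        gW1 = x.T @ (err @ W2.T)         # needs x kept around -> extra memory
        W1 = W1 - lr * gW1
    W2 = W2 - lr * gW2
    return W1, W2
```

With `update_first=False`, the first layer's weights are returned untouched and its gradient is never materialized, while the last layer still trains normally.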
no code implementations • 25 Apr 2022 • Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing and speech recognition.
no code implementations • NeurIPS 2021 • Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, Song Han
We further propose receptive field redistribution to shift the receptive field and FLOPs to the later stage and reduce the computation overhead.
1 code implementation • 28 Oct 2021 • Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, Song Han
We further propose network redistribution to shift the receptive field and FLOPs to the later stage and reduce the computation overhead.
no code implementations • ICLR 2022 • Han Cai, Chuang Gan, Ji Lin, Song Han
We introduce Network Augmentation (NetAug), a new training method for improving the performance of tiny neural networks.
4 code implementations • 27 Sep 2021 • Ji Lin, Chuang Gan, Kuan Wang, Song Han
Secondly, TSM has high efficiency; it achieves high frame rates of 74 fps on Jetson Nano and 29 fps on Galaxy Note8 for online video recognition.
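The temporal shift operation underlying TSM is nearly free to compute. A NumPy sketch of the offline, bi-directional variant follows (the `(N, T, C, H, W)` layout and the `fold_div` default are assumptions; the online variant shifts in one temporal direction only):

```python
import numpy as np

def temporal_shift(x, fold_div=8):
    # x: (N, T, C, H, W). Shift 1/fold_div of the channels forward in time,
    # another 1/fold_div backward, and leave the rest in place. The shift
    # itself costs zero FLOPs yet mixes information across neighboring frames.
    fold = x.shape[2] // fold_div
    out = np.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                  # shift forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]  # shift backward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # untouched channels
    return out
```

Inserted before a 2D convolution, this lets a purely spatial backbone model temporal relations without any 3D convolution cost.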
1 code implementation • CVPR 2021 • Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu
Generative adversarial networks (GANs) have enabled photorealistic image synthesis and editing.
no code implementations • 11 Aug 2020 • Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network architectures and hardware architectures.
6 code implementations • ECCV 2020 • Haotian Tang, Zhijian Liu, Shengyu Zhao, Yujun Lin, Ji Lin, Hanrui Wang, Song Han
Self-driving cars need to understand 3D scenes efficiently and accurately in order to drive safely.
Ranked #1 on Robust 3D Semantic Segmentation on SemanticKITTI-C
1 code implementation • NeurIPS 2020 • Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, Song Han
Machine learning on tiny IoT devices based on microcontroller units (MCUs) is appealing but challenging: the memory of microcontrollers is 2-3 orders of magnitude smaller than that of mobile phones.
11 code implementations • NeurIPS 2020 • Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, Song Han
Furthermore, with only 20% training data, we can match the top performance on CIFAR-10 and CIFAR-100.
Ranked #1 on Image Generation on CIFAR-10 (20% data)
1 code implementation • CVPR 2020 • Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Song Han
However, training this quantization-aware accuracy predictor requires collecting a large number of quantized <model, accuracy> pairs, which involves quantization-aware finetuning and thus is highly time-consuming.
2 code implementations • ICLR 2020 • Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
For language modeling, Lite Transformer achieves 1.8 lower perplexity than the transformer at around 500M MACs.
Ranked #36 on Machine Translation on WMT2014 English-French
1 code implementation • CVPR 2020 • Muyang Li, Ji Lin, Yaoyao Ding, Zhijian Liu, Jun-Yan Zhu, Song Han
Directly applying existing compression methods yields poor performance due to the difficulty of GAN training and the differences in generator architectures.
no code implementations • 1 Oct 2019 • Ji Lin, Chuang Gan, Song Han
With such hardware-aware model design, we are able to scale up the training on the Summit supercomputer and reduce the training time on the Kinetics dataset from 49 hours 55 minutes to 14 minutes 13 seconds, achieving a top-1 accuracy of 74.0%, which is 1.6x and 2.9x faster than previous 3D video models with higher accuracy.
no code implementations • 24 Apr 2019 • Song Han, Han Cai, Ligeng Zhu, Ji Lin, Kuan Wang, Zhijian Liu, Yujun Lin
Moreover, we shorten the design cycle by 200x compared with previous work, so that we can afford to design specialized neural network models for different hardware platforms.
no code implementations • ICLR 2019 • Ji Lin, Chuang Gan, Song Han
This paper aims to raise awareness of the security of quantized models; we design a novel quantization methodology to jointly optimize the efficiency and robustness of deep learning models.
1 code implementation • ICCV 2019 • Hou-Ning Hu, Qi-Zhi Cai, Dequan Wang, Ji Lin, Min Sun, Philipp Krähenbühl, Trevor Darrell, Fisher Yu
The framework can not only associate detections of vehicles in motion over time, but also estimate their complete 3D bounding box information from a sequence of 2D images captured on a moving platform.
Ranked #12 on Multiple Object Tracking on KITTI Tracking test
11 code implementations • CVPR 2019 • Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han
Compared with conventional methods, our framework is fully automated and can specialize the quantization policy for different neural network architectures and hardware architectures.
13 code implementations • ICCV 2019 • Ji Lin, Chuang Gan, Song Han
The explosive growth in video streaming gives rise to challenges on performing video understanding at high accuracy and low computation cost.
Ranked #2 on 3D Action Recognition on Assembly101
no code implementations • NIPS Workshop CDNNRIA 2018 • Ting Chen, Ji Lin, Tian Lin, Song Han, Chong Wang, Denny Zhou
Modern deep neural networks have a large number of weights, which makes them difficult to deploy on computation-constrained devices such as mobile phones.
no code implementations • ICLR 2018 • Yang Gao, Huazhe Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell
We propose a unified reinforcement learning algorithm, Normalized Actor-Critic (NAC), that effectively normalizes the Q-function, reducing the Q-values of actions unseen in the demonstration data.
12 code implementations • ECCV 2018 • Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han
Model compression is a critical technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets.
no code implementations • NeurIPS 2017 • Ji Lin, Yongming Rao, Jiwen Lu, Jie Zhou
In this paper, we propose a Runtime Neural Pruning (RNP) framework which prunes the deep neural network dynamically at the runtime.
no code implementations • ICCV 2017 • Yongming Rao, Ji Lin, Jiwen Lu, Jie Zhou
In this paper, we propose a discriminative aggregation network (DAN) for video face recognition, which aims to integrate information from video frames effectively and efficiently.
no code implementations • CVPR 2017 • Ji Lin, Liangliang Ren, Jiwen Lu, Jianjiang Feng, Jie Zhou
In this paper, we propose a consistent-aware deep learning (CADL) framework for person re-identification in a camera network.