Search Results for author: Haotian Tang

Found 13 papers, 10 papers with code

TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs

1 code implementation25 Oct 2023 Haotian Tang, Shang Yang, Zhijian Liu, Ke Hong, Zhongming Yu, Xiuyu Li, Guohao Dai, Yu Wang, Song Han

On top of this, we design the Sparse Autotuner, which extends the design space of existing sparse convolution libraries and searches for the best dataflow configurations for training and inference workloads.

Autonomous Driving Recommendation Systems

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

2 code implementations21 Sep 2023 Yukang Chen, Shengju Qian, Haotian Tang, Xin Lai, Zhijian Liu, Song Han, Jiaya Jia

For example, training on the context length of 8192 needs 16x computational costs in self-attention layers as that of 2048.

4k Instruction Following +2

Type-supervised sequence labeling based on the heterogeneous star graph for named entity recognition

1 code implementation19 Oct 2022 Xueru Wen, Changjiang Zhou, Haotian Tang, Luguang Liang, Yu Jiang, Hong Qi

Named entity recognition is a fundamental task in natural language processing, identifying the span and category of entities in unstructured texts.

Graph Attention named-entity-recognition +4

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

no code implementations25 Apr 2022 Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han

Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing and speech recognition.

Model Compression Neural Architecture Search +3

TorchSparse: Efficient Point Cloud Inference Engine

1 code implementation21 Apr 2022 Haotian Tang, Zhijian Liu, Xiuyu Li, Yujun Lin, Song Han

TorchSparse directly optimizes the two bottlenecks of sparse convolution: irregular computation and data movement.

Autonomous Driving

Point-Voxel CNN for Efficient 3D Deep Learning

4 code implementations NeurIPS 2019 Zhijian Liu, Haotian Tang, Yujun Lin, Song Han

The computation cost and memory footprints of the voxel-based models grow cubically with the input resolution, making it memory-prohibitive to scale up the resolution.

3D Object Detection 3D Semantic Segmentation +2

Cannot find the paper you are looking for? You can Submit a new open access paper.