Search Results for author: Haoyu He

Found 18 papers, 15 papers with code

LongVLM: Efficient Long Video Understanding via Large Language Models

1 code implementation · 4 Apr 2024 · Yuetian Weng, Mingfei Han, Haoyu He, Xiaojun Chang, Bohan Zhuang

In this way, we encode video representations that incorporate both local and global information, enabling the LLM to generate comprehensive responses for long-term videos.

Question Answering · Video Question Answering · +1
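
The snippet above describes fusing local (short-segment) and global (video-level) features into one token sequence for the LLM. A minimal sketch of that idea follows; the pooling choices, shapes, and function name are illustrative assumptions, not LongVLM's actual implementation.

```python
import torch

def build_video_tokens(frame_feats: torch.Tensor, num_segments: int = 8) -> torch.Tensor:
    """Toy illustration: fuse local (per-segment) and global video features.

    frame_feats: (T, D) per-frame features from a visual encoder.
    Returns a short token sequence suitable as visual input to an LLM.
    """
    segments = frame_feats.chunk(num_segments, dim=0)               # split frames into short clips
    local_tokens = torch.stack([s.mean(dim=0) for s in segments])   # (num_segments, D) local context
    global_token = frame_feats.mean(dim=0, keepdim=True)            # (1, D) video-level context
    return torch.cat([local_tokens, global_token], dim=0)           # (num_segments + 1, D)

tokens = build_video_tokens(torch.randn(128, 768))
print(tokens.shape)  # torch.Size([9, 768])
```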

Efficient Stitchable Task Adaptation

1 code implementation · 29 Nov 2023 · Haoyu He, Zizheng Pan, Jing Liu, Jianfei Cai, Bohan Zhuang

In this work, we present a novel framework, Efficient Stitchable Task Adaptation (ESTA), to efficiently produce a palette of fine-tuned models that adhere to diverse resource constraints.

Chatbot
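
ESTA builds on model stitching, where sub-networks of differently sized pretrained models are composed through small learnable "stitch" layers, so that one training run yields many deployable variants under different resource budgets. The sketch below illustrates the general stitching mechanism; the `Stitch` module, dimensions, block counts, and split points are hypothetical, not ESTA's actual design.

```python
import torch
import torch.nn as nn

class Stitch(nn.Module):
    """Toy stitching layer: projects activations from one backbone's layer
    into the feature space of another, so sub-networks of different sizes
    can be composed into a single deployable variant."""
    def __init__(self, dim_in: int, dim_out: int):
        super().__init__()
        self.proj = nn.Linear(dim_in, dim_out)

    def forward(self, x):
        return self.proj(x)

# Hypothetical usage: run the first k blocks of a small model, stitch,
# then finish with the last blocks of a large model.
small_blocks = nn.ModuleList(nn.Linear(384, 384) for _ in range(12))
large_blocks = nn.ModuleList(nn.Linear(768, 768) for _ in range(24))
stitch = Stitch(384, 768)

def stitched_forward(x, k_small: int = 6, k_large: int = 12):
    for blk in small_blocks[:k_small]:
        x = torch.relu(blk(x))
    x = stitch(x)                      # cross over to the large backbone
    for blk in large_blocks[k_large:]:
        x = torch.relu(blk(x))
    return x

out = stitched_forward(torch.randn(2, 384))
print(out.shape)  # torch.Size([2, 768])
```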

Mask Propagation for Efficient Video Semantic Segmentation

1 code implementation · NeurIPS 2023 · Yuetian Weng, Mingfei Han, Haoyu He, Mingjie Li, Lina Yao, Xiaojun Chang, Bohan Zhuang

By reusing predictions from key frames, we circumvent the need to process a large volume of video frames individually with resource-intensive segmentors, alleviating temporal redundancy and significantly reducing computational costs.

Semantic Segmentation · Video Semantic Segmentation
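
A minimal sketch of the key-frame scheme the snippet describes: run the expensive segmentor only on key frames and propagate its masks to the frames in between. `segmentor` and `propagator` are placeholder callables, not the paper's actual modules.

```python
import torch

def segment_video(frames, segmentor, propagator, key_interval: int = 5):
    """Toy key-frame scheme: invoke the heavy segmentor only every
    `key_interval` frames; for the rest, cheaply update the last mask."""
    masks, mask = [], None
    for t, frame in enumerate(frames):
        if t % key_interval == 0:
            mask = segmentor(frame)            # expensive, per-key-frame
        else:
            mask = propagator(mask, frame)     # cheap mask propagation
        masks.append(mask)
    return masks

# Dummy usage with stand-in modules:
frames = [torch.randn(3, 64, 64) for _ in range(10)]
dummy_seg = lambda f: f.mean(dim=0, keepdim=True)
dummy_prop = lambda m, f: 0.9 * m + 0.1 * f.mean(dim=0, keepdim=True)
print(len(segment_video(frames, dummy_seg, dummy_prop)))  # 10
```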

Stitched ViTs are Flexible Vision Backbones

1 code implementation · 30 Jun 2023 · Zizheng Pan, Jing Liu, Haoyu He, Jianfei Cai, Bohan Zhuang

With extensive experiments on ImageNet-1K, ADE20K, COCO-Stuff-10K, and NYUv2, SN-Netv2 demonstrates superior performance over SN-Netv1 on downstream dense prediction tasks and shows strong potential as a flexible vision backbone, with clear advantages in both training efficiency and deployment flexibility.

Illuminati: Towards Explaining Graph Neural Networks for Cybersecurity Analysis

1 code implementation · 26 Mar 2023 · Haoyu He, Yuede Ji, H. Howie Huang

Given a graph and a pre-trained GNN model, Illuminati is able to identify the important nodes, edges, and attributes that contribute to the prediction, while requiring no prior knowledge of GNN models.

Fraud Detection · Vulnerability Detection
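
The snippet does not detail Illuminati's algorithm, so the sketch below shows a common baseline with the same goal: gradient-based saliency that scores how much each node's attributes influence a prediction. `TinyGNN` is a stand-in model; this is not Illuminati's actual method.

```python
import torch

def feature_saliency(model, x, edge_index, target_class: int):
    """Generic gradient-based importance scores for node attributes.
    `model` is any GNN-like module taking (x, edge_index) and returning
    class logits; the score per node is the summed |gradient| of the
    target logit w.r.t. that node's features."""
    x = x.clone().requires_grad_(True)
    logits = model(x, edge_index)
    logits[target_class].backward()
    return x.grad.abs().sum(dim=1)  # one importance score per node

class TinyGNN(torch.nn.Module):
    """Stand-in model: mean aggregation over nodes in place of real
    message passing, just to make the sketch runnable."""
    def __init__(self, d: int, c: int):
        super().__init__()
        self.lin = torch.nn.Linear(d, c)

    def forward(self, x, edge_index):
        return self.lin(x).mean(dim=0)

scores = feature_saliency(TinyGNN(16, 3), torch.randn(5, 16), None, target_class=1)
print(scores.shape)  # torch.Size([5])
```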

Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning

1 code implementation · ICCV 2023 · Haoyu He, Jianfei Cai, Jing Zhang, DaCheng Tao, Bohan Zhuang

Visual Parameter-Efficient Fine-Tuning (PEFT) has become a powerful alternative to full fine-tuning for adapting pre-trained vision models to downstream tasks: it tunes only a small number of parameters while freezing the vast majority, easing both storage burden and optimization difficulty.
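
As a rough illustration of the PEFT setup described above, the following sketch freezes all pretrained weights and re-enables a small named subset. SPT itself selects which parameters to tune via a sensitivity criterion; the keyword filter here is only a stand-in for that selection step.

```python
import torch.nn as nn

def freeze_for_peft(model: nn.Module, trainable_keywords=("head", "bias")):
    """Generic PEFT setup: freeze everything, then re-enable a small,
    named subset of parameters. SPT instead ranks parameters by a
    sensitivity criterion; the keyword filter is a placeholder."""
    for name, p in model.named_parameters():
        p.requires_grad = any(k in name for k in trainable_keywords)
    n_train = sum(p.numel() for p in model.parameters() if p.requires_grad)
    n_total = sum(p.numel() for p in model.parameters())
    print(f"trainable: {n_train / n_total:.2%} of {n_total} parameters")

model = nn.Sequential(nn.Linear(768, 768), nn.Linear(768, 10))
freeze_for_peft(model)  # only the bias terms remain trainable here
```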

A Survey on Efficient Training of Transformers

no code implementations · 2 Feb 2023 · Bohan Zhuang, Jing Liu, Zizheng Pan, Haoyu He, Yuetian Weng, Chunhua Shen

Recent advances in Transformers have come with a huge demand for computing resources, highlighting the importance of efficient training techniques that make Transformer training faster, cheaper, and more accurate through the efficient use of computation and memory.

EcoFormer: Energy-Saving Attention with Linear Complexity

1 code implementation · 19 Sep 2022 · Jing Liu, Zizheng Pan, Haoyu He, Jianfei Cai, Bohan Zhuang

To this end, we propose a new binarization paradigm customized to high-dimensional softmax attention via kernelized hashing, called EcoFormer, to map the original queries and keys into low-dimensional binary codes in Hamming space.

Binarization
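
A toy rendering of the binarization idea: hash queries and keys to {-1, +1} codes and use their inner products (an affine function of Hamming distance) as attention logits. EcoFormer learns its hash functions via kernelized hashing and targets a linear-complexity formulation; plain sign() with softmax here is only for illustration.

```python
import torch

def binary_attention(q, k, v):
    """Toy binarized attention: map queries/keys to {-1, +1} codes with
    sign() and score them by code inner products. Note the inner product
    of two binary codes is an affine function of their Hamming distance,
    which is what makes Hamming-space attention cheap in hardware."""
    q_code = torch.sign(q)                            # (N, d) binary codes
    k_code = torch.sign(k)
    logits = q_code @ k_code.T / q.shape[-1] ** 0.5   # scaled code similarity
    return torch.softmax(logits, dim=-1) @ v

out = binary_attention(torch.randn(4, 64), torch.randn(4, 64), torch.randn(4, 32))
print(out.shape)  # torch.Size([4, 32])
```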

Dynamic Focus-aware Positional Queries for Semantic Segmentation

2 code implementations · CVPR 2023 · Haoyu He, Jianfei Cai, Zizheng Pan, Jing Liu, Jing Zhang, DaCheng Tao, Bohan Zhuang

In this paper, we propose a simple yet effective query design for semantic segmentation, termed Dynamic Focus-aware Positional Queries (DFPQ), which dynamically generates positional queries conditioned on the cross-attention scores from the preceding decoder block and on the positional encodings of the corresponding image features.

Semantic Segmentation
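
A compact sketch of the recipe the snippet describes: aggregate the positional encodings of the image features, weighted by the preceding decoder block's cross-attention scores. The shapes and the softmax normalization are assumptions, not the paper's exact formulation.

```python
import torch

def dynamic_positional_queries(prev_attn: torch.Tensor, pos_enc: torch.Tensor) -> torch.Tensor:
    """Toy version of the DFPQ idea.

    prev_attn: (num_queries, num_pixels) cross-attention scores from the
               preceding decoder block.
    pos_enc:   (num_pixels, dim) positional encodings of image features.
    Returns per-query positional queries of shape (num_queries, dim).
    """
    weights = prev_attn.softmax(dim=-1)  # focus each query on its attended region
    return weights @ pos_enc

pq = dynamic_positional_queries(torch.randn(100, 1024), torch.randn(1024, 256))
print(pq.shape)  # torch.Size([100, 256])
```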

Mesa: A Memory-saving Training Framework for Transformers

3 code implementations · 22 Nov 2021 · Zizheng Pan, Peng Chen, Haoyu He, Jing Liu, Jianfei Cai, Bohan Zhuang

While Transformers have delivered significant performance improvements, training such networks is extremely memory-intensive, owing to the need to store all intermediate activations for gradient computation during backpropagation, especially for long sequences.

Quantization
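
The memory-saving trick the snippet motivates can be sketched as a custom autograd function that stores an 8-bit quantized copy of the activation for the backward pass instead of the full-precision tensor. Mesa's actual framework covers attention, GEMM, and normalization layers with a more careful quantizer; this toy ReLU only shows the principle.

```python
import torch

class MemorySavingReLU(torch.autograd.Function):
    """Toy activation compression: save only an int8 quantized copy of
    the input for backward, trading a little gradient precision for a
    4x reduction in activation memory (vs. fp32)."""

    @staticmethod
    def forward(ctx, x):
        scale = x.abs().max() / 127 + 1e-8
        ctx.save_for_backward((x / scale).round().to(torch.int8))
        ctx.scale = scale
        return torch.relu(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x_q,) = ctx.saved_tensors
        x = x_q.float() * ctx.scale          # dequantized approximation
        return grad_out * (x > 0).float()    # ReLU gradient mask

x = torch.randn(8, 16, requires_grad=True)
MemorySavingReLU.apply(x).sum().backward()
print(x.grad.shape)  # torch.Size([8, 16])
```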

Distiller: A Systematic Study of Model Distillation Methods in Natural Language Processing

no code implementations · EMNLP (sustainlp) 2021 · Haoyu He, Xingjian Shi, Jonas Mueller, Sheng Zha, Mu Li, George Karypis

We aim to identify how different components in the KD pipeline, such as the data augmentation policy, the loss function, and the intermediate representation for transferring knowledge between teacher and student, affect the resulting performance, and how much the optimal KD pipeline varies across datasets and tasks.

Data Augmentation · Hyperparameter Optimization
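
To make the studied pipeline components concrete, here is a generic distillation objective of the kind such studies compare: hard-label cross-entropy plus a temperature-scaled KL term against the teacher, with an optional MSE term on intermediate representations. The weights and temperature are arbitrary placeholders, not the paper's recommended settings.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels,
            hidden_s=None, hidden_t=None,
            T: float = 2.0, alpha: float = 0.5, beta: float = 0.1):
    """Generic knowledge-distillation objective (illustrative defaults):
    CE on hard labels + temperature-scaled KL to the teacher's softened
    distribution, optionally + MSE on intermediate hidden states."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * T * T  # rescale gradient magnitude
    loss = (1 - alpha) * ce + alpha * kl
    if hidden_s is not None and hidden_t is not None:
        loss = loss + beta * F.mse_loss(hidden_s, hidden_t)
    return loss
```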

Less is More: Pay Less Attention in Vision Transformers

2 code implementations · 29 May 2021 · Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai

Transformers have become one of the dominant architectures in deep learning, particularly as a powerful alternative to convolutional neural networks (CNNs) in computer vision.

Image Classification · Instance Segmentation · +3

End-to-end One-shot Human Parsing

1 code implementation · 4 May 2021 · Haoyu He, Bohan Zhuang, Jing Zhang, Jianfei Cai, DaCheng Tao

To address three main challenges in OSHP, i.e., small sizes, testing bias, and similar parts, we devise an End-to-end One-shot human Parsing Network (EOP-Net).

Human Parsing · Metric Learning · +1

Scalable Vision Transformers with Hierarchical Pooling

2 code implementations · ICCV 2021 · Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai

However, current ViT models routinely maintain a full-length patch sequence during inference, which is redundant and lacks hierarchical representation.

Efficient ViTs
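
The hierarchical-pooling remedy the snippet points to can be sketched as downsampling the patch-token sequence between transformer blocks, so later blocks see progressively fewer tokens. Max pooling over the sequence axis is one simple choice here; the paper's actual pooling schedule and operator may differ.

```python
import torch
import torch.nn as nn

def pool_tokens(x: torch.Tensor, kernel: int = 2) -> torch.Tensor:
    """Toy hierarchical pooling between transformer blocks: shorten the
    patch-token sequence instead of keeping it full-length throughout.

    x: (B, N, D) token sequence -> (B, ceil(N / kernel), D)
    """
    return nn.functional.max_pool1d(x.transpose(1, 2), kernel, ceil_mode=True).transpose(1, 2)

x = torch.randn(2, 196, 384)   # e.g. 14x14 patches from a ViT stage
print(pool_tokens(x).shape)    # torch.Size([2, 98, 384])
```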

Progressive One-shot Human Parsing

1 code implementation · 22 Dec 2020 · Haoyu He, Jing Zhang, Bhavani Thuraisingham, DaCheng Tao

In this paper, we devise a novel Progressive One-shot Parsing network (POPNet) to address two critical challenges, i.e., testing bias and small sizes.

Human Parsing · Metric Learning · +1

Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing

1 code implementation · 27 Nov 2019 · Haoyu He, Jing Zhang, Qiming Zhang, DaCheng Tao

In this paper, we propose a novel GRAph PYramid Mutual Learning (Grapy-ML) method to address the cross-dataset human parsing problem, where the annotations are at different granularities.

Human Parsing · Semantic Segmentation
