Search Results for author: Hefeng Wu

Found 29 papers, 18 papers with code

ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning

no code implementations • 23 Apr 2024 • Weifeng Chen, Jiacheng Zhang, Jie Wu, Hefeng Wu, Xuefeng Xiao, Liang Lin

The rapid development of diffusion models has triggered diverse applications.

DiffusionGPT: LLM-Driven Text-to-Image Generation System

no code implementations • 18 Jan 2024 • Jie Qin, Jie Wu, Weifeng Chen, Yuxi Ren, Huixia Li, Hefeng Wu, Xuefeng Xiao, Rui Wang, Shilei Wen

Diffusion models have opened up new avenues for the field of image generation, resulting in the proliferation of high-quality models shared on open-source platforms.

Model Selection Text-to-Image Generation

Dual-View Data Hallucination with Semantic Relation Guidance for Few-Shot Image Recognition

no code implementations • 13 Jan 2024 • Hefeng Wu, Guangzhi Ye, Ziyang Zhou, Ling Tian, Qing Wang, Liang Lin

Specifically, an instance-view data hallucination module hallucinates each sample of a novel class to generate new data by employing local semantic correlated attention and global semantic feature fusion derived from base classes.

Hallucination Novel Concepts +1

SQLNet: Scale-Modulated Query and Localization Network for Few-Shot Class-Agnostic Counting

1 code implementation • 16 Nov 2023 • Hefeng Wu, Yandong Chen, Lingbo Liu, Tianshui Chen, Keze Wang, Liang Lin

In the localization stage, the Scale-aware Multi-head Localization (SAML) module utilizes the query tensor to predict the confidence, location, and size of each potential object.

Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search

1 code implementation • 15 Nov 2023 • Hefeng Wu, Weifeng Chen, Zhibin Liu, Tianshui Chen, Zhiguang Chen, Liang Lin

Moreover, we propose a proximity data generation (PDG) module to automatically produce more diverse data for cross-modal training.

Contrastive Learning Cross-Modal Retrieval +4

SketchBodyNet: A Sketch-Driven Multi-faceted Decoder Network for 3D Human Reconstruction

1 code implementation • 10 Oct 2023 • Fei Wang, Kongzhang Tang, Hefeng Wu, Baoquan Zhao, Hao Cai, Teng Zhou

Compared with natural images, freehand sketches are much more flexible in depicting various shapes, offering a promising and valuable approach to 3D human reconstruction.

3D Human Reconstruction 3D Reconstruction

Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation

1 code implementation • 23 Sep 2023 • Tao Pu, Tianshui Chen, Hefeng Wu, Yongyi Lu, Liang Lin

In this work, we propose a spatial-temporal knowledge-embedded transformer (STKET) that incorporates the prior spatial-temporal knowledge into the multi-head cross-attention mechanism to learn more representative relationship representations.

Graph Generation Object +2

Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models

1 code implementation • 23 May 2023 • Weifeng Chen, Yatai Ji, Jie Wu, Hefeng Wu, Pan Xie, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin

Based on a pre-trained conditional text-to-image (T2I) diffusion model, our model aims to generate videos conditioned on a sequence of control signals, such as edge or depth maps.

Optical Flow Estimation Style Transfer +4

Multi-object Video Generation from Single Frame Layouts

no code implementations • 6 May 2023 • Yang Wu, Zhibin Liu, Hefeng Wu, Liang Lin

In this paper, we study video synthesis with emphasis on simplifying the generation conditions.

Image Generation Object +2

Category-Adaptive Label Discovery and Noise Rejection for Multi-label Image Recognition with Partial Positive Labels

no code implementations • 15 Nov 2022 • Tao Pu, Qianru Lao, Hefeng Wu, Tianshui Chen, Liang Lin

To reject noisy labels, recent works regard large-loss samples as noise but ignore the semantic correlation among different multi-label images.

Dual-Perspective Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

1 code implementation • 26 May 2022 • Tao Pu, Tianshui Chen, Hefeng Wu, Yukai Shi, Zhijing Yang, Liang Lin

Specifically, an instance-perspective representation blending (IPRB) module is designed to blend the representations of the known labels in an image with the representations of the corresponding unknown labels in another image to complement these unknown labels.

Ranked #3 on Multi-label Image Recognition with Partial Labels on PASCAL VOC 2007

Image Classification Multi-label Image Recognition with Partial Labels

Semantic Representation and Dependency Learning for Multi-Label Image Recognition

no code implementations • 8 Apr 2022 • Tao Pu, Mingzhan Sun, Hefeng Wu, Tianshui Chen, Ling Tian, Liang Lin

We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions to regularize the network training.

Object object-detection +1

Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

1 code implementation • 4 Mar 2022 • Tao Pu, Tianshui Chen, Hefeng Wu, Liang Lin

However, these algorithms depend on sufficient multi-label annotations to train the models, leading to poor performance, especially when the proportion of known labels is low.

Ranked #2 on Multi-label Image Recognition with Partial Labels on Visual Genome

Multi-label Image Recognition with Partial Labels

Structured Semantic Transfer for Multi-Label Recognition with Partial Labels

1 code implementation • 21 Dec 2021 • Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Liang Lin

To reduce the annotation cost, we propose a structured semantic transfer (SST) framework that enables training multi-label recognition models with partial labels, i.e., only some labels are known while the others are missing (also called unknown labels) per image.

Ranked #6 on Multi-label Image Recognition with Partial Labels on PASCAL VOC 2007

Multi-label Image Recognition with Partial Labels

AU-Expression Knowledge Constrained Representation Learning for Facial Expression Recognition

1 code implementation • 29 Dec 2020 • Tao Pu, Tianshui Chen, Yuan Xie, Hefeng Wu, Liang Lin

In this work, we explore the correlations among the action units and facial expressions, and devise an AU-Expression Knowledge Constrained Representation Learning (AUE-CRL) framework to learn the AU representations without AU annotations and adaptively use representations to facilitate facial expression recognition.

Facial Expression Recognition Facial Expression Recognition (FER) +1

Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting

1 code implementation • CVPR 2021 • Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin

Extensive experiments conducted on the RGBT-CC benchmark demonstrate the effectiveness of our framework for RGBT crowd counting.

Crowd Counting Representation Learning

Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition

no code implementations • 20 Sep 2020 • Tianshui Chen, Liang Lin, Riquan Chen, Xiaolu Hui, Hefeng Wu

The framework exploits prior knowledge to guide adaptive information propagation among different categories to facilitate multi-label analysis and reduce the dependency on training samples.

Few-Shot Learning Multi-label Image Recognition with Partial Labels

Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition

1 code implementation • 3 Aug 2020 • Yuan Xie, Tianshui Chen, Tao Pu, Hefeng Wu, Liang Lin

However, most of these works focus on holistic feature adaptation, and they ignore local features that are more transferable across different datasets.

Cross-Domain Facial Expression Recognition Facial Expression Recognition (FER)

Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning

1 code implementation • 3 Aug 2020 • Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Lingbo Liu, Liang Lin

Although each claims to achieve superior performance, fair comparisons are lacking due to the inconsistent choices of the source/target datasets and feature extractors.

Ranked #1 on Cross-Domain Facial Expression Recognition on Source: AFE, Target: CK+, JAFFE, SFEW2.0, FER2013, ExpW

Cross-Domain Facial Expression Recognition Domain Adaptation +3

Fine-Grained Image Captioning with Global-Local Discriminative Objective

1 code implementation • 21 Jul 2020 • Jie Wu, Tianshui Chen, Hefeng Wu, Zhi Yang, Guangchun Luo, Liang Lin

This is primarily due to (i) the conservative characteristic of traditional training objectives that drives the model to generate correct but hardly discriminative captions for similar images and (ii) the uneven word distribution of the ground-truth captions, which encourages generating highly frequent words/phrases while suppressing the less frequent but more concrete ones.

Descriptive Image Captioning +2

Efficient Crowd Counting via Structured Knowledge Transfer

2 code implementations • 23 Mar 2020 • Lingbo Liu, Jiaqi Chen, Hefeng Wu, Tianshui Chen, Guanbin Li, Liang Lin

Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications.

Crowd Counting Transfer Learning

Physical-Virtual Collaboration Modeling for Intra- and Inter-Station Metro Ridership Prediction

2 code implementations • 14 Jan 2020 • Lingbo Liu, Jingwen Chen, Hefeng Wu, Jiajie Zhen, Guanbin Li, Liang Lin

To address this problem, we model a metro system as graphs with various topologies and propose a unified Physical-Virtual Collaboration Graph Network (PVCGN), which can effectively learn the complex ridership patterns from the tailor-designed graphs.

Representation Learning

Knowledge Graph Transfer Network for Few-Shot Recognition

1 code implementation • 21 Nov 2019 • Riquan Chen, Tianshui Chen, Xiaolu Hui, Hefeng Wu, Guanbin Li, Liang Lin

In this work, we represent the semantic correlations in the form of a structured knowledge graph and integrate this graph into deep neural networks to promote few-shot learning via a novel Knowledge Graph Transfer Network (KGTN).

Ranked #1 on Few-Shot Image Classification on ImageNet-FS (10-shot, all)

Few-Shot Image Classification Few-Shot Learning +2

Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition

2 code implementations • ICCV 2019 • Tianshui Chen, Muxin Xu, Xiaolu Hui, Hefeng Wu, Liang Lin

Recognizing multiple labels of images is a practical and challenging task, and significant progress has been made by searching semantic-aware regions and modeling label dependency.

Ranked #8 on Multi-Label Classification on PASCAL VOC 2007

Graph Representation Learning Multi-Label Classification +1

Instance-Aware Representation Learning and Association for Online Multi-Person Tracking

no code implementations • 29 May 2019 • Hefeng Wu, Yafei Hu, Keze Wang, Hanhui Li, Lin Nie, Hui Cheng

Multi-Person Tracking (MPT) is often addressed within the detection-to-association paradigm.

Representation Learning

Multi-column Point-CNN for Sketch Segmentation

no code implementations • 28 Dec 2018 • Fei Wang, Shujin Lin, Hanhui Li, Hefeng Wu, Junkun Jiang, Ruomei Wang, Xiaonan Luo

Traditional sketch segmentation methods mainly rely on handcrafted features and complicated models, and their performance is far from satisfactory due to the abstract representation of sketches.

ADCrowdNet: An Attention-injective Deformable Convolutional Network for Crowd Understanding

1 code implementation • CVPR 2019 • Ning Liu, Yongchao Long, Changqing Zou, Qun Niu, Li Pan, Hefeng Wu

We propose an attention-injective deformable convolutional network called ADCrowdNet for crowd understanding that can address the accuracy degradation problem of highly congested noisy scenes.

Ranked #2 on Crowd Counting on TRANCOS

Crowd Counting

Structured Inhomogeneous Density Map Learning for Crowd Counting

no code implementations • 20 Jan 2018 • Hanhui Li, Xiangjian He, Hefeng Wu, Saeed Amirgholipour Kasmani, Ruomei Wang, Xiaonan Luo, Liang Lin

In this paper, we aim to tackle the problem of crowd counting in extremely high-density scenes, which contain hundreds or even thousands of people.

Crowd Counting

Learning Deep Similarity Models with Focus Ranking for Fabric Image Retrieval

no code implementations • 29 Dec 2017 • Daiguo Deng, Ruomei Wang, Hefeng Wu, Huayong He, Qi Li, Xiaonan Luo

Fabric image retrieval is beneficial to many applications including clothing searching, online shopping and cloth modeling.

Image Retrieval Representation Learning +1
