Search Results for author: Hefeng Wu

Found 29 papers, 18 papers with code

DiffusionGPT: LLM-Driven Text-to-Image Generation System

no code implementations 18 Jan 2024 Jie Qin, Jie Wu, Weifeng Chen, Yuxi Ren, Huixia Li, Hefeng Wu, Xuefeng Xiao, Rui Wang, Shilei Wen

Diffusion models have opened up new avenues for the field of image generation, resulting in the proliferation of high-quality models shared on open-source platforms.

Model Selection Text-to-Image Generation

Dual-View Data Hallucination with Semantic Relation Guidance for Few-Shot Image Recognition

no code implementations 13 Jan 2024 Hefeng Wu, Guangzhi Ye, Ziyang Zhou, Ling Tian, Qing Wang, Liang Lin

Specifically, an instance-view data hallucination module hallucinates each sample of a novel class to generate new data by employing local semantic correlated attention and global semantic feature fusion derived from base classes.

Hallucination Novel Concepts +1

SQLNet: Scale-Modulated Query and Localization Network for Few-Shot Class-Agnostic Counting

1 code implementation 16 Nov 2023 Hefeng Wu, Yandong Chen, Lingbo Liu, Tianshui Chen, Keze Wang, Liang Lin

In the localization stage, the Scale-aware Multi-head Localization (SAML) module utilizes the query tensor to predict the confidence, location, and size of each potential object.

SketchBodyNet: A Sketch-Driven Multi-faceted Decoder Network for 3D Human Reconstruction

1 code implementation 10 Oct 2023 Fei Wang, Kongzhang Tang, Hefeng Wu, Baoquan Zhao, Hao Cai, Teng Zhou

Compared with natural images, freehand sketches are much more flexible to depict various shapes, providing a high potential and valuable way for 3D human reconstruction.

3D Human Reconstruction 3D Reconstruction

Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation

1 code implementation 23 Sep 2023 Tao Pu, Tianshui Chen, Hefeng Wu, Yongyi Lu, Liang Lin

In this work, we propose a spatial-temporal knowledge-embedded transformer (STKET) that incorporates the prior spatial-temporal knowledge into the multi-head cross-attention mechanism to learn more representative relationship representations.

Graph Generation Object +2

Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models

1 code implementation 23 May 2023 Weifeng Chen, Yatai Ji, Jie Wu, Hefeng Wu, Pan Xie, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin

Based on a pre-trained conditional text-to-image (T2I) diffusion model, our model aims to generate videos conditioned on a sequence of control signals, such as edge or depth maps.

Optical Flow Estimation Style Transfer +4

Multi-object Video Generation from Single Frame Layouts

no code implementations 6 May 2023 Yang Wu, Zhibin Liu, Hefeng Wu, Liang Lin

In this paper, we study video synthesis with emphasis on simplifying the generation conditions.

Image Generation Object +2

Dual-Perspective Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

1 code implementation 26 May 2022 Tao Pu, Tianshui Chen, Hefeng Wu, Yukai Shi, Zhijing Yang, Liang Lin

Specifically, an instance-perspective representation blending (IPRB) module is designed to blend the representations of the known labels in an image with the representations of the corresponding unknown labels in another image to complement these unknown labels.

Image Classification Multi-label Image Recognition with Partial Labels

Semantic Representation and Dependency Learning for Multi-Label Image Recognition

no code implementations 8 Apr 2022 Tao Pu, Mingzhan Sun, Hefeng Wu, Tianshui Chen, Ling Tian, Liang Lin

We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions to regularize the network training.

Object object-detection +1

Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

1 code implementation 4 Mar 2022 Tao Pu, Tianshui Chen, Hefeng Wu, Liang Lin

However, these algorithms depend on sufficient multi-label annotations to train the models, leading to poor performance especially with low known label proportion.

Multi-label Image Recognition with Partial Labels

Structured Semantic Transfer for Multi-Label Recognition with Partial Labels

1 code implementation 21 Dec 2021 Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Liang Lin

To reduce the annotation cost, we propose a structured semantic transfer (SST) framework that enables training multi-label recognition models with partial labels, i.e., merely some labels are known while other labels are missing (also called unknown labels) per image.

Multi-label Image Recognition with Partial Labels

AU-Expression Knowledge Constrained Representation Learning for Facial Expression Recognition

1 code implementation 29 Dec 2020 Tao Pu, Tianshui Chen, Yuan Xie, Hefeng Wu, Liang Lin

In this work, we explore the correlations among the action units and facial expressions, and devise an AU-Expression Knowledge Constrained Representation Learning (AUE-CRL) framework to learn the AU representations without AU annotations and adaptively use representations to facilitate facial expression recognition.

Facial Expression Recognition Facial Expression Recognition (FER) +1

Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition

no code implementations 20 Sep 2020 Tianshui Chen, Liang Lin, Riquan Chen, Xiaolu Hui, Hefeng Wu

The framework exploits prior knowledge to guide adaptive information propagation among different categories to facilitate multi-label analysis and reduce the dependency on training samples.

Few-Shot Learning Multi-label Image Recognition with Partial Labels

Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition

1 code implementation 3 Aug 2020 Yuan Xie, Tianshui Chen, Tao Pu, Hefeng Wu, Liang Lin

However, most of these works focus on holistic feature adaptation, and they ignore local features that are more transferable across different datasets.

Cross-Domain Facial Expression Recognition Facial Expression Recognition (FER)

Fine-Grained Image Captioning with Global-Local Discriminative Objective

1 code implementation 21 Jul 2020 Jie Wu, Tianshui Chen, Hefeng Wu, Zhi Yang, Guangchun Luo, Liang Lin

This is primarily due to (i) the conservative characteristic of traditional training objectives that drives the model to generate correct but hardly discriminative captions for similar images and (ii) the uneven word distribution of the ground-truth captions, which encourages generating highly frequent words/phrases while suppressing the less frequent but more concrete ones.

Descriptive Image Captioning +2

Efficient Crowd Counting via Structured Knowledge Transfer

2 code implementations 23 Mar 2020 Lingbo Liu, Jiaqi Chen, Hefeng Wu, Tianshui Chen, Guanbin Li, Liang Lin

Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications.

Crowd Counting Transfer Learning

Physical-Virtual Collaboration Modeling for Intra-and Inter-Station Metro Ridership Prediction

2 code implementations 14 Jan 2020 Lingbo Liu, Jingwen Chen, Hefeng Wu, Jiajie Zhen, Guanbin Li, Liang Lin

To address this problem, we model a metro system as graphs with various topologies and propose a unified Physical-Virtual Collaboration Graph Network (PVCGN), which can effectively learn the complex ridership patterns from the tailor-designed graphs.

Representation Learning

Knowledge Graph Transfer Network for Few-Shot Recognition

1 code implementation 21 Nov 2019 Riquan Chen, Tianshui Chen, Xiaolu Hui, Hefeng Wu, Guanbin Li, Liang Lin

In this work, we represent the semantic correlations in the form of structured knowledge graph and integrate this graph into deep neural networks to promote few-shot learning by a novel Knowledge Graph Transfer Network (KGTN).

Few-Shot Image Classification Few-Shot Learning +2

Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition

2 code implementations ICCV 2019 Tianshui Chen, Muxin Xu, Xiaolu Hui, Hefeng Wu, Liang Lin

Recognizing multiple labels of images is a practical and challenging task, and significant progress has been made by searching semantic-aware regions and modeling label dependency.

Graph Representation Learning Multi-Label Classification +1

Multi-column Point-CNN for Sketch Segmentation

no code implementations 28 Dec 2018 Fei Wang, Shujin Lin, Hanhui Li, Hefeng Wu, Junkun Jiang, Ruomei Wang, Xiaonan Luo

Traditional sketch segmentation methods mainly rely on handcrafted features and complicated models, and their performance is far from satisfactory due to the abstract representation of sketches.

ADCrowdNet: An Attention-injective Deformable Convolutional Network for Crowd Understanding

1 code implementation CVPR 2019 Ning Liu, Yongchao Long, Changqing Zou, Qun Niu, Li Pan, Hefeng Wu

We propose an attention-injective deformable convolutional network called ADCrowdNet for crowd understanding that can address the accuracy degradation problem of highly congested noisy scenes.

Crowd Counting

Structured Inhomogeneous Density Map Learning for Crowd Counting

no code implementations 20 Jan 2018 Hanhui Li, Xiangjian He, Hefeng Wu, Saeed Amirgholipour Kasmani, Ruomei Wang, Xiaonan Luo, Liang Lin

In this paper, we aim at tackling the problem of crowd counting in extremely high-density scenes, which contain hundreds, or even thousands of people.

Crowd Counting

Learning Deep Similarity Models with Focus Ranking for Fabric Image Retrieval

no code implementations 29 Dec 2017 Daiguo Deng, Ruomei Wang, Hefeng Wu, Huayong He, Qi Li, Xiaonan Luo

Fabric image retrieval is beneficial to many applications including clothing searching, online shopping and cloth modeling.

Image Retrieval Representation Learning +1
