Search Results for author: Huaibo Huang

Found 45 papers, 12 papers with code

Hierarchical Face Aging through Disentangled Latent Characteristics

no code implementations ECCV 2020 Pei-Pei Li, Huaibo Huang, Yibo Hu, Xiang Wu, Ran He, Zhenan Sun

To explore the age effects on facial images, we propose a Disentangled Adversarial Autoencoder (DAAE) to disentangle the facial images into three independent factors: age, identity and extraneous information.

Age Estimation MORPH

ViTAR: Vision Transformer with Any Resolution

no code implementations 27 Mar 2024 Qihang Fan, Quanzeng You, Xiaotian Han, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

Firstly, we propose a novel module for dynamic resolution adjustment, designed with a single Transformer block, specifically to achieve highly efficient incremental token integration.

Self-Supervised Learning Semantic Segmentation

DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration

no code implementations 15 Mar 2024 Nan Gao, Jia Li, Huaibo Huang, Zhi Zeng, Ke Shang, Shuwu Zhang, Ran He

Experimental results demonstrate the superiority of DiffMAC over state-of-the-art methods, with a high degree of generalization in real-world and heterogeneous settings.

Attribute Blind Face Restoration +1

FakeNewsGPT4: Advancing Multimodal Fake News Detection through Knowledge-Augmented LVLMs

no code implementations 4 Mar 2024 Xuannan Liu, Peipei Li, Huaibo Huang, Zekun Li, Xing Cui, Jiahao Liang, Lixiong Qin, Weihong Deng, Zhaofeng He

In this paper, we propose FakeNewsGPT4, a novel framework that augments Large Vision-Language Models (LVLMs) with forgery-specific knowledge for manipulation reasoning while inheriting extensive world knowledge as complementary.

Fake News Detection Informativeness +1

Multimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image Restoration

no code implementations 5 Dec 2023 Yuang Ai, Huaibo Huang, Xiaoqiang Zhou, Jiexiang Wang, Ran He

Extensive experiments on 16 IR tasks underscore the superiority of MPerceiver in terms of adaptiveness, generalizability and fidelity.

Image Restoration

Portrait Diffusion: Training-free Face Stylization with Chain-of-Painting

1 code implementation 3 Dec 2023 Jin Liu, Huaibo Huang, Chao Jin, Ran He

Face stylization refers to the transformation of a face into a specific portrait style.

Image Reconstruction

Exploring Straighter Trajectories of Flow Matching with Diffusion Guidance

no code implementations 28 Nov 2023 Siyu Xing, Jie Cao, Huaibo Huang, Xiao-Yu Zhang, Ran He

First, we propose a coupling strategy to straighten trajectories, creating couplings between image and noise samples under diffusion model guidance.

InstaStyle: Inversion Noise of a Stylized Image is Secretly a Style Adviser

no code implementations 25 Nov 2023 Xing Cui, Zekun Li, Pei Pei Li, Huaibo Huang, Zhaofeng He

We employ DDIM inversion to extract this noise from the reference image and leverage a diffusion model to generate new stylized images from the "style" noise.

Text-to-Image Generation

Video-CSR: Complex Video Digest Creation for Visual-Language Models

no code implementations 8 Oct 2023 Tingkai Liu, Yunzhe Tao, Haogeng Liu, Qihang Fan, Ding Zhou, Huaibo Huang, Ran He, Hongxia Yang

We present a novel task and human-annotated dataset for evaluating the ability of visual-language models to generate captions and summaries for real-world video clips, which we call Video-CSR (Captioning, Summarization and Retrieval).

Retrieval Sentence +1

Video-Teller: Enhancing Cross-Modal Generation with Fusion and Decoupling

no code implementations 8 Oct 2023 Haogeng Liu, Qihang Fan, Tingkai Liu, Linjie Yang, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

This paper proposes Video-Teller, a video-language foundation model that leverages multi-modal fusion and fine-grained modality alignment to significantly enhance the video-to-text generation task.

Text Generation Video Summarization

RMT: Retentive Networks Meet Vision Transformers

1 code implementation 20 Sep 2023 Qihang Fan, Huaibo Huang, Mingrui Chen, Hongmin Liu, Ran He

To alleviate these issues, we draw inspiration from the recent Retentive Network (RetNet) in the field of NLP, and propose RMT, a strong vision backbone with explicit spatial prior for general purposes.

Instance Segmentation object-detection +2

Learning-to-Rank Meets Language: Boosting Language-Driven Ordering Alignment for Ordinal Classification

2 code implementations NeurIPS 2023 Rui Wang, Peipei Li, Huaibo Huang, Chunshui Cao, Ran He, Zhaofeng He

Consequently, we propose a cross-modal ordinal pairwise loss to refine the CLIP feature space, where texts and images maintain both semantic alignment and ordering alignment.

Age Estimation Classification +2

Lightweight Vision Transformer with Bidirectional Interaction

1 code implementation NeurIPS 2023 Qihang Fan, Huaibo Huang, Xiaoqiang Zhou, Ran He

This paper proposes a Fully Adaptive Self-Attention (FASA) mechanism for vision transformer to model the local and global information as well as the bidirectional interaction between them in context-aware ways.

Rethinking Local Perception in Lightweight Vision Transformer

1 code implementation 31 Mar 2023 Qihang Fan, Huaibo Huang, Jiyang Guan, Ran He

In CloFormer, combining AttnConv with vanilla attention, which uses pooling to reduce FLOPs, enables the model to perceive both high-frequency and low-frequency information.

Image Classification object-detection +2

Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer

no code implementations 31 Mar 2023 Yuang Ai, Xiaoqiang Zhou, Huaibo Huang, Lei Zhang, Ran He

Unsupervised Domain Adaptation (UDA) can effectively address domain gap issues in real-world image Super-Resolution (SR) by accessing both the source and target data.

Image Super-Resolution Source-Free Domain Adaptation +1

Pluralistic Aging Diffusion Autoencoder

no code implementations ICCV 2023 Peipei Li, Rui Wang, Huaibo Huang, Ran He, Zhaofeng He

Face aging is an ill-posed problem because multiple plausible aging patterns may correspond to a given input.

Denoising

MSRA-SR: Image Super-resolution Transformer with Multi-scale Shared Representation Acquisition

no code implementations ICCV 2023 Xiaoqiang Zhou, Huaibo Huang, Ran He, Zilei Wang, Jie Hu, Tieniu Tan

In particular, self-attention with cross-scale matching and convolution filters with different kernel sizes are designed to exploit the multi-scale features in images.

Image Super-Resolution

Vision Transformer with Super Token Sampling

1 code implementation CVPR 2023 Huaibo Huang, Xiaoqiang Zhou, Jie Cao, Ran He, Tieniu Tan

STA decomposes vanilla global attention into multiplications of a sparse association map and a low-dimensional attention, leading to high efficiency in capturing global dependencies.

Semantic Segmentation Superpixels

Parallel Augmentation and Dual Enhancement for Occluded Person Re-identification

1 code implementation 11 Oct 2022 Zi Wang, Huaibo Huang, Aihua Zheng, Chenglong Li, Ran He

To alleviate these two issues, we propose a simple yet effective method with Parallel Augmentation and Dual Enhancement (PADE), which is robust on both occluded and non-occluded data and does not require any auxiliary clues.

Person Re-Identification

Contrastive Attention Network with Dense Field Estimation for Face Completion

no code implementations 20 Dec 2021 Xin Ma, Xiaoqiang Zhou, Huaibo Huang, Gengyun Jia, Zhenhua Chai, Xiaolin Wei

This multi-scale architecture is beneficial for the decoder to utilize discriminative representations learned from encoders into images.

Face Recognition Facial Inpainting

Toward Accurate and Reliable Iris Segmentation Using Uncertainty Learning

no code implementations 20 Oct 2021 Jianze Wei, Huaibo Huang, Muyi Sun, Yunlong Wang, Min Ren, Ran He, Zhenan Sun

To make further efforts on accurate and reliable iris segmentation, we propose a bilateral self-attention module and design Bilateral Transformer (BiTrans) with hierarchical architecture by exploring spatial and visual relationships.

Iris Recognition Iris Segmentation +1

Causal Representation Learning for Context-Aware Face Transfer

no code implementations 4 Oct 2021 Gege Gao, Huaibo Huang, Chaoyou Fu, Ran He

Human face synthesis involves transferring knowledge about the identity and identity-dependent face shape (IDFS) of a human face to target face images where the context (e.g., facial expressions, head poses, and other background factors) may change dramatically.

counterfactual Counterfactual Inference +4

Universal Face Restoration With Memorized Modulation

no code implementations 3 Oct 2021 Jia Li, Huaibo Huang, Xiaofei Jia, Ran He

Blind face restoration (BFR) is a challenging problem because of the uncertainty of the degradation patterns.

Blind Face Restoration

Information Bottleneck Disentanglement for Identity Swapping

1 code implementation CVPR 2021 Gege Gao, Huaibo Huang, Chaoyou Fu, Zhaoyang Li, Ran He

In this work, we propose a novel information disentangling and swapping network, called InfoSwap, to extract the most expressive information for identity representation from a pre-trained face recognition model.

Disentanglement Face Recognition +1

Memory Oriented Transfer Learning for Semi-Supervised Image Deraining

no code implementations CVPR 2021 Huaibo Huang, Aijing Yu, Ran He

To address this issue, we propose a memory-oriented semi-supervised (MOSS) method which enables the network to explore and exploit the properties of rain streaks from both synthetic and real data.

Rain Removal Transfer Learning

Free-Form Image Inpainting via Contrastive Attention Network

no code implementations 29 Oct 2020 Xin Ma, Xiaoqiang Zhou, Huaibo Huang, Zhenhua Chai, Xiaolin Wei, Ran He

It is difficult for encoders to capture such powerful representations under this complex situation.

Image Inpainting

DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition

1 code implementation 20 Sep 2020 Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He

As a consequence, massive new diverse paired heterogeneous images with the same identity can be generated from noises.

Contrastive Learning Face Recognition +1

Cosmetic-Aware Makeup Cleanser

no code implementations 20 Apr 2020 Yi Li, Huaibo Huang, Junchi Yu, Ran He, Tieniu Tan

Face verification aims at determining whether a pair of face images belongs to the same identity.

Face Parsing Face Verification +1

Informative Sample Mining Network for Multi-Domain Image-to-Image Translation

no code implementations ECCV 2020 Jie Cao, Huaibo Huang, Yi Li, Ran He, Zhenan Sun

The performance of multi-domain image-to-image translation has been significantly improved by recent progress in deep generative models.

Image-to-Image Translation Informativeness +1

LAMP-HQ: A Large-Scale Multi-Pose High-Quality Database and Benchmark for NIR-VIS Face Recognition

no code implementations 17 Dec 2019 Aijing Yu, Haoxue Wu, Huaibo Huang, Zhen Lei, Ran He

A spectral conditional attention module is introduced to reduce the domain gap between NIR and VIS data and then improve the performance of NIR-VIS heterogeneous face recognition on various databases including the LAMP-HQ.

Attribute Face Recognition +1

Dual Variational Generation for Low Shot Heterogeneous Face Recognition

no code implementations NeurIPS 2019 Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He

Specifically, we first introduce a dual variational autoencoder to represent a joint distribution of paired heterogeneous images.

Face Recognition Heterogeneous Face Recognition

Biphasic Learning of GANs for High-Resolution Image-to-Image Translation

no code implementations 14 Apr 2019 Jie Cao, Huaibo Huang, Yi Li, Jingtuo Liu, Ran He, Zhenan Sun

In this work, we present a novel training framework for GANs, namely biphasic learning, to achieve image-to-image translation in multiple visual domains at $1024^2$ resolution.

Image-to-Image Translation Mutual Information Estimation +2

UVA: A Universal Variational Framework for Continuous Age Analysis

no code implementations 30 Mar 2019 Pei-Pei Li, Huaibo Huang, Yibo Hu, Xiang Wu, Ran He, Zhenan Sun

UVA is the first attempt to achieve facial age analysis tasks, including age translation, age generation and age estimation, in a universal framework.

Age Estimation MORPH +1

Dual Variational Generation for Low-Shot Heterogeneous Face Recognition

1 code implementation 25 Mar 2019 Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He

Then, to ensure the identity consistency of the generated paired heterogeneous images, we impose a distribution alignment in the latent space and a pairwise identity-preserving constraint in the image space.

Face Recognition Heterogeneous Face Recognition

A Survey of Deep Facial Attribute Analysis

no code implementations 26 Dec 2018 Xin Zheng, Yanqing Guo, Huaibo Huang, Yi Li, Ran He

Deep learning based facial attribute analysis consists of two basic sub-issues: facial attribute estimation (FAE), which recognizes whether facial attributes are present in given images, and facial attribute manipulation (FAM), which synthesizes or removes desired facial attributes.

Attribute

Arbitrary Talking Face Generation via Attentional Audio-Visual Coherence Learning

no code implementations 17 Dec 2018 Hao Zhu, Huaibo Huang, Yi Li, Aihua Zheng, Ran He

Talking face generation aims to synthesize a face video with precise lip synchronization as well as a smooth transition of facial motion over the entire video via the given speech clip and facial image.

Talking Face Generation

Disentangled Variational Representation for Heterogeneous Face Recognition

no code implementations 6 Sep 2018 Xiang Wu, Huaibo Huang, Vishal M. Patel, Ran He, Zhenan Sun

Visible (VIS) to near-infrared (NIR) face matching is a challenging problem due to the significant discrepancy between the two domains and a lack of sufficient data for training cross-modal matching algorithms.

Face Recognition Heterogeneous Face Recognition

IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis

3 code implementations NeurIPS 2018 Huaibo Huang, Zhihang Li, Ran He, Zhenan Sun, Tieniu Tan

On the other hand, the inference model is encouraged to classify between the generated and real samples while the generator tries to fool it, as in GANs.

Image Generation
