Search Results for author: Kaihao Zhang

Found 55 papers, 28 papers with code

Beyond Monocular Deraining: Stereo Image Deraining via Semantic Understanding

no code implementations • ECCV 2020 • Kaihao Zhang, Wenhan Luo, Wenqi Ren, Jingwen Wang, Fang Zhao, Lin Ma, Hongdong Li

Moreover, even for single-image-based monocular deraining, many current methods fail to complete the task satisfactorily because they mostly rely on per-pixel loss functions and ignore semantic information.

Benchmarking Rain Removal +1

Unsupervised Domain Adaptation with Noise Resistible Mutual-Training for Person Re-identification

no code implementations • ECCV 2020 • Fang Zhao, Shengcai Liao, Guo-Sen Xie, Jian Zhao, Kaihao Zhang, Ling Shao

On the other hand, mutual instance selection further selects reliable and informative instances for training according to the peer-confidence and relationship disagreement of the networks.

Clustering Person Re-Identification +2

Authentic Emotion Mapping: Benchmarking Facial Expressions in Real News

no code implementations • 21 Apr 2024 • Qixuan Zhang, Zhifeng Wang, Yang Liu, Zhenyue Qin, Kaihao Zhang, Sabrina Caldwell, Tom Gedeon

In this paper, we present a novel benchmark for Emotion Recognition using facial landmarks extracted from realistic news videos.

Benchmarking Emotion Recognition

Homography Guided Temporal Fusion for Road Line and Marking Segmentation

2 code implementations • ICCV 2023 • Shan Wang, Chuong Nguyen, Jiawei Liu, Kaihao Zhang, Wenhan Luo, Yanhao Zhang, Sundaram Muthu, Fahira Afzal Maken, Hongdong Li

Reliable segmentation of road lines and markings is critical to autonomous driving.

Autonomous Driving Segmentation

OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models

1 code implementation • 16 Mar 2024 • Zhe Kong, Yong Zhang, Tianyu Yang, Tao Wang, Kaihao Zhang, Bizhu Wu, GuanYing Chen, Wei Liu, Wenhan Luo

We also observe that the initiation denoising timestep for noise blending is the key to identity preservation and layout.

Denoising Text-to-Image Generation

AS-FIBA: Adaptive Selective Frequency-Injection for Backdoor Attack on Deep Face Restoration

no code implementations • 11 Mar 2024 • Zhenbo Song, Wenhao Gao, Kaihao Zhang, Wenhan Luo, Zhaoxin Fan, Jianfeng Lu

Extensive experiments demonstrate the efficacy of the degradation objective on state-of-the-art face restoration models.

Backdoor Attack

Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration

no code implementations • 9 Mar 2024 • Jingyun Xue, Tao Wang, Jun Wang, Kaihao Zhang, Wenhan Luo, Wenqi Ren, Zikun Liu, Hyunhee Park, Xiaochun Cao

Specifically, we utilize sparse self-attention to filter out redundant information and noise, directing the model's attention to focus on the features more relevant to the degraded regions in need of reconstruction.

Image Restoration Instance Segmentation +1

Adversarial Purification and Fine-tuning for Robust UDC Image Restoration

no code implementations • 21 Feb 2024 • Zhenbo Song, Zhenyuan Zhang, Kaihao Zhang, Wenhan Luo, Zhaoxin Fan, Jianfeng Lu

This study delves into the enhancement of Under-Display Camera (UDC) image restoration models, focusing on their robustness against adversarial attacks.

Image Restoration

PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal

1 code implementation • 4 Feb 2024 • Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae-Kyun Kim, Tong Lu, Hongdong Li, Ming-Hsuan Yang

For the prompt generation, we first propose a prompt pre-training strategy to train a frequency prompt encoder that encodes the ground-truth image into LF and HF prompts.

Reflection Removal

LRDif: Diffusion Models for Under-Display Camera Emotion Recognition

no code implementations • 1 Feb 2024 • Zhifeng Wang, Kaihao Zhang, Ramesh Sankaranarayana

This study introduces LRDif, a novel diffusion-based framework designed specifically for facial expression recognition (FER) within the context of under-display cameras (UDC).

Emotion Recognition Facial Expression Recognition +1

Dual Teacher Knowledge Distillation with Domain Alignment for Face Anti-spoofing

no code implementations • 2 Jan 2024 • Zhe Kong, Wentian Zhang, Tao Wang, Kaihao Zhang, Yuexiang Li, Xiaoying Tang, Wenhan Luo

In this paper, we propose a domain adversarial attack (DAA) method to mitigate the training instability problem by adding perturbations to the input images, which makes them indistinguishable across domains and enables domain alignment.

Adversarial Attack Face Anti-Spoofing +2

Towards Real-World Blind Face Restoration with Generative Diffusion Prior

1 code implementation • 25 Dec 2023 • Xiaoxu Chen, Jingfan Tan, Tao Wang, Kaihao Zhang, Wenhan Luo, Xiaochun Cao

We propose BFRffusion which is thoughtfully designed to effectively extract features from low-quality face images and could restore realistic and faithful facial details with the generative prior of the pretrained Stable Diffusion.

Blind Face Restoration Privacy Preserving

Deep Video Restoration for Under-Display Camera

no code implementations • 9 Sep 2023 • Xuanxi Chen, Tao Wang, Ziqian Shao, Kaihao Zhang, Wenhan Luo, Tong Lu, Zikun Liu, Tae-Kyun Kim, Hongdong Li

With the pipeline, we build the first large-scale UDC video restoration dataset called PexelsUDC, which includes two subsets named PexelsUDC-T and PexelsUDC-P corresponding to different displays for UDC.

Video Restoration

MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing

1 code implementation • ICCV 2023 • Yuwei Qiu, Kaihao Zhang, Chenxi Wang, Wenhan Luo, Hongdong Li, Zhi Jin

To address this issue, we propose a new Transformer variant, which applies the Taylor expansion to approximate the softmax-attention and achieves linear computational complexity.

Image Dehazing

Blind Face Restoration for Under-Display Camera via Dictionary Guided Transformer

no code implementations • 20 Aug 2023 • Jingfan Tan, Xiaoxu Chen, Tao Wang, Kaihao Zhang, Wenhan Luo, Xiaochun Cao

However, due to the characteristics of the display, images taken by UDC suffer from significant quality degradation.

Blind Face Restoration Image Restoration

InterTracker: Discovering and Tracking General Objects Interacting with Hands in the Wild

no code implementations • 6 Aug 2023 • Yanyan Shao, Qi Ye, Wenhan Luo, Kaihao Zhang, Jiming Chen

Understanding human interaction with objects is an important research topic for embodied Artificial Intelligence and identifying the objects that humans are interacting with is a primary problem for interaction understanding.

Object Object Tracking

Benchmarking Ultra-High-Definition Image Reflection Removal

1 code implementation • 1 Aug 2023 • Zhenyuan Zhang, Zhenbo Song, Kaihao Zhang, Wenhan Luo, Zhaoxin Fan, Jianfeng Lu

To the best of our knowledge, these two datasets are the first largest-scale UHD datasets for SIRR.

Benchmarking Image Restoration +1

LLDiffusion: Learning Degradation Representations in Diffusion Models for Low-Light Image Enhancement

1 code implementation • 27 Jul 2023 • Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo, Bjorn Stenger, Tae-Kyun Kim, Wei Liu, Hongdong Li

In this paper, we address this limitation by proposing a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process, resulting in improved image enhancement.

Image Generation Low-Light Image Enhancement

HTNet for micro-expression recognition

1 code implementation • 27 Jul 2023 • Zhifeng Wang, Kaihao Zhang, Wenhan Luo, Ramesh Sankaranarayana

The transformer layer is used to focus on representing local minor muscle movement with local self-attention in each area.

Ranked #1 on Micro-Expression Recognition on CASME II

Facial Emotion Recognition Micro Expression Recognition +1

Model Calibration in Dense Classification with Adaptive Label Perturbation

1 code implementation • ICCV 2023 • Jiawei Liu, Changkun Ye, Shan Wang, Ruikai Cui, Jing Zhang, Kaihao Zhang, Nick Barnes

To improve model calibration, we propose Adaptive Stochastic Label Perturbation (ASLP) which learns a unique label perturbation level for each training image.

Binary Classification Classification +1

GridFormer: Residual Dense Transformer with Grid Structure for Image Restoration in Adverse Weather Conditions

no code implementations • 29 May 2023 • Tao Wang, Kaihao Zhang, Ziqian Shao, Wenhan Luo, Bjorn Stenger, Tong Lu, Tae-Kyun Kim, Wei Liu, Hongdong Li

Second, we introduce a residual dense transformer block (RDTB) as the final GridFormer layer.

Image Restoration Rain Removal

Towards an Effective and Efficient Transformer for Rain-by-snow Weather Removal

1 code implementation • 6 Apr 2023 • Tao Gao, Yuanbo Wen, Kaihao Zhang, Peng Cheng, Ting Chen

Rain-by-snow weather removal is a specialized task in weather-degraded image restoration aiming to eliminate coexisting rain streaks and snow particles.

Image Restoration

F&F Attack: Adversarial Attack against Multiple Object Trackers by Inducing False Negatives and False Positives

no code implementations • ICCV 2023 • Tao Zhou, Qi Ye, Wenhan Luo, Kaihao Zhang, Zhiguo Shi, Jiming Chen

Multi-object tracking (MOT) aims to build moving trajectories for number-agnostic objects.

Adversarial Attack Multi-Object Tracking +1

Robust Single Image Reflection Removal Against Adversarial Attacks

1 code implementation • CVPR 2023 • Zhenbo Song, Zhenyuan Zhang, Kaihao Zhang, Wenhan Luo, Zhaoxin Fan, Wenqi Ren, Jianfeng Lu

This paper addresses the problem of robust deep single-image reflection removal (SIRR) against adversarial attacks.

Ranked #2 on Reflection Removal on Real20

Reflection Removal

Restoring Vision in Hazy Weather with Hierarchical Contrastive Learning

no code implementations • 22 Dec 2022 • Tao Wang, Guangpin Tao, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Xiaoqin Zhang, Tong Lu

HCD consists of a hierarchical dehazing network (HDN) and a novel hierarchical contrastive loss (HCL).

Contrastive Learning Image Dehazing +3

Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method

1 code implementation • 22 Dec 2022 • Tao Wang, Kaihao Zhang, Tianrun Shen, Wenhan Luo, Bjorn Stenger, Tong Lu

In this paper, we consider the task of low-light image enhancement (LLIE) and introduce a large-scale database consisting of images at 4K and 8K resolution.

4k 8k +3

A Survey of Deep Face Restoration: Denoise, Super-Resolution, Deblur, Artifact Removal

1 code implementation • 5 Nov 2022 • Tao Wang, Kaihao Zhang, Xuanxi Chen, Wenhan Luo, Jiankang Deng, Tong Lu, Xiaochun Cao, Wei Liu, Hongdong Li, Stefanos Zafeiriou

Second, we discuss the challenges of face restoration.

Image Restoration Super-Resolution

Generalised Co-Salient Object Detection

no code implementations • 20 Aug 2022 • Jiawei Liu, Jing Zhang, Ruikai Cui, Kaihao Zhang, Weihao Li, Nick Barnes

We propose a new setting that relaxes an assumption in the conventional Co-Salient Object Detection (CoSOD) setting by allowing the presence of "noisy images" which do not show the shared co-salient object.

Co-Salient Object Detection Object +3

Multi-Prior Learning via Neural Architecture Search for Blind Face Restoration

1 code implementation • 28 Jun 2022 • Yanjiang Yu, Puyang Zhang, Kaihao Zhang, Wenhan Luo, Changsheng Li, Ye Yuan, Guoren Wang

To this end, we propose a Face Restoration Searching Network (FRSNet) to adaptively search the suitable feature extraction architecture within our specified search space, which can directly contribute to the restoration quality.

Blind Face Restoration Neural Architecture Search

Vicinity Vision Transformer

1 code implementation • 21 Jun 2022 • Weixuan Sun, Zhen Qin, Hui Deng, Jianyuan Wang, Yi Zhang, Kaihao Zhang, Nick Barnes, Stan Birchfield, Lingpeng Kong, Yiran Zhong

Based on this observation, we present a Vicinity Attention that introduces a locality bias to vision transformers with linear complexity.

Image Classification

Blind Face Restoration: Benchmark Datasets and a Baseline Model

2 code implementations • 8 Jun 2022 • Puyang Zhang, Kaihao Zhang, Wenhan Luo, Changsheng Li, Guoren Wang

To address this problem, we first synthesize two blind face restoration benchmark datasets called EDFace-Celeb-1M (BFR128) and EDFace-Celeb-150K (BFR512).

Blind Face Restoration

From heavy rain removal to detail restoration: A faster and better network

1 code implementation • 7 May 2022 • Yuanbo Wen, Tao Gao, Jing Zhang, Kaihao Zhang, Ting Chen

This approach comprises two key modules, a rain streaks removal network (R$^2$Net) focusing on accurate rain removal, and a details reconstruction network (DRNet) designed to recover the textural details of rain-free images.

Rain Removal

Deep Image Deblurring: A Survey

no code implementations • 26 Jan 2022 • Kaihao Zhang, Wenqi Ren, Wenhan Luo, Wei-Sheng Lai, Bjorn Stenger, Ming-Hsuan Yang, Hongdong Li

Image deblurring is a classic problem in low-level computer vision with the aim to recover a sharp image from a blurred input image.

Deblurring Image Deblurring

MC-Blur: A Comprehensive Benchmark for Image Deblurring

2 code implementations • 1 Dec 2021 • Kaihao Zhang, Tao Wang, Wenhan Luo, Boheng Chen, Wenqi Ren, Bjorn Stenger, Wei Liu, Hongdong Li, Ming-Hsuan Yang

Blur artifacts can seriously degrade the visual quality of images, and numerous deblurring methods have been proposed for specific scenarios.

Benchmarking Deblurring +1

Dense Uncertainty Estimation

1 code implementation • 13 Oct 2021 • Jing Zhang, Yuchao Dai, Mochu Xiang, Deng-Ping Fan, Peyman Moghadam, Mingyi He, Christian Walder, Kaihao Zhang, Mehrtash Harandi, Nick Barnes

Deep neural networks can be roughly divided into deterministic neural networks and stochastic neural networks. The former is usually trained to achieve a mapping from input space to output space via maximum likelihood estimation for the weights, which leads to deterministic predictions during testing.

Decision Making

EDFace-Celeb-1M: Benchmarking Face Hallucination with a Million-scale Dataset

1 code implementation • 11 Oct 2021 • Kaihao Zhang, Dongxu Li, Wenhan Luo, Jingyu Liu, Jiankang Deng, Wei Liu, Stefanos Zafeiriou

It is thus unclear how these algorithms perform on public face hallucination datasets.

Ranked #1 on Image Super-Resolution on WLFW

Benchmarking Face Hallucination +2

T-Net: Deep Stacked Scale-Iteration Network for Image Dehazing

no code implementations • 5 Jun 2021 • Lirong Zheng, Yanshan Li, Kaihao Zhang, Wenhan Luo

In order to reduce network parameters, the intra-stage recursive computation of ResNet is adopted in our Stack T-Net.

Image Dehazing

Blind Motion Deblurring Super-Resolution: When Dynamic Spatio-Temporal Learning Meets Static Image Understanding

no code implementations • 27 May 2021 • Wenjia Niu, Kaihao Zhang, Wenhan Luo, Yiran Zhong

Single-image super-resolution (SR) and multi-frame SR are two ways to super resolve low-resolution images.

Deblurring Image Deblurring +1

Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior

no code implementations • 9 May 2021 • Kaihao Zhang, Wenhan Luo, Yanjiang Yu, Wenqi Ren, Fang Zhao, Changsheng Li, Lin Ma, Wei Liu, Hongdong Li

We first use a coarse deraining network to reduce the rain streaks on the input images, and then adopt a pre-trained semantic segmentation network to extract semantic features from the coarse derained image.

Benchmarking Rain Removal +1

Deep Two-View Structure-from-Motion Revisited

1 code implementation • CVPR 2021 • Jianyuan Wang, Yiran Zhong, Yuchao Dai, Stan Birchfield, Kaihao Zhang, Nikolai Smolyanskiy, Hongdong Li

Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM.

Ranked #25 on Monocular Depth Estimation on KITTI Eigen split

3D Reconstruction Monocular Depth Estimation +3

Enhanced Spatio-Temporal Interaction Learning for Video Deraining: A Faster and Better Framework

1 code implementation • 23 Mar 2021 • Kaihao Zhang, Dongxu Li, Wenhan Luo, Wenqi Ren, Wei Liu

Video deraining is an important task in computer vision as the unwanted rain hampers the visibility of videos and deteriorates the robustness of most outdoor vision systems.

Rain Removal

Deep Dense Multi-scale Network for Snow Removal Using Semantic and Geometric Priors

no code implementations • 21 Mar 2021 • Kaihao Zhang, Rongqing Li, Yanjiang Yu, Wenhan Luo, Changsheng Li, Hongdong Li

Images captured in snowy days suffer from noticeable degradation of scene visibility, which degenerates the performance of current vision-based intelligent systems.

Image Restoration Snow Removal

Dual Attention-in-Attention Model for Joint Rain Streak and Raindrop Removal

no code implementations • 12 Mar 2021 • Kaihao Zhang, Dongxu Li, Wenhan Luo, Wenqi Ren

In addition, to further refine the result, a Differential-driven Dual Attention-in-Attention Model (D-DAiAM) is proposed with a "heavy-to-light" scheme to remove rain via addressing the unsatisfying deraining regions.

Rain Removal

ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

no code implementations • CVPR 2021 • Dongxu Li, Chenchen Xu, Kaihao Zhang, Xin Yu, Yiran Zhong, Wenqi Ren, Hanna Suominen, Hongdong Li

Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions.

Deblurring

Benchmarking Ultra-High-Definition Image Super-Resolution

no code implementations • ICCV 2021 • Kaihao Zhang, Dongxu Li, Wenhan Luo, Wenqi Ren, Bjorn Stenger, Wei Liu, Hongdong Li, Ming-Hsuan Yang

Increasingly, modern mobile devices allow capturing images at Ultra-High-Definition (UHD) resolution, which includes 4K and 8K images.

4k 8k +3

Pyramid Architecture Search for Real-Time Image Deblurring

no code implementations • ICCV 2021 • Xiaobin Hu, Wenqi Ren, Kaicheng Yu, Kaihao Zhang, Xiaochun Cao, Wei Liu, Bjoern Menze

Multi-scale and multi-patch deep models have been shown effective in removing blurs of dynamic scenes.

Binarization Deblurring +2

Displacement-Invariant Cost Computation for Efficient Stereo Matching

no code implementations • 1 Dec 2020 • Yiran Zhong, Charles Loop, Wonmin Byeon, Stan Birchfield, Yuchao Dai, Kaihao Zhang, Alexey Kamenev, Thomas Breuel, Hongdong Li, Jan Kautz

A common way to speed up the computation is to downsample the feature volume, but this loses high-frequency details.

Autonomous Driving Stereo Matching

Human Parsing Based Texture Transfer from Single Image to 3D Human via Cross-View Consistency

1 code implementation • NeurIPS 2020 • Fang Zhao, Shengcai Liao, Kaihao Zhang, Ling Shao

This paper proposes a human parsing based texture transfer model via cross-view consistency learning to generate the texture of 3D human body from a single image.

Human Parsing Image to 3D +2

Displacement-Invariant Matching Cost Learning for Accurate Optical Flow Estimation

3 code implementations • NeurIPS 2020 • Jianyuan Wang, Yiran Zhong, Yuchao Dai, Kaihao Zhang, Pan Ji, Hongdong Li

Learning matching costs has been shown to be critical to the success of the state-of-the-art deep stereo matching methods, in which 3D convolutions are applied on a 4D feature volume to learn a 3D cost volume.

Optical Flow Estimation Stereo Matching

TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation

2 code implementations • NeurIPS 2020 • Dongxu Li, Chenchen Xu, Xin Yu, Kaihao Zhang, Ben Swift, Hanna Suominen, Hongdong Li

Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences.

Sign Language Recognition Sign Language Translation +3

Single Image Super-Resolution via a Holistic Attention Network

2 code implementations • ECCV 2020 • Ben Niu, Weilei Wen, Wenqi Ren, Xiangde Zhang, Lianping Yang, Shuzhen Wang, Kaihao Zhang, Xiaochun Cao, Haifeng Shen

Informative features play a crucial role in the single image super-resolution task.

Ranked #2 on Image Super-Resolution on Urban100 - 8x upscaling

Image Super-Resolution

Deblurring by Realistic Blurring

1 code implementation • CVPR 2020 • Kaihao Zhang, Wenhan Luo, Yiran Zhong, Lin Ma, Bjorn Stenger, Wei Liu, Hongdong Li

To address this problem, we propose a new method which combines two GAN models, i.e., a learning-to-Blur GAN (BGAN) and a learning-to-DeBlur GAN (DBGAN), in order to learn a better model for image deblurring by primarily learning how to blur images.

Ranked #17 on Deblurring on HIDE (trained on GOPRO)

Deblurring Image Deblurring

STH: Spatio-Temporal Hybrid Convolution for Efficient Action Recognition

no code implementations • 18 Mar 2020 • Xu Li, Jingwen Wang, Lin Ma, Kaihao Zhang, Fengzong Lian, Zhanhui Kang, Jinjun Wang

Such a design enables efficient spatio-temporal modeling and maintains a small model scale.

Action Recognition

Learning Joint Gait Representation via Quintuplet Loss Minimization

no code implementations • CVPR 2019 • Kaihao Zhang, Wenhan Luo, Lin Ma, Wei Liu, Hongdong Li

Gait recognition is an important biometric method popularly used in video surveillance, where the task is to identify people at a distance by their walking patterns from video sequences.

Gait Recognition

Adversarial Spatio-Temporal Learning for Video Deblurring

1 code implementation • 28 Mar 2018 • Kaihao Zhang, Wenhan Luo, Yiran Zhong, Lin Ma, Wei Liu, Hongdong Li

To tackle the second challenge, we leverage the developed DBLRNet as a generator in the GAN (generative adversarial network) architecture, and employ a content loss in addition to an adversarial loss for efficient adversarial training.

Deblurring Generative Adversarial Network
