Search Results for author: Zhengzhong Tu

Found 24 papers, 12 papers with code

Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving

no code implementations7 Apr 2024 Jinlong Li, Baolu Li, Zhengzhong Tu, Xinyu Liu, Qing Guo, Felix Juefei-Xu, Runsheng Xu, Hongkai Yu

Vision-centric perception systems for autonomous driving have gained considerable attention recently due to their cost-effectiveness and scalability, especially compared to LiDAR-based systems.

Autonomous Driving

Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

no code implementations1 Apr 2024 Kangfu Mei, Zhengzhong Tu, Mauricio Delbracio, Hossein Talebi, Vishal M. Patel, Peyman Milanfar

We study the scaling properties of latent diffusion models (LDMs) with an emphasis on their sampling efficiency.

TIP: Text-Driven Image Processing with Semantic and Restoration Instructions

no code implementations18 Dec 2023 Chenyang Qi, Zhengzhong Tu, Keren Ye, Mauricio Delbracio, Peyman Milanfar, Qifeng Chen, Hossein Talebi

Text-driven diffusion models have become increasingly popular for various image editing tasks, including inpainting, stylization, and object replacement.

Deblurring Denoising +2

CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation

1 code implementation2 Oct 2023 Kangfu Mei, Mauricio Delbracio, Hossein Talebi, Zhengzhong Tu, Vishal M. Patel, Peyman Milanfar

Our conditional-task learning and distillation approach outperforms previous distillation methods, achieving a new state-of-the-art in producing high-quality images with very few steps (e.g., 1-4) across multiple tasks, including super-resolution, text-guided image editing, and depth-to-image generation.

Image Enhancement Super-Resolution +1
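To make the few-step claim above concrete, here is a minimal, hypothetical sketch of sampling from a distilled conditional diffusion model in a handful of network calls. The `distilled_model` interface, the linear noise schedule, and the 4-step default are illustrative assumptions, not CoDi's actual algorithm or API.

```python
# Hypothetical sketch of few-step sampling with a distilled conditional
# diffusion model. Interfaces and schedule are illustrative assumptions.
import torch

@torch.no_grad()
def few_step_sample(distilled_model, condition, steps=4, shape=(1, 3, 256, 256)):
    """Iteratively denoise from pure noise in a handful of steps.

    Assumes distilled_model(x_t, sigma, condition) -> predicted clean image x0.
    """
    x = torch.randn(shape)                          # start from Gaussian noise
    # Evenly spaced noise levels from 1.0 (pure noise) down to 0.0 (clean).
    sigmas = torch.linspace(1.0, 0.0, steps + 1)
    for i in range(steps):
        x0_pred = distilled_model(x, sigmas[i], condition)     # one network call per step
        # Re-noise the prediction to the next (lower) noise level before the next step.
        x = x0_pred + sigmas[i + 1] * torch.randn_like(x0_pred)
    return x0_pred
```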

MULLER: Multilayer Laplacian Resizer for Vision

1 code implementation ICCV 2023 Zhengzhong Tu, Peyman Milanfar, Hossein Talebi

Specifically, we select a state-of-the-art vision Transformer, MaxViT, as the baseline and show that, when trained with MULLER, MaxViT gains up to 0.6% top-1 accuracy on ImageNet-1k, and also saves 36% of inference cost while reaching similar top-1 accuracy, compared to the standard training scheme.

Image Classification Image Quality Assessment +2
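As a rough illustration of what a multilayer Laplacian resizer can look like, the sketch below resizes an image and then adds back learnable-scaled band-pass (Laplacian) detail at each layer. The layer count, the box-blur low-pass, and the scale/bias parameterization are assumptions for illustration, not the exact MULLER formulation.

```python
# Illustrative multilayer Laplacian-style resizer: resize, then boost
# band-pass detail with a few learnable parameters per layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LaplacianResizer(nn.Module):
    def __init__(self, num_layers=3):
        super().__init__()
        # One learnable scale and bias per layer -- only a handful of parameters.
        self.scales = nn.Parameter(torch.zeros(num_layers))
        self.biases = nn.Parameter(torch.zeros(num_layers))

    def forward(self, x, out_size):
        y = F.interpolate(x, size=out_size, mode='bilinear', align_corners=False)
        for s, b in zip(self.scales, self.biases):
            blurred = F.avg_pool2d(y, kernel_size=3, stride=1, padding=1)  # cheap low-pass
            detail = y - blurred                    # band-pass (Laplacian) component
            y = y + s * detail + b                  # learnable detail boost
        return y
```

With zero-initialized scales and biases, this module starts out as a plain bilinear resizer and only learns to sharpen details as training proceeds.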

CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers

2 code implementations5 Jul 2022 Runsheng Xu, Zhengzhong Tu, Hao Xiang, Wei Shao, Bolei Zhou, Jiaqi Ma

The extensive experiments on the V2V perception dataset, OPV2V, demonstrate that CoBEVT achieves state-of-the-art performance for cooperative BEV semantic segmentation.

3D Object Detection Autonomous Driving +2

Pik-Fix: Restoring and Colorizing Old Photos

1 code implementation4 May 2022 Runsheng Xu, Zhengzhong Tu, Yuanqi Du, Xiaoyu Dong, Jinlong Li, Zibo Meng, Jiaqi Ma, Alan Bovik, Hongkai Yu

Our proposed framework consists of three modules: a restoration sub-network that removes degradations, a similarity network that performs color histogram matching and color transfer, and a colorization subnet that learns to predict the chroma elements of images conditioned on chromatic reference signals.

Colorization
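The sketch below shows one plausible way to wire together the three modules described in the abstract; the module classes and their interfaces are placeholders, and only the data flow (restore, derive a chromatic reference signal, then predict chroma) follows the description.

```python
# Placeholder wiring of a restore -> color-reference -> colorize pipeline.
import torch
import torch.nn as nn

class OldPhotoPipeline(nn.Module):
    def __init__(self, restorer: nn.Module, similarity: nn.Module, colorizer: nn.Module):
        super().__init__()
        self.restorer = restorer      # removes scratches, noise, and other degradations
        self.similarity = similarity  # histogram matching / color transfer vs. a reference
        self.colorizer = colorizer    # predicts chroma conditioned on the reference signal

    def forward(self, old_photo: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        restored_luma = self.restorer(old_photo)                 # degradation removal
        color_hint = self.similarity(restored_luma, reference)   # chromatic reference signal
        chroma = self.colorizer(restored_luma, color_hint)       # predicted color channels
        return torch.cat([restored_luma, chroma], dim=1)         # e.g., Lab-style output
```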

MaxViT: Multi-Axis Vision Transformer

14 code implementations4 Apr 2022 Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

We also show that our proposed model expresses strong generative modeling capability on ImageNet, demonstrating the superior potential of MaxViT blocks as a universal vision module.

Image Classification object-detection +1
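The "multi-axis" design behind MaxViT combines attention within local windows (block attention) and across a strided global grid (grid attention). Below is an illustrative sketch of the two tensor partitionings; the window/grid size of 7 is an example value and the attention layers themselves are omitted.

```python
# Sketch of multi-axis partitioning: local block windows vs. a sparse global grid.
import torch

def block_partition(x, p=7):
    # (B, H, W, C) -> (B * H/p * W/p, p*p, C): tokens grouped into local p x p windows.
    B, H, W, C = x.shape
    x = x.view(B, H // p, p, W // p, p, C).permute(0, 1, 3, 2, 4, 5)
    return x.reshape(-1, p * p, C)

def grid_partition(x, g=7):
    # (B, H, W, C) -> (B * H/g * W/g, g*g, C): tokens sampled on a strided global
    # g x g grid, so each attention group spans the whole image.
    B, H, W, C = x.shape
    x = x.view(B, g, H // g, g, W // g, C).permute(0, 2, 4, 1, 3, 5)
    return x.reshape(-1, g * g, C)

feat = torch.randn(2, 56, 56, 64)
local_tokens = block_partition(feat)   # local (block) attention would run on these
global_tokens = grid_partition(feat)   # global (grid) attention would run on these
print(local_tokens.shape, global_tokens.shape)
```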

Perceptual Quality Assessment of UGC Gaming Videos

no code implementations31 Mar 2022 Xiangxu Yu, Zhengzhong Tu, Neil Birkbeck, Yilin Wang, Balu Adsumilli, Alan C. Bovik

In recent years, with the vigorous development of the video game industry, the proportion of gaming videos on major video websites like YouTube has dramatically increased.

Video Quality Assessment Visual Question Answering (VQA)

V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer

2 code implementations20 Mar 2022 Runsheng Xu, Hao Xiang, Zhengzhong Tu, Xin Xia, Ming-Hsuan Yang, Jiaqi Ma

In this paper, we investigate the application of Vehicle-to-Everything (V2X) communication to improve the perception performance of autonomous vehicles.

3D Object Detection Autonomous Vehicles +1

ROMNet: Renovate the Old Memories

no code implementations5 Feb 2022 Runsheng Xu, Zhengzhong Tu, Yuanqi Du, Xiaoyu Dong, Jinlong Li, Zibo Meng, Jiaqi Ma, Hongkai Yu

Renovating the memories in old photos is an intriguing research topic in the computer vision field.

Colorization

MAXIM: Multi-Axis MLP for Image Processing

1 code implementation CVPR 2022 Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

In this work, we present a multi-axis MLP-based architecture called MAXIM that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks.

Deblurring Image Deblurring +6

FAVER: Blind Quality Prediction of Variable Frame Rate Videos

1 code implementation5 Jan 2022 Qi Zheng, Zhengzhong Tu, Pavan C. Madhusudana, Xiaoyang Zeng, Alan C. Bovik, Yibo Fan

Video quality assessment (VQA) remains an important and challenging problem that affects many applications at the widest scales.

Cloud Computing Video Quality Assessment +1

Predicting Eye Fixations Under Distortion Using Bayesian Observers

no code implementations6 Feb 2021 Zhengzhong Tu

Visual attention is an essential factor that affects how humans perceive visual signals.

Blocking

Regression or Classification? New Methods to Evaluate No-Reference Picture and Video Quality Models

no code implementations30 Jan 2021 Zhengzhong Tu, Chia-Ju Chen, Li-Heng Chen, Yilin Wang, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik

Video and image quality assessment has long been projected as a regression problem, which requires predicting a continuous quality score given an input stimulus.

General Classification Image Quality Assessment +2
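As a small illustration of the contrast the title draws, the snippet below evaluates the same set of quality predictions both as a regression (rank correlation with MOS) and as a binary classification of acceptable quality (AUC after thresholding MOS). The synthetic data and the 3.5 threshold are purely illustrative.

```python
# Regression-style vs. classification-style evaluation of a quality predictor.
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
mos = rng.uniform(1.0, 5.0, size=200)            # ground-truth mean opinion scores
pred = mos + rng.normal(0.0, 0.5, size=200)      # a noisy model prediction

srcc = spearmanr(pred, mos).correlation          # regression view: rank correlation
labels = (mos >= 3.5).astype(int)                # binarize: acceptable vs. poor quality
auc = roc_auc_score(labels, pred)                # classification view: ROC-AUC
print(f"SRCC={srcc:.3f}  AUC={auc:.3f}")
```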

RAPIQUE: Rapid and Accurate Video Quality Prediction of User Generated Content

1 code implementation26 Jan 2021 Zhengzhong Tu, Xiangxu Yu, Yilin Wang, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik

However, these models are either incapable of, or inefficient at, predicting the quality of complex and diverse UGC videos in practical applications.

Video Quality Assessment

Adaptive Debanding Filter

1 code implementation22 Sep 2020 Zhengzhong Tu, Jessie Lin, Yilin Wang, Balu Adsumilli, Alan C. Bovik

Banding artifacts, which manifest as staircase-like color bands on pictures or video frames, are a common distortion caused by compression of low-textured smooth regions.

Quantization
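A simplified sketch of the general debanding idea follows: smooth and lightly dither only in flat, banding-prone regions while leaving textured areas untouched. The kernel sizes, flatness threshold, and dither strength are illustrative assumptions, not the paper's tuned filter.

```python
# Content-adaptive smoothing + dithering restricted to flat regions.
import numpy as np
from scipy import ndimage

def deband(frame: np.ndarray, flat_thresh: float = 1.0, dither: float = 0.5) -> np.ndarray:
    """frame: 2-D luma plane with values in [0, 255]."""
    frame = frame.astype(np.float32)
    # Local gradient magnitude separates textured from flat regions.
    gx = ndimage.sobel(frame, axis=1)
    gy = ndimage.sobel(frame, axis=0)
    grad = np.hypot(gx, gy)
    flat_mask = ndimage.uniform_filter(grad, size=9) < flat_thresh

    smoothed = ndimage.uniform_filter(frame, size=9)            # low-pass melts the staircases
    noise = np.random.uniform(-dither, dither, frame.shape)     # dither hides re-quantization
    out = np.where(flat_mask, smoothed + noise, frame)
    return np.clip(out, 0, 255)
```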

UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content

5 code implementations29 May 2020 Zhengzhong Tu, Yilin Wang, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik

Recent years have witnessed an explosion of user-generated content (UGC) videos shared and streamed over the Internet, thanks to the evolution of affordable and reliable consumer capture devices, and the tremendous popularity of social media platforms.

Benchmarking feature selection +2

BBAND Index: A No-Reference Banding Artifact Predictor

no code implementations27 Feb 2020 Zhengzhong Tu, Jessie Lin, Yilin Wang, Balu Adsumilli, Alan C. Bovik

The banding artifact, or false contouring, is a common video compression impairment that tends to appear in large flat regions of encoded videos.

Video Compression
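As a toy illustration of what a banding detector looks for (not the BBAND algorithm itself), the sketch below flags weak luminance steps surrounded by otherwise flat neighborhoods and pools them into a single score; the thresholds are arbitrary example values.

```python
# Toy banding-visibility score: weak steps inside otherwise flat areas.
import numpy as np
from scipy import ndimage

def banding_score(luma: np.ndarray, step_max: float = 4.0, flat_thresh: float = 0.5) -> float:
    luma = luma.astype(np.float32)
    gx = ndimage.sobel(luma, axis=1)
    gy = ndimage.sobel(luma, axis=0)
    grad = np.hypot(gx, gy)
    # Band edges: small but non-zero steps whose surroundings are flat.
    neighborhood_flat = ndimage.median_filter(grad, size=7) < flat_thresh
    band_edges = (grad > 0) & (grad <= step_max) & neighborhood_flat
    return float(band_edges.mean())    # fraction of pixels on visible false contours
```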

Fitness Done Right: a Real-time Intelligent Personal Trainer for Exercise Correction

no code implementations30 Oct 2019 Yun Chen, Yiyue Chen, Zhengzhong Tu

Finally, in the pose error detection stage, key values are computed for the corresponding key features of the two poses, which are then used to generate correction advice.
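A hedged sketch of the kind of pose comparison this describes: compute joint angles from 2-D keypoints for a reference pose and a user pose, then flag joints whose angles differ by more than a tolerance. The joint definitions and the 20-degree tolerance are illustrative assumptions.

```python
# Compare joint angles between a reference pose and a user pose.
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by keypoints a-b-c."""
    v1, v2 = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def pose_corrections(ref_kps, user_kps, joints, tol_deg=20.0):
    """joints: dict mapping a joint name to a (parent, joint, child) keypoint index triple."""
    advice = []
    for name, (i, j, k) in joints.items():
        diff = joint_angle(user_kps[i], user_kps[j], user_kps[k]) - \
               joint_angle(ref_kps[i], ref_kps[j], ref_kps[k])
        if abs(diff) > tol_deg:
            advice.append(f"{name}: adjust by about {diff:+.0f} degrees")
    return advice
```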
