no code implementations • 24 Apr 2024 • Marcos V. Conde, Saman Zadtootaghaj, Nabajeet Barman, Radu Timofte, Chenlong He, Qi Zheng, Ruoxi Zhu, Zhengzhong Tu, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, ZiCheng Zhang, HaoNing Wu, Yingjie Zhou, Chunyi Li, Xiaohong Liu, Weisi Lin, Guangtao Zhai, Wei Sun, Yuqin Cao, Yanwei Jiang, Jun Jia, Zhichao Zhang, Zijian Chen, Weixia Zhang, Xiongkuo Min, Steve Göring, Zihao Qi, Chen Feng
The performance of the top-5 submissions is reviewed and provided here as a survey of diverse deep models for efficient video quality assessment of user-generated content.
no code implementations • 7 Apr 2024 • Jinlong Li, Baolu Li, Zhengzhong Tu, Xinyu Liu, Qing Guo, Felix Juefei-Xu, Runsheng Xu, Hongkai Yu
Vision-centric perception systems for autonomous driving have gained considerable attention recently due to their cost-effectiveness and scalability, especially compared to LiDAR-based systems.
no code implementations • 1 Apr 2024 • Kangfu Mei, Zhengzhong Tu, Mauricio Delbracio, Hossein Talebi, Vishal M. Patel, Peyman Milanfar
We study the scaling properties of latent diffusion models (LDMs) with an emphasis on their sampling efficiency.
no code implementations • 17 Mar 2024 • Baolu Li, Jinlong Li, Xinyu Liu, Runsheng Xu, Zhengzhong Tu, Jiacheng Guo, Xiaopeng Li, Hongkai Yu
Current LiDAR-based Vehicle-to-Everything (V2X) multi-agent perception systems have shown significant success in 3D object detection.
no code implementations • 18 Dec 2023 • Chenyang Qi, Zhengzhong Tu, Keren Ye, Mauricio Delbracio, Peyman Milanfar, Qifeng Chen, Hossein Talebi
Text-driven diffusion models have become increasingly popular for various image editing tasks, including inpainting, stylization, and object replacement.
1 code implementation • 2 Oct 2023 • Kangfu Mei, Mauricio Delbracio, Hossein Talebi, Zhengzhong Tu, Vishal M. Patel, Peyman Milanfar
Our conditional-task learning and distillation approach outperforms previous distillation methods, achieving a new state of the art in producing high-quality images in very few steps (e.g., 1-4) across multiple tasks, including super-resolution, text-guided image editing, and depth-to-image generation.
1 code implementation • ICCV 2023 • Zhengzhong Tu, Peyman Milanfar, Hossein Talebi
Specifically, we select a state-of-the-art vision Transformer, MaxViT, as the baseline and show that, when trained with MULLER, MaxViT gains up to 0.6% top-1 accuracy on ImageNet-1k, while also enjoying a 36% inference cost saving at similar top-1 accuracy, compared to the standard training scheme.
1 code implementation • CVPR 2023 • Runsheng Xu, Xin Xia, Jinlong Li, Hanzhao Li, Shuo Zhang, Zhengzhong Tu, Zonglin Meng, Hao Xiang, Xiaoyu Dong, Rui Song, Hongkai Yu, Bolei Zhou, Jiaqi Ma
To facilitate the development of cooperative perception, we present V2V4Real, the first large-scale real-world multi-modal dataset for V2V perception.
2 code implementations • 5 Jul 2022 • Runsheng Xu, Zhengzhong Tu, Hao Xiang, Wei Shao, Bolei Zhou, Jiaqi Ma
The extensive experiments on the V2V perception dataset, OPV2V, demonstrate that CoBEVT achieves state-of-the-art performance for cooperative BEV semantic segmentation.
1 code implementation • 4 May 2022 • Runsheng Xu, Zhengzhong Tu, Yuanqi Du, Xiaoyu Dong, Jinlong Li, Zibo Meng, Jiaqi Ma, Alan Bovik, Hongkai Yu
Our proposed framework consists of three modules: a restoration sub-network that conducts restoration from degradations, a similarity network that performs color histogram matching and color transfer, and a colorization subnet that learns to predict the chroma elements of images conditioned on chromatic reference signals.
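The similarity module's color histogram matching step can be illustrated with a minimal, single-channel sketch (the function names and the pure-Python, flat-list image representation here are illustrative assumptions, not the paper's implementation, which operates on full chroma channels):

```python
def match_histogram(source, reference, levels=256):
    """Remap source channel values so their cumulative histogram (CDF)
    matches the reference channel's CDF -- the core idea behind color
    histogram matching between a degraded photo and a reference image."""
    def cdf(values):
        hist = [0] * levels
        for v in values:
            hist[v] += 1
        total, acc, out = len(values), 0, []
        for count in hist:
            acc += count
            out.append(acc / total)
        return out

    s_cdf, r_cdf = cdf(source), cdf(reference)
    # Build a lookup table: each source level maps to the reference level
    # whose CDF value is closest to the source level's CDF value.
    lut = [min(range(levels), key=lambda r: abs(r_cdf[r] - s_cdf[lv]))
           for lv in range(levels)]
    return [lut[v] for v in source]
```

For example, a dark two-tone source matched against a brighter reference is remapped onto the reference's intensity levels while preserving its structure.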
14 code implementations • 4 Apr 2022 • Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li
We also show that our proposed model expresses strong generative modeling capability on ImageNet, demonstrating the superior potential of MaxViT blocks as a universal vision module.
Ranked #1 on Object Detection on COCO 2017
no code implementations • 31 Mar 2022 • Xiangxu Yu, Zhengzhong Tu, Neil Birkbeck, Yilin Wang, Balu Adsumilli, Alan C. Bovik
In recent years, with the vigorous development of the video game industry, the proportion of gaming videos on major video websites like YouTube has dramatically increased.
2 code implementations • 20 Mar 2022 • Runsheng Xu, Hao Xiang, Zhengzhong Tu, Xin Xia, Ming-Hsuan Yang, Jiaqi Ma
In this paper, we investigate the application of Vehicle-to-Everything (V2X) communication to improve the perception performance of autonomous vehicles.
Ranked #1 on 3D Object Detection on V2XSet
no code implementations • 5 Feb 2022 • Runsheng Xu, Zhengzhong Tu, Yuanqi Du, Xiaoyu Dong, Jinlong Li, Zibo Meng, Jiaqi Ma, Hongkai Yu
Renovating the memories in old photos is an intriguing research topic in computer vision.
1 code implementation • CVPR 2022 • Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li
In this work, we present a multi-axis MLP based architecture called MAXIM, that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks.
Ranked #1 on Deblurring on HIDE (trained on GOPRO)
1 code implementation • 5 Jan 2022 • Qi Zheng, Zhengzhong Tu, Pavan C. Madhusudana, Xiaoyang Zeng, Alan C. Bovik, Yibo Fan
Video quality assessment (VQA) remains an important and challenging problem that affects many applications at the widest scales.
no code implementations • 6 Feb 2021 • Zhengzhong Tu
Visual attention is an essential factor that affects how humans perceive visual signals.
no code implementations • 30 Jan 2021 • Zhengzhong Tu, Chia-Ju Chen, Li-Heng Chen, Yilin Wang, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik
Video and image quality assessment has long been cast as a regression problem, which requires predicting a continuous quality score given an input stimulus.
1 code implementation • 26 Jan 2021 • Zhengzhong Tu, Xiangxu Yu, Yilin Wang, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik
However, these models are either incapable of, or inefficient at, predicting the quality of complex and diverse UGC videos in practical applications.
Ranked #4 on Video Quality Assessment on LIVE Livestream
1 code implementation • 22 Sep 2020 • Zhengzhong Tu, Jessie Lin, Yilin Wang, Balu Adsumilli, Alan C. Bovik
Banding artifacts, which manifest as staircase-like color bands on pictures or video frames, are a common distortion caused by the compression of low-textured smooth regions.
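The staircase signature described above can be illustrated with a minimal 1-D sketch: a banding edge is a small intensity step separating two flat runs of pixels. This toy detector (names and thresholds are illustrative assumptions, not the paper's BAND-2k/BBAND method) scans a single row of pixel values:

```python
def detect_band_edges(scanline, max_step=4, min_run=3):
    """Return indices i where scanline[i-1] -> scanline[i] is a small
    nonzero step (<= max_step) flanked on both sides by flat runs of
    min_run equal samples -- the staircase signature of banding.
    Large steps are treated as true edges, not banding."""
    edges = []
    for i in range(min_run, len(scanline) - min_run + 1):
        step = abs(scanline[i] - scanline[i - 1])
        if 0 < step <= max_step:
            left = scanline[i - min_run:i]
            right = scanline[i:i + min_run]
            # Both neighborhoods must be perfectly flat (single value each).
            if len(set(left)) == 1 and len(set(right)) == 1:
                edges.append(i)
    return edges
```

On a scanline of three flat plateaus, only the small 10-to-12 step is flagged; the large 12-to-30 step is treated as a genuine edge.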
5 code implementations • 29 May 2020 • Zhengzhong Tu, Yilin Wang, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik
Recent years have witnessed an explosion of user-generated content (UGC) videos shared and streamed over the Internet, thanks to the evolution of affordable and reliable consumer capture devices, and the tremendous popularity of social media platforms.
Ranked #11 on Video Quality Assessment on YouTube-UGC
no code implementations • 27 Feb 2020 • Zhengzhong Tu, Jessie Lin, Yilin Wang, Balu Adsumilli, Alan C. Bovik
Banding artifact, or false contouring, is a common video compression impairment that tends to appear on large flat regions in encoded videos.
no code implementations • 25 Feb 2020 • Zhengzhong Tu, Chia-Ju Chen, Li-Heng Chen, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik
Many objective video quality assessment (VQA) algorithms include a key step of temporal pooling of frame-level quality scores.
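The temporal pooling step mentioned above can be sketched with three common strategies (a minimal pure-Python illustration; the function names are assumptions, and the paper compares a broader set of pooling models):

```python
def mean_pool(scores):
    """Arithmetic-mean pooling: treats every frame's quality equally."""
    return sum(scores) / len(scores)

def harmonic_pool(scores, eps=1e-8):
    """Harmonic-mean pooling: weights low-quality frames more heavily,
    reflecting that brief severe degradations dominate perceived quality."""
    return len(scores) / sum(1.0 / (s + eps) for s in scores)

def percentile_pool(scores, p=0.25):
    """Worst-case pooling: average only the lowest p-fraction of frame scores."""
    k = max(1, int(len(scores) * p))
    return sum(sorted(scores)[:k]) / k
```

For a clip with one badly degraded frame, e.g. scores [80, 90, 20, 85], mean pooling yields 68.75 while harmonic pooling drops to roughly 46.9, showing how the choice of pooling changes the predicted quality.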
no code implementations • 30 Oct 2019 • Yun Chen, Yiyue Chen, Zhengzhong Tu
Finally, in the pose error detection stage, key values are computed for the key features of the two poses, which helps generate correction advice.