Search Results for author: Xinfeng Zhang

Found 27 papers, 14 papers with code

NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results

1 code implementation • 15 Apr 2024 • Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, Hongyu An, Xinfeng Zhang, Zhiyuan Song, Ziyue Dong, Qing Zhao, Xiaogang Xu, Pengxu Wei, Zhi-chao Dou, Gui-ling Wang, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Cansu Korkmaz, A. Murat Tekalp, Yubin Wei, Xiaole Yan, Binren Li, Haonan Chen, Siqi Zhang, Sihan Chen, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi, Anjali Sarvaiya, Pooja Choksy, Jagrit Joshi, Shubh Kawa, Kishor Upla, Sushrut Patwardhan, Raghavendra Ramachandra, Sadat Hossain, Geongi Park, S. M. Nadim Uddin, Hao Xu, Yanhui Guo, Aman Urumbekov, Xingzhuo Yan, Wei Hao, Minghan Fu, Isaac Orais, Samuel Smith, Ying Liu, Wangwang Jia, Qisheng Xu, Kele Xu, Weijun Yuan, Zhan Li, Wenqin Kuang, Ruijin Guan, Ruting Deng, Zhao Zhang, Bo wang, Suiyi Zhao, Yan Luo, Yanyan Wei, Asif Hussain Khan, Christian Micheloni, Niki Martinel

This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained.

Image Super-Resolution valid


Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations

2 code implementations • 6 May 2023 • Yufeng Huang, Jiji Tang, Zhuo Chen, Rongsheng Zhang, Xinfeng Zhang, WeiJie Chen, Zeng Zhao, Zhou Zhao, Tangjie Lv, Zhipeng Hu, Wen Zhang

In this paper, we present Structure-CLIP, an end-to-end framework that integrates Scene Graph Knowledge (SGK) to enhance multi-modal structured representations.

Image-text matching, Text Matching


Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation

1 code implementation • ICCV 2023 • Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Zhao Wang, Kai Han, Shanshe Wang, Siwei Ma, Wen Gao

On the other hand, JPMA is proposed to assemble multiple hypotheses generated by D3DP into a single 3D pose for practical use.

Ranked #2 on Multi-Hypotheses 3D Human Pose Estimation on Human3.6M

3D Pose Estimation, Monocular 3D Human Pose Estimation, +1



Scene Matters: Model-based Deep Video Compression

no code implementations • ICCV 2023 • Lv Tang, Xinfeng Zhang, Gai Zhang, Xiaoqi Ma

Video compression has always been a popular research area, where many traditional and deep video compression methods have been proposed.

Video Compression


Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling

1 code implementation • 13 Nov 2022 • Qi Zhang, Shanshe Wang, Xinfeng Zhang, Chuanmin Jia, Zhao Wang, Siwei Ma, Wen Gao

Each score is derived from machine perceptual differences between original and compressed images.

Image Classification, object-detection, +3


Cross Modal Compression: Towards Human-comprehensible Semantic Compression

no code implementations • 6 Sep 2022 • Jiguo Li, Chuanmin Jia, Xinfeng Zhang, Siwei Ma, Wen Gao

With the recent advances in cross modal translation and generation, in this paper, we propose cross modal compression (CMC), a semantic compression framework for visual data, to transform the highly redundant visual data (such as image, video, etc.)

Feature Compression, Video Compression


STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction

1 code implementation • 9 Jun 2022 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

To solve the information loss problem, the proposed model aims to preserve the spatiotemporal information of videos during both feature extraction and state transitions.

Video Prediction


STAU: A SpatioTemporal-Aware Unit for Video Prediction and Beyond

no code implementations • 20 Apr 2022 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In this paper, we propose a SpatioTemporal-Aware Unit (STAU) for video prediction and beyond by exploring the significant spatiotemporal correlations in videos.

Action Recognition, object-detection, +2


STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution Video Prediction

1 code implementation • CVPR 2022 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In this paper, we propose a Spatiotemporal Residual Predictive Model (STRPM) for high-resolution video prediction.

4k, Video Prediction, +1


P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation

1 code implementation • 15 Mar 2022 • Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In Stage II, the pre-trained encoder is loaded into the STMO model and fine-tuned.

Ranked #10 on Monocular 3D Human Pose Estimation on Human3.6M

Denoising, Monocular 3D Human Pose Estimation



MAU: A Motion-Aware Unit for Video Prediction and Beyond

1 code implementation • NeurIPS 2021 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Yan Ye, Xiang Xinguang, Wen Gao

The attention module aims to learn an attention map based on the correlations between the current spatial state and the historical spatial states.

Ranked #18 on Video Prediction on Moving MNIST

Action Recognition, Video Prediction


Improving Robustness and Accuracy via Relative Information Encoding in 3D Human Pose Estimation

1 code implementation • 29 Jul 2021 • Wenkang Shan, Haopeng Lu, Shanshe Wang, Xinfeng Zhang, Wen Gao

To alleviate these two problems, we propose a relative information encoding method that yields positionally and temporally enhanced representations.

Ranked #13 on Monocular 3D Human Pose Estimation on Human3.6M

Monocular 3D Human Pose Estimation


Sequential Hierarchical Learning with Distribution Transformation for Image Super-Resolution

no code implementations • 19 Jul 2020 • Yuqing Liu, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

Based on the observation, in this paper, we build a sequential hierarchical learning super-resolution network (SHSR) for effective image SR.

Ranked #9 on Image Super-Resolution on Manga109 - 3x upscaling

Image Restoration, Image Super-Resolution, +1


Region-adaptive Texture Enhancement for Detailed Person Image Synthesis

1 code implementation • 26 May 2020 • Lingbo Yang, Pan Wang, Xinfeng Zhang, Shanshe Wang, Zhanning Gao, Peiran Ren, Xuansong Xie, Siwei Ma, Wen Gao

The ability to produce convincing textural details is essential for the fidelity of synthesized person images.

Ranked #4 on Pose Transfer on Deep-Fashion

Pose Transfer


Towards Fine-grained Human Pose Transfer with Detail Replenishing Network

no code implementations • 26 May 2020 • Lingbo Yang, Pan Wang, Chang Liu, Zhanning Gao, Peiran Ren, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Xian-Sheng Hua, Wen Gao

Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality.

Pose Transfer, Retrieval


Towards Analysis-friendly Face Representation with Scalable Feature and Texture Compression

no code implementations • 21 Apr 2020 • Shurun Wang, Shiqi Wang, Wenhan Yang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In particular, we study feature and texture compression in a scalable coding framework, where the base layer serves as the deep learning feature and the enhancement layer aims to perfectly reconstruct the texture.

Image Compression


Universal Adversarial Perturbations Generative Network for Speaker Recognition

1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

Attacks on deep-learning-based biometric systems have drawn growing attention with the wide deployment of fingerprint, face, and speaker recognition systems, because neural networks are vulnerable to adversarial examples: inputs that have been intentionally perturbed while remaining almost imperceptible to humans.

Speaker Recognition


Direct Speech-to-image Translation

1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

In this paper, we attempt to translate speech signals into image signals without a transcription stage.

Multimedia, Sound, Audio and Speech Processing


Learning to fool the speaker recognition

1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

Due to the widespread deployment of fingerprint/face/speaker recognition systems, attacking deep learning based biometric systems has drawn more and more attention.

Audio and Speech Processing, Cryptography and Security, Sound


A Modified Perturbed Sampling Method for Local Interpretable Model-agnostic Explanation

no code implementations • 18 Feb 2020 • Sheng Shi, Xinfeng Zhang, Wei Fan

Explainability is a gateway between Artificial Intelligence and society as the current popular deep learning models are generally weak in explaining the reasoning process and prediction results.

Image Classification


Explaining the Predictions of Any Image Classifier via Decision Trees

no code implementations • 4 Nov 2019 • Sheng Shi, Xinfeng Zhang, Wei Fan

Despite their outstanding contribution to the progress of Artificial Intelligence (AI), deep learning models remain mostly black boxes that offer little explanation of their reasoning process or prediction results.


PgNN: Physics-guided Neural Network for Fourier Ptychographic Microscopy

no code implementations • 19 Sep 2019 • Yongbing Zhang, Yangzhe Liu, Xiu Li, Shaowei Jiang, Krishna Dixit, Xinfeng Zhang, Xiangyang Ji

Since the optimal parameters of the PgNN can be derived by minimizing the difference between the model-generated images and the real captured angle-varied images of the same scene, the proposed PgNN avoids the need for the massive training data that traditional supervised methods require.


Cascaded Parallel Filtering for Memory-Efficient Image-Based Localization

1 code implementation • ICCV 2019 • Wentao Cheng, Weisi Lin, Kan Chen, Xinfeng Zhang

Image-based localization (IBL) aims to estimate the 6DOF camera pose for a given query image.

Image-Based Localization, Pose Estimation


Image and Video Compression with Neural Networks: A Review

no code implementations • 7 Apr 2019 • Siwei Ma, Xinfeng Zhang, Chuanmin Jia, Zhenghui Zhao, Shiqi Wang, Shanshe Wang

The deep convolutional neural network (CNN), which has driven the resurgence of neural networks in recent years and has achieved great success in both artificial intelligence and signal processing, also provides a novel and promising solution for image and video compression.

Video Compression


Scalable Facial Image Compression with Deep Feature Reconstruction

no code implementations • 14 Mar 2019 • Shurun Wang, Shiqi Wang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In this paper, we propose a scalable image compression scheme comprising a base layer for feature representation and an enhancement layer for texture representation.

Image Compression


Anomaly Detection and Localization in Crowded Scenes by Motion-field Shape Description and Similarity-based Statistical Learning

no code implementations • 27 May 2018 • Xinfeng Zhang, Su Yang, Xinjian Zhang, Weishan Zhang, Jiulong Zhang

In crowded scenes, detection and localization of abnormal behaviors are challenging because the high density of people makes object segmentation and tracking extremely difficult.

Anomaly Detection, Clustering, +1


Spatial-Temporal Residue Network Based In-Loop Filter for Video Coding

no code implementations • 25 Sep 2017 • Chuanmin Jia, Shiqi Wang, Xinfeng Zhang, Shanshe Wang, Siwei Ma

Deep learning has demonstrated tremendous breakthroughs in the area of image/video processing.

Multimedia

