Search Results for author: Siwei Ma

Found 56 papers, 23 papers with code

Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer

no code implementations • 6 Mar 2024 • Naifu Xue, Qi Mao, Zijian Wang, Yuan Zhang, Siwei Ma

Recent progress in generative compression technology has significantly improved the perceptual quality of compressed data.

Image Generation

SPC-NeRF: Spatial Predictive Compression for Voxel Based Radiance Field

no code implementations • 26 Feb 2024 • Zetian Song, Wenhong Duan, Yuhuai Zhang, Shiqi Wang, Siwei Ma, Wen Gao

Representing the Neural Radiance Field (NeRF) with the explicit voxel grid (EVG) is a promising direction for improving NeRFs.

Image Compression Neural Network Compression +1

A Neural-network Enhanced Video Coding Framework beyond ECM

no code implementations • 13 Feb 2024 • Yanchen Zhao, Wenxuan He, Chuanmin Jia, Qizhe Wang, Junru Li, Yue Li, Chaoyi Lin, Kai Zhang, Li Zhang, Siwei Ma

In this paper, a hybrid video compression framework is proposed that serves as a demonstrative showcase of deep learning-based approaches extending beyond the confines of traditional coding methodologies.

Video Compression

Scalable Face Image Coding via StyleGAN Prior: Towards Compression for Human-Machine Collaborative Vision

no code implementations • 25 Dec 2023 • Qi Mao, Chongyu Wang, Meng Wang, Shiqi Wang, Ruijie Chen, Libiao Jin, Siwei Ma

The accelerated proliferation of visual content and the rapid development of machine vision technologies bring significant challenges in delivering visual data on a gigantic scale, which shall be effectively represented to satisfy both human and machine requirements.

Image Compression

Spatial-Temporal Transformer based Video Compression Framework

no code implementations • 21 Sep 2023 • Yanbo Gao, Wenjia Huang, Shuai Li, Hui Yuan, Mao Ye, Siwei Ma

Similar to traditional video coding, LVC inherits motion estimation/compensation, residual coding, and other modules, all of which are implemented with neural networks (NNs).

Motion Estimation Video Compression

Extreme Image Compression using Fine-tuned VQGANs

no code implementations • 17 Jul 2023 • Qi Mao, Tinghan Yang, Yinuo Zhang, Zijian Wang, Meng Wang, Shiqi Wang, Siwei Ma

Remarkably, even with the loss of up to 20% of indices, the images can be effectively restored with minimal perceptual loss.

Image Compression Quantization

SpikeCodec: An End-to-end Learned Compression Framework for Spiking Camera

no code implementations • 25 Jun 2023 • Kexiang Feng, Chuanmin Jia, Siwei Ma, Wen Gao

Recently, the bio-inspired spike camera with continuous motion recording capability has attracted tremendous attention due to its ultra high temporal resolution imaging characteristic.

Data Compression

Optimization-Inspired Cross-Attention Transformer for Compressive Sensing

1 code implementation • CVPR 2023 • Jiechong Song, Chong Mou, Shiqi Wang, Siwei Ma, Jian Zhang

In addition, the PGCA block achieves enhanced information interaction by introducing an inertia force into the gradient descent step through a cross-attention block.

Compressive Sensing

Machine Perception-Driven Image Compression: A Layered Generative Approach

no code implementations • 14 Apr 2023 • Yuefeng Zhang, Chuanmin Jia, Jianhui Chang, Siwei Ma

In this age of information, images are a critical medium for storing and transmitting information.

Image Compression

Learning to Compress Unmanned Aerial Vehicle (UAV) Captured Video: Benchmark and Analysis

no code implementations • 15 Jan 2023 • Chuanmin Jia, Feng Ye, Huifang Sun, Siwei Ma, Wen Gao

During the past decade, unmanned aerial vehicles (UAVs) have attracted increasing attention due to their flexible, extensive, and dynamic space-sensing capabilities.

Video Compression

Cross Modal Compression: Towards Human-comprehensible Semantic Compression

no code implementations • 6 Sep 2022 • Jiguo Li, Chuanmin Jia, Xinfeng Zhang, Siwei Ma, Wen Gao

With the recent advances in cross modal translation and generation, in this paper, we propose cross modal compression (CMC), a semantic compression framework for visual data, to transform the highly redundant visual data (such as image, video, etc.)

Feature Compression Video Compression

Towards Hybrid-Optimization Video Coding

no code implementations • 12 Jul 2022 • Shuai Huo, Dong Liu, Li Li, Siwei Ma, Feng Wu, Wen Gao

Our idea is to provide multiple discrete starting points in the global space and to efficiently find the local optimum around each point with a numerical algorithm.

STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction

1 code implementation • 9 Jun 2022 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

To solve the information loss problem, the proposed model aims to preserve the spatiotemporal information for videos during the feature extraction and the state transitions, respectively.

Video Prediction

Hierarchical Similarity Learning for Aliasing Suppression Image Super-Resolution

no code implementations • 7 Jun 2022 • Yuqing Liu, Qi Jia, Jian Zhang, Xin Fan, Shanshe Wang, Siwei Ma, Wen Gao

As a highly ill-posed issue, single image super-resolution (SISR) has been widely investigated in recent years.

Image Super-Resolution

Learning Weighting Map for Bit-Depth Expansion within a Rational Range

1 code implementation • 26 Apr 2022 • Yuqing Liu, Qi Jia, Jian Zhang, Xin Fan, Shanshe Wang, Siwei Ma, Wen Gao

Existing BDE methods have no unified solution for various BDE situations, and directly learn a mapping for each pixel from LBD image to the desired value in HBD image, which may change the given high-order bits and lead to a huge deviation from the ground truth.

SSIM

STAU: A SpatioTemporal-Aware Unit for Video Prediction and Beyond

no code implementations • 20 Apr 2022 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In this paper, we propose a SpatioTemporal-Aware Unit (STAU) for video prediction and beyond by exploring the significant spatiotemporal correlations in videos.

Action Recognition object-detection +2

Cross-SRN: Structure-Preserving Super-Resolution Network with Cross Convolution

no code implementations • 5 Jan 2022 • Yuqing Liu, Qi Jia, Xin Fan, Shanshe Wang, Siwei Ma, Wen Gao

It is challenging to restore low-resolution (LR) images to super-resolution (SR) images with correct and clear details.

Super-Resolution

Instance-Aware Dynamic Neural Network Quantization

4 code implementations • CVPR 2022 • Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao

However, natural images are of huge diversity with abundant content and using such a universal quantization configuration for all samples is not an optimal strategy.

Quantization

MAU: A Motion-Aware Unit for Video Prediction and Beyond

1 code implementation • NeurIPS 2021 • Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Yan Ye, Xiang Xinguang, Wen Gao

The attention module aims to learn an attention map based on the correlations between the current spatial state and the historical spatial states.

Action Recognition Video Prediction

Rethinking Lightweight Convolutional Neural Networks for Efficient and High-quality Pavement Crack Detection

2 code implementations • 13 Sep 2021 • Kai Li, Jie Yang, Siwei Ma, Bo Wang, Shanshe Wang, Yingjie Tian, Zhiquan Qi

For the second issue, we reconsider how to improve detection efficiency with excellent performance, and then propose our lightweight encoder-decoder architecture termed CarNet.

COAST: COntrollable Arbitrary-Sampling NeTwork for Compressive Sensing

1 code implementation • 15 Jul 2021 • Di You, Jian Zhang, Jingfen Xie, Bin Chen, Siwei Ma

In this paper, we propose a novel COntrollable Arbitrary-Sampling neTwork, dubbed COAST, to solve CS problems of arbitrary-sampling matrices (including unseen sampling matrices) with one single model.

Blocking Compressive Sensing

Post-Training Quantization for Vision Transformer

no code implementations • NeurIPS 2021 • Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao

Recently, transformers have achieved remarkable performance on a variety of computer vision applications.

Quantization

Rate Distortion Characteristic Modeling for Neural Image Compression

no code implementations • 24 Jun 2021 • Chuanmin Jia, Ziqing Ge, Shanshe Wang, Siwei Ma, Wen Gao

End-to-end optimized neural image compression (NIC) has obtained superior lossy compression performance recently.

Image Compression

Visual Analysis Motivated Rate-Distortion Model for Image Coding

no code implementations • 21 Apr 2021 • Zhimeng Huang, Chuanmin Jia, Shanshe Wang, Siwei Ma

We first propose the region of interest for machine (ROIM) to evaluate the degree of importance for each coding tree unit (CTU) in visual analysis.

Image Classification object-detection +2

Thousand to One: Semantic Prior Modeling for Conceptual Coding

no code implementations • 12 Mar 2021 • Jianhui Chang, Zhenghui Zhao, Lingbo Yang, Chuanmin Jia, Jian Zhang, Siwei Ma

To this end, we propose a novel end-to-end semantic prior modeling-based conceptual coding scheme towards extremely low bitrate image compression, which leverages semantic-wise deep representations as a unified prior for entropy estimation and texture synthesis.

Image Compression Semantic Segmentation +1

Intrinsic Temporal Regularization for High-resolution Human Video Synthesis

no code implementations • 11 Dec 2020 • Lingbo Yang, Zhanning Gao, Peiran Ren, Siwei Ma, Wen Gao

Temporal consistency is crucial for extending image processing pipelines to the video domain, which is often enforced with flow-based warping error over adjacent frames.

Motion Estimation Vocal Bursts Intensity Prediction

Pre-Trained Image Processing Transformer

6 code implementations • CVPR 2021 • Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao

To fully exploit the capability of the transformer, we propose using the well-known ImageNet benchmark to generate a large number of corrupted image pairs.

 Ranked #1 on Single Image Deraining on Rain100L (using extra training data)

Color Image Denoising Contrastive Learning +2

Conceptual Compression via Deep Structure and Texture Synthesis

2 code implementations • 10 Nov 2020 • Jianhui Chang, Zhenghui Zhao, Chuanmin Jia, Shiqi Wang, Lingbo Yang, Qi Mao, Jian Zhang, Siwei Ma

To this end, we propose a novel conceptual compression framework that encodes visual data into compact structure and texture representations, then decodes in a deep synthesis fashion, aiming to achieve better visual reconstruction quality, flexible content manipulation, and potential support for various vision tasks.

Texture Synthesis

Continuous and Diverse Image-to-Image Translation via Signed Attribute Vectors

1 code implementation • 2 Nov 2020 • Qi Mao, Hung-Yu Tseng, Hsin-Ying Lee, Jia-Bin Huang, Siwei Ma, Ming-Hsuan Yang

Generating a smooth sequence of intermediate results bridges the gap of two different domains, facilitating the morphing effect across domains.

Attribute Image-to-Image Translation +1

Implicit Subspace Prior Learning for Dual-Blind Face Restoration

1 code implementation • 12 Oct 2020 • Lingbo Yang, Pan Wang, Zhanning Gao, Shanshe Wang, Peiran Ren, Siwei Ma, Wen Gao

Face restoration is an inherently ill-posed problem, where additional prior constraints are typically considered crucial for mitigating such pathology.

Blind Face Restoration

A Similarity Inference Metric for RGB-Infrared Cross-Modality Person Re-identification

no code implementations • 3 Jul 2020 • Mengxi Jia, Yunpeng Zhai, Shijian Lu, Siwei Ma, Jian Zhang

RGB-Infrared (IR) cross-modality person re-identification (re-ID), which aims to search an IR image in RGB gallery or vice versa, is a challenging task due to the large discrepancy between IR and RGB modalities.

Cross-Modality Person Re-identification Person Re-Identification

Towards Fine-grained Human Pose Transfer with Detail Replenishing Network

no code implementations • 26 May 2020 • Lingbo Yang, Pan Wang, Chang Liu, Zhanning Gao, Peiran Ren, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Xian-Sheng Hua, Wen Gao

Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality.

Pose Transfer Retrieval

Iterative Network for Image Super-Resolution

1 code implementation • 20 May 2020 • Yuqing Liu, Shiqi Wang, Jian Zhang, Shanshe Wang, Siwei Ma, Wen Gao

A novel iterative super-resolution network (ISRN) is proposed on top of the iterative optimization.

Image Super-Resolution SSIM

HiFaceGAN: Face Renovation via Collaborative Suppression and Replenishment

5 code implementations • 11 May 2020 • Lingbo Yang, Chang Liu, Pan Wang, Shanshe Wang, Peiran Ren, Siwei Ma, Wen Gao

Existing face restoration research typically relies on either the degradation prior or explicit guidance labels for training, which often results in limited generalization ability over real-world images with heterogeneous degradations and rich background contents.

Blind Face Restoration Face Hallucination +3

Towards Analysis-friendly Face Representation with Scalable Feature and Texture Compression

no code implementations • 21 Apr 2020 • Shurun Wang, Shiqi Wang, Wenhan Yang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In particular, we study feature and texture compression in a scalable coding framework, where the base layer serves as the deep learning feature and the enhancement layer aims to perfectly reconstruct the texture.

Image Compression

Learning to fool the speaker recognition

1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

Due to the widespread deployment of fingerprint/face/speaker recognition systems, attacking deep learning based biometric systems has drawn more and more attention.

Audio and Speech Processing Cryptography and Security Sound

Direct Speech-to-image Translation

1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

In this paper, we attempt to translate the speech signals into the image signals without the transcription stage.

Multimedia Sound Audio and Speech Processing

Universal Adversarial Perturbations Generative Network for Speaker Recognition

1 code implementation • 7 Apr 2020 • Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

Attacking deep learning based biometric systems has drawn more and more attention with the wide deployment of fingerprint/face/speaker recognition systems, given that neural networks are vulnerable to adversarial examples, which have been intentionally perturbed to remain almost imperceptible to humans.

Speaker Recognition

Knowledge Transfer via Student-Teacher Collaboration

no code implementations • 25 Sep 2019 • Tianxiao Gao, Ruiqin Xiong, Zhenhua Liu, Siwei Ma, Feng Wu, Tiejun Huang, Wen Gao

One way to compress these heavy models is knowledge transfer (KT), in which a light student network is trained through absorbing the knowledge from a powerful teacher network.

Transfer Learning

Masked Non-Autoregressive Image Captioning

no code implementations • 3 Jun 2019 • Junlong Gao, Xi Meng, Shiqi Wang, Xia Li, Shanshe Wang, Siwei Ma, Wen Gao

Existing captioning models often adopt the encoder-decoder architecture, where the decoder uses autoregressive decoding to generate captions, such that each token is generated sequentially given the preceding generated tokens.

Image Captioning Machine Translation +1

Reconstruction of Natural Visual Scenes from Neural Spikes with Deep Neural Networks

no code implementations • 30 Apr 2019 • Yichen Zhang, Shanshan Jia, Yajing Zheng, Zhaofei Yu, Yonghong Tian, Siwei Ma, Tiejun Huang, Jian K. Liu

The SID is an end-to-end decoder with one end as neural spikes and the other end as images, which can be trained directly such that visual scenes are reconstructed from spikes in a highly accurate fashion.

Self-critical n-step Training for Image Captioning

no code implementations • CVPR 2019 • Junlong Gao, Shiqi Wang, Shanshe Wang, Siwei Ma, Wen Gao

Existing methods for image captioning are usually trained by cross entropy loss, which leads to exposure bias and the inconsistency between the optimizing function and evaluation metrics.

Image Captioning

Image and Video Compression with Neural Networks: A Review

no code implementations • 7 Apr 2019 • Siwei Ma, Xinfeng Zhang, Chuanmin Jia, Zhenghui Zhao, Shiqi Wang, Shanshe Wang

The deep convolutional neural network (CNN), which has driven the resurgence of neural networks in recent years and achieved great success in both artificial intelligence and signal processing, also provides a novel and promising solution for image and video compression.

Video Compression

Scalable Facial Image Compression with Deep Feature Reconstruction

no code implementations • 14 Mar 2019 • Shurun Wang, Shiqi Wang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In this paper, we propose a scalable image compression scheme, including the base layer for feature representation and enhancement layer for texture representation.

Image Compression

A Group Variational Transformation Neural Network for Fractional Interpolation of Video Coding

no code implementations • 19 Jun 2018 • Sifeng Xia, Wenhan Yang, Yueyu Hu, Siwei Ma, Jiaying Liu

Then a group variational transformation technique is used to transform a group of copied shared feature maps to samples at different sub-pixel positions.

Multimedia

Spatial-Temporal Residue Network Based In-Loop Filter for Video Coding

no code implementations • 25 Sep 2017 • Chuanmin Jia, Shiqi Wang, Xinfeng Zhang, Shanshe Wang, Siwei Ma

Deep learning has demonstrated tremendous breakthroughs in the area of image/video processing.

Multimedia

Globally Variance-Constrained Sparse Representation and Its Application in Image Set Coding

no code implementations • 17 Aug 2016 • Xiang Zhang, Jiarui Sun, Siwei Ma, Zhouchen Lin, Jian Zhang, Shiqi Wang, Wen Gao

Therefore, introducing an accurate rate-constraint in sparse coding and dictionary learning becomes meaningful, which has not been fully exploited in the context of sparse representation.

Data Compression Dictionary Learning

Image Restoration Using Joint Statistical Modeling in Space-Transform Domain

no code implementations • 11 May 2014 • Jian Zhang, Debin Zhao, Ruiqin Xiong, Siwei Ma, Wen Gao

This paper presents a novel strategy for high-fidelity image restoration by characterizing both local smoothness and nonlocal self-similarity of natural images in a unified statistical manner.

Deblurring Image Deblurring +3
