Search Results for author: Zheng-Jun Zha

Found 127 papers, 50 papers with code

Multi-perspective Memory Enhanced Network for Identifying Key Nodes in Social Networks

no code implementations • 22 Mar 2024 • Qiang Zhang, Jiawei Liu, Fanrui Zhang, Xiaoling Zhu, Zheng-Jun Zha

Existing key node identification methods usually consider node influence only from the propagation structure perspective and have insufficient generalization ability to unknown scenarios.

Blocking Graph Attention

Hierarchical Information Enhancement Network for Cascade Prediction in Social Networks

no code implementations • 22 Mar 2024 • Fanrui Zhang, Jiawei Liu, Qiang Zhang, Xiaoling Zhu, Zheng-Jun Zha

In this work, we propose a novel Hierarchical Information Enhancement Network (HIENet) for cascade prediction.

RelationVLM: Making Large Vision-Language Models Understand Visual Relations

no code implementations • 19 Mar 2024 • Zhipeng Huang, Zhizheng Zhang, Zheng-Jun Zha, Yan Lu, Baining Guo

The development of Large Vision-Language Models (LVLMs) is striving to catch up with the success of Large Language Models (LLMs), yet it faces more challenges to be resolved.

Language Modelling

VisualCritic: Making LMMs Perceive Visual Quality Like Humans

no code implementations • 19 Mar 2024 • Zhipeng Huang, Zhizheng Zhang, Yiting Lu, Zheng-Jun Zha, Zhibo Chen, Baining Guo

In this paper, we explore this question and provide the answer "Yes!".

Instruction Following

Event-based Asynchronous HDR Imaging by Temporal Incident Light Modulation

no code implementations • 14 Mar 2024 • Yuliang Wu, Ganchao Tan, Jinze Chen, Wei Zhai, Yang Cao, Zheng-Jun Zha

In this paper, we propose AsynHDR, a Pixel-Asynchronous HDR imaging system, based on key insights into the challenges in HDR imaging and the unique event-generating mechanism of Dynamic Vision Sensors (DVS).

SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation

no code implementations • 3 Mar 2024 • Hongjian Liu, Qingsong Xie, Zhijie Deng, Chen Chen, Shixiang Tang, Fueyang Fu, Zheng-Jun Zha, Haonan Lu

In contrast to vanilla consistency distillation (CD) which distills the ordinary differential equation solvers-based sampling process of a pretrained teacher model into a student, SCott explores the possibility and validates the efficacy of integrating stochastic differential equation (SDE) solvers into CD to fully unleash the potential of the teacher.

Text-to-Image Generation

LEMON: Learning 3D Human-Object Interaction Relation from 2D Images

no code implementations • 14 Dec 2023 • Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Zheng-Jun Zha

Existing methods underexploit certain correlations between the interaction counterparts (human and object) and struggle to address the uncertainty in interactions.

Human-Object Interaction Detection Object +1

CCM: Adding Conditional Controls to Text-to-Image Consistency Models

no code implementations • 12 Dec 2023 • Jie Xiao, Kai Zhu, Han Zhang, Zhiheng Liu, Yujun Shen, Yu Liu, Xueyang Fu, Zheng-Jun Zha

Consistency Models (CMs) have shown promise in creating visual content efficiently and with high quality.

Decoupling Degradation and Content Processing for Adverse Weather Image Restoration

no code implementations • 8 Dec 2023 • Xi Wang, Xueyang Fu, Peng-Tao Jiang, Jie Huang, Mi Zhou, Bo Li, Zheng-Jun Zha

The former facilitates channel-dependent degradation removal operation, allowing the network to tailor responses to various adverse weather types; the latter, by integrating Fourier's global properties into channel-independent content features, enhances network capacity for consistent global content reconstruction.

Image Restoration

Revisiting Single Image Reflection Removal In the Wild

1 code implementation • 29 Nov 2023 • Yurui Zhu, Xueyang Fu, Peng-Tao Jiang, Hao Zhang, Qibin Sun, Jinwei Chen, Zheng-Jun Zha, Bo Li

This research focuses on the issue of single-image reflection removal (SIRR) in real-world conditions, examining it from two angles: the collection pipeline of real reflection pairs and the perception of real reflection locations.

Reflection Removal

Self-supervised Cross-view Representation Reconstruction for Change Captioning

1 code implementation • ICCV 2023 • Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang

Change captioning aims to describe the difference between a pair of similar images.

Caption Generation Hallucination

Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation

2 code implementations • 22 Sep 2023 • Wei Zhai, Pingyu Wu, Kai Zhu, Yang Cao, Feng Wu, Zheng-Jun Zha

In addition, our method also achieves state-of-the-art weakly supervised semantic segmentation performance on the PASCAL VOC 2012 and MS COCO 2014 datasets.

Object Weakly-Supervised Object Localization +2

BEVTrack: A Simple and Strong Baseline for 3D Single Object Tracking in Bird's-Eye View

1 code implementation • 5 Sep 2023 • Yuxiang Yang, Yingqi Deng, Jing Zhang, Jiahao Nie, Zheng-Jun Zha

The spatial information indicating objects' spatial adjacency across consecutive frames is crucial for effective object tracking.

3D Single Object Tracking Autonomous Driving +2

Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models

no code implementations • ICCV 2023 • Kecheng Zheng, Wei Wu, Ruili Feng, Kai Zhu, Jiawei Liu, Deli Zhao, Zheng-Jun Zha, Wei Chen, Yujun Shen

To bring the useful knowledge back into light, we first identify a set of parameters that are important to a given downstream task, then attach a binary mask to each parameter, and finally optimize these masks on the downstream data with the parameters frozen.

Adaptive Frequency Filters As Efficient Global Token Mixers

2 code implementations • ICCV 2023 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Zheng-Jun Zha, Yan Lu, Baining Guo

With this insight, we propose the Adaptive Frequency Filtering (AFF) token mixer.

Knowledge-Enhanced Hierarchical Information Correlation Learning for Multi-Modal Rumor Detection

no code implementations • 28 Jun 2023 • Jiawei Liu, Jingyi Xie, Fanrui Zhang, Qiang Zhang, Zheng-Jun Zha

The explosive growth of rumors with text and images on social media platforms has drawn great attention.

DreamTime: An Improved Optimization Strategy for Text-to-3D Content Creation

no code implementations • 21 Jun 2023 • Yukun Huang, Jianan Wang, Yukai Shi, Xianbiao Qi, Zheng-Jun Zha, Lei Zhang

Text-to-image diffusion models pre-trained on billions of image-text pairs have recently enabled text-to-3D content creation by optimizing a randomly initialized Neural Radiance Fields (NeRF) with score distillation.

Image Generation Text to 3D

Streaming Video Model

1 code implementation • CVPR 2023 • Yucheng Zhao, Chong Luo, Chuanxin Tang, Dongdong Chen, Noel Codella, Zheng-Jun Zha

We believe that the concept of streaming video model and the implementation of S-ViT are solid steps towards a unified deep learning architecture for video understanding.

Action Recognition Multiple Object Tracking +1

Spatial-Aware Token for Weakly Supervised Object Localization

1 code implementation • ICCV 2023 • Pingyu Wu, Wei Zhai, Yang Cao, Jiebo Luo, Zheng-Jun Zha

Specifically, a spatial token is first introduced in the input space to aggregate representations for the localization task.

Object Weakly-Supervised Object Localization

Grounding 3D Object Affordance from 2D Interactions in Images

1 code implementation • ICCV 2023 • Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Jiebo Luo, Zheng-Jun Zha

Comprehensive experiments on PIAD demonstrate the reliability of the proposed task and the superiority of our method.

Object

Event-Guided Person Re-Identification via Sparse-Dense Complementary Learning

no code implementations • CVPR 2023 • Chengzhi Cao, Xueyang Fu, Hongjian Liu, Yukun Huang, Kunyu Wang, Jiebo Luo, Zheng-Jun Zha

Video-based person re-identification (Re-ID) is a prominent computer vision topic due to its wide range of video surveillance applications.

Representation Learning Video-Based Person Re-Identification

Text-Driven Generative Domain Adaptation with Spectral Consistency Regularization

1 code implementation • ICCV 2023 • Zhenhuan Liu, Liang Li, Jiayu Xiao, Zheng-Jun Zha, Qingming Huang

The experiments demonstrate the effectiveness of our method in preserving the diversity of the source domain and generating high-fidelity target images.

Domain Adaptation

Decoupling-and-Aggregating for Image Exposure Correction

no code implementations • CVPR 2023 • Yang Wang, Long Peng, Liang Li, Yang Cao, Zheng-Jun Zha

To this end, we inject the addition/difference operation into the convolution process and devise a Contrast Aware (CA) unit and a Detail Aware (DA) unit to facilitate the modeling of statistical and structural regularities.

Edge-Aware Regional Message Passing Controller for Image Forgery Localization

no code implementations • CVPR 2023 • Dong Li, Jiaying Zhu, Menglu Wang, Jiawei Liu, Xueyang Fu, Zheng-Jun Zha

In the second step, guided by the learnable edges, a region message passing controller is devised to weaken the message passing between the forged and authentic regions.

Binarization graph construction

Learning Cross-Representation Affinity Consistency for Sparsely Supervised Biomedical Instance Segmentation

1 code implementation • ICCV 2023 • Xiaoyu Liu, Wei Huang, Zhiwei Xiong, Shenglong Zhou, Yueyi Zhang, Xuejin Chen, Zheng-Jun Zha, Feng Wu

Sparse instance-level supervision has recently been explored to address insufficient annotation in biomedical instance segmentation; compared to common semi-supervision, it makes crowded instances easier to annotate and better preserves instance completeness for 3D volumetric datasets. In this paper, we propose a sparsely supervised biomedical instance segmentation framework via cross-representation affinity consistency regularization.

Instance Segmentation Pseudo Label +1

Self-Organizing Pathway Expansion for Non-Exemplar Class-Incremental Learning

no code implementations • ICCV 2023 • Kai Zhu, Kecheng Zheng, Ruili Feng, Deli Zhao, Yang Cao, Zheng-Jun Zha

Non-exemplar class-incremental learning aims to recognize both the old and new classes without access to old class samples.

Class Incremental Learning Incremental Learning

Generalized UAV Object Detection via Frequency Domain Disentanglement

no code implementations • CVPR 2023 • Kunyu Wang, Xueyang Fu, Yukun Huang, Chengzhi Cao, Gege Shi, Zheng-Jun Zha

This loss enables the network to concentrate on extracting domain-invariant spectrum and domain-specific spectrum, so as to achieve better disentangling results.

Disentanglement Object +2

Neural Dependencies Emerging from Learning Massive Categories

no code implementations • CVPR 2023 • Ruili Feng, Kecheng Zheng, Kai Zhu, Yujun Shen, Jian Zhao, Yukun Huang, Deli Zhao, Jingren Zhou, Michael Jordan, Zheng-Jun Zha

Through investigating the properties of the problem solution, we confirm that neural dependency is guaranteed by a redundant logit covariance matrix, which condition is easily met given massive categories, and that neural dependency is highly sparse, implying that one category correlates to only a few others.

Image Classification

Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation • 18 Jul 2022 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang

Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.

Attribute Referring Expression +2

Enhancement by Your Aesthetic: An Intelligible Unsupervised Personalized Enhancer for Low-Light Images

no code implementations • 15 Jul 2022 • Naishan Zheng, Jie Huang, Qi Zhu, Man Zhou, Feng Zhao, Zheng-Jun Zha

Low-light image enhancement is an inherently subjective process whose targets vary with the user's aesthetic.

Low-Light Image Enhancement

Rank Diminishing in Deep Neural Networks

no code implementations • 13 Jun 2022 • Ruili Feng, Kecheng Zheng, Yukun Huang, Deli Zhao, Michael Jordan, Zheng-Jun Zha

By virtue of our numerical tools, we provide the first empirical analysis of the per-layer behavior of network rank in practical settings, i.e., ResNets, deep MLPs, and Transformers on ImageNet.

Label Noise-Resistant Mean Teaching for Weakly Supervised Fake News Detection

no code implementations • 10 Jun 2022 • Jingyi Xie, Jiawei Liu, Zheng-Jun Zha

LNMT leverages unlabeled news and feedback comments of users to enlarge the amount of training data and facilitates model training by generating refined labels as weak supervision.

Fake News Detection Model Optimization

Automatic Relation-aware Graph Network Proliferation

1 code implementation • CVPR 2022 • Shaofei Cai, Liang Li, Xinzhe Han, Jiebo Luo, Zheng-Jun Zha, Qingming Huang

However, the currently used graph search space overemphasizes learning node features and neglects mining hierarchical relational information.

Ranked #2 on Link Prediction on TSP/HCP Benchmark set

Graph Classification Graph Learning +5

Principled Knowledge Extrapolation with GANs

no code implementations • 21 May 2022 • Ruili Feng, Jie Xiao, Kecheng Zheng, Deli Zhao, Jingren Zhou, Qibin Sun, Zheng-Jun Zha

Humans can extrapolate well, generalize daily knowledge to unseen scenarios, and raise and answer counterfactual questions.

counterfactual

Degradation-agnostic Correspondence from Resolution-asymmetric Stereo

no code implementations • CVPR 2022 • Xihao Chen, Zhiwei Xiong, Zhen Cheng, Jiayong Peng, Yueyi Zhang, Zheng-Jun Zha

Interestingly, we find that, although a stereo matching network trained with the photometric loss is not optimal, its feature extractor can produce degradation-agnostic and matching-specific features.

Stereo Matching

Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency

1 code implementation • 2 Apr 2022 • Zhenhuan Liu, Liang Li, Huajie Jiang, Xin Jin, Dandan Tu, Shuhui Wang, Zheng-Jun Zha

Furthermore, we devise the spatio-temporal correlative map as a style-independent, global-aware regularization on the perceptual motion consistency.

Optical Flow Estimation Style Transfer

FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization

no code implementations • 24 Mar 2022 • Kecheng Zheng, Yang Cao, Kai Zhu, Ruijing Zhao, Zheng-Jun Zha

However, its generalization performance to heterogeneous tasks is inferior to other architectures (e.g., CNNs and transformers) due to the extensive retention of domain information.

Domain Generalization

ProgressiveMotionSeg: Mutually Reinforced Framework for Event-Based Motion Segmentation

no code implementations • 22 Mar 2022 • Jinze Chen, Yang Wang, Yang Cao, Feng Wu, Zheng-Jun Zha

A Dynamic Vision Sensor (DVS) can asynchronously output events reflecting the apparent motion of objects with microsecond resolution, and shows great application potential in monitoring and other fields.

Denoising Motion Estimation +1

Location-Free Camouflage Generation Network

1 code implementation • 18 Mar 2022 • Yangyang Li, Wei Zhai, Yang Cao, Zheng-Jun Zha

However, these methods struggle with 1) efficiently generating camouflage images using foreground and background with arbitrary structure; 2) camouflaging foreground objects to regions with multiple appearances (e.g., the junction of the vegetation and the mountains), which limits their practical application.

Self-Sustaining Representation Expansion for Non-Exemplar Class-Incremental Learning

2 code implementations • CVPR 2022 • Kai Zhu, Wei Zhai, Yang Cao, Jiebo Luo, Zheng-Jun Zha

Non-exemplar class-incremental learning aims to recognize both the old and new classes when old class samples cannot be saved.

Class Incremental Learning Incremental Learning +1

Few Shot Generative Model Adaption via Relaxed Spatial Structural Alignment

2 code implementations • CVPR 2022 • Jiayu Xiao, Liang Li, Chaofei Wang, Zheng-Jun Zha, Qingming Huang

A feasible solution is to start with a GAN well-trained on a large-scale source domain and adapt it to the target domain with a few samples, termed few-shot generative model adaption.

Generative Adversarial Network

Debiased Batch Normalization via Gaussian Process for Generalizable Person Re-Identification

no code implementations • 3 Mar 2022 • Jiawei Liu, Zhipeng Huang, Liang Li, Kecheng Zheng, Zheng-Jun Zha

In this paper, we propose a novel Debiased Batch Normalization via Gaussian Process approach (GDNorm) for generalizable person re-identification, which models the feature statistic estimation from BN layers as a dynamically self-refining Gaussian process to alleviate the bias toward unseen domains and improve generalization.

Generalizable Person Re-identification Representation Learning

Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared Person Re-Identification

no code implementations • 3 Mar 2022 • Zhipeng Huang, Jiawei Liu, Liang Li, Kecheng Zheng, Zheng-Jun Zha

RGB-infrared person re-identification is an emerging cross-modality re-identification task, which is very challenging due to significant modality discrepancy between RGB and infrared images.

Person Re-Identification

Efficient Model-Driven Network for Shadow Removal

2 code implementations • AAAI 2022 • Yurui Zhu, Zeyu Xiao, Yanchi Fang, Xueyang Fu, Zhiwei Xiong, Zheng-Jun Zha

To address these issues, we first propose a new shadow illumination model for the shadow removal task.

Image Shadow Removal Shadow Removal

Multi-Grained Spatio-Temporal Features Perceived Network for Event-Based Lip-Reading

no code implementations • CVPR 2022 • Ganchao Tan, Yang Wang, Han Han, Yang Cao, Feng Wu, Zheng-Jun Zha

To recognize words from the event data, we propose a novel Multi-grained Spatio-Temporal Features Perceived Network (MSTP) to perceive fine-grained spatio-temporal features from microsecond time-resolved event data.

Action Recognition Lip Reading

Temporal Complementarity-Guided Reinforcement Learning for Image-to-Video Person Re-Identification

no code implementations • CVPR 2022 • Wei Wu, Jiawei Liu, Kecheng Zheng, Qibin Sun, Zheng-Jun Zha

Image-to-video person re-identification aims to retrieve the same pedestrian as the image-based query from a video-based gallery set.

Image-To-Video Person Re-Identification reinforcement-learning +4

Bijective Mapping Network for Shadow Removal

2 code implementations • CVPR 2022 • Yurui Zhu, Jie Huang, Xueyang Fu, Feng Zhao, Qibin Sun, Zheng-Jun Zha

Shadow removal, which aims to restore the background in the shadow regions, is challenging due to its highly ill-posed nature.

Shadow Removal

Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation

no code implementations • CVPR 2022 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Zheng-Jun Zha

In this paper, to address more practical scenarios, we propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.

Domain Adaptive Person Re-Identification Knowledge Distillation +4

Calibrated Feature Decomposition for Generalizable Person Re-Identification

1 code implementation • 27 Nov 2021 • Kecheng Zheng, Jiawei Liu, Wei Wu, Liang Li, Zheng-Jun Zha

The calibrated person representation is subtly decomposed into the identity-relevant feature, domain feature, and the remaining entangled one.

Domain Generalization Generalizable Person Re-identification

EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

1 code implementation • CVPR 2022 • Yaya Shi, Xu Yang, Haiyang Xu, Chunfeng Yuan, Bing Li, Weiming Hu, Zheng-Jun Zha

The datasets will be released to facilitate the development of video captioning metrics.

Language Modelling Video Captioning

Edge-featured Graph Neural Architecture Search

no code implementations • 3 Sep 2021 • Shaofei Cai, Liang Li, Xinzhe Han, Zheng-Jun Zha, Qingming Huang

Recently, researchers have studied neural architecture search (NAS) to reduce the dependence on human expertise and explore better GNN architectures, but they over-emphasize entity features and ignore latent relation information concealed in the edges.

Neural Architecture Search

A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP

1 code implementation • 30 Aug 2021 • Yucheng Zhao, Guangting Wang, Chuanxin Tang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha

Convolutional neural networks (CNN) are the dominant deep neural network (DNN) architecture for computer vision.

Multi-Modulation Network for Audio-Visual Event Localization

no code implementations • 26 Aug 2021 • Hao Wang, Zheng-Jun Zha, Liang Li, Xuejin Chen, Jiebo Luo

We propose a novel Multi-Modulation Network (M2N) to learn the above correlation and leverage it as semantic guidance to modulate the related auditory, visual, and fused features.

audio-visual event localization

Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment

1 code implementation • ICCV 2021 • Heliang Zheng, Huan Yang, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo

The reference space is optimized to capture deep image priors that are useful for quality assessment.

Image Quality Assessment Image Restoration +1

Self-Supervised Visual Representations Learning by Contrastive Mask Prediction

no code implementations • ICCV 2021 • Yucheng Zhao, Guangting Wang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha

In this paper, we propose a novel contrastive mask prediction (CMP) task for visual representation learning and design a mask contrast (MaskCo) framework to implement the idea.

Representation Learning Self-Supervised Learning

Pose-Guided Feature Learning with Knowledge Distillation for Occluded Person Re-Identification

no code implementations • 31 Jul 2021 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Jiawei Liu, Zhizheng Zhang, Zheng-Jun Zha

Occluded person re-identification (ReID) aims to match person images with occlusion.

Knowledge Distillation Person Re-Identification

Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers

1 code implementation • 27 Jul 2021 • Wen Wang, Yang Cao, Jing Zhang, Fengxiang He, Zheng-Jun Zha, Yonggang Wen, DaCheng Tao

In DQFA, a novel domain query is used to aggregate and align global context from the token sequence of both domains.

Domain Adaptation Object +2

Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning

1 code implementation • CVPR 2021 • Kai Zhu, Yang Cao, Wei Zhai, Jie Cheng, Zheng-Jun Zha

Few-shot class-incremental learning aims to recognize new classes from only a few samples without forgetting the old classes.

Few-Shot Class-Incremental Learning Incremental Learning +1

Disentangle Your Dense Object Detector

2 code implementations • 7 Jul 2021 • Zehui Chen, Chenhongyi Yang, Qiaofei Li, Feng Zhao, Zheng-Jun Zha, Feng Wu

Extensive experiments on the MS COCO benchmark show that our approach can lead to 2.0 mAP, 2.4 mAP and 2.2 mAP absolute improvements on RetinaNet, FCOS, and ATSS baselines with negligible extra overhead.

Disentanglement Object +2

Structured Multi-Level Interaction Network for Video Moment Localization via Language Query

no code implementations • CVPR 2021 • Hao Wang, Zheng-Jun Zha, Liang Li, Dong Liu, Jiebo Luo

In particular, for cross-modal interaction, we let the sentence-level query interact with the whole moment and the word-level query interact with content and boundary, in a coarse-to-fine manner.

Sentence

Light Field Super-Resolution With Zero-Shot Learning

no code implementations • CVPR 2021 • Zhen Cheng, Zhiwei Xiong, Chang Chen, Dong Liu, Zheng-Jun Zha

To fill this gap, we propose a zero-shot learning framework for light field SR, which learns a mapping to super-resolve the reference view with examples extracted solely from the input low-resolution light field itself.

Super-Resolution Zero-Shot Learning

Image De-Raining via Continual Learning

no code implementations • CVPR 2021 • Man Zhou, Jie Xiao, Yifan Chang, Xueyang Fu, Aiping Liu, Jinshan Pan, Zheng-Jun Zha

The proposed model is capable of achieving superior performance on both inhomogeneous and incremental datasets, and is promising for highly compact systems to gradually learn myriad regularities of the different types of rain streaks.

Continual Learning

Adaptive Domain-Specific Normalization for Generalizable Person Re-Identification

no code implementations • 7 May 2021 • Jiawei Liu, Zhipeng Huang, Kecheng Zheng, Dong Liu, Xiaoyan Sun, Zheng-Jun Zha

It describes an unseen target domain as a combination of the known source ones, and explicitly learns a domain-specific representation with the target distribution to improve the model's generalization through a meta-learning pipeline.

Generalizable Person Re-identification Meta-Learning

Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos

no code implementations • CVPR 2021 • Jiawei Liu, Zheng-Jun Zha, Wei Wu, Kecheng Zheng, Qibin Sun

The key factor for video person re-identification is to effectively exploit both spatial and temporal clues from video sequences.

Ranked #10 on Video Deinterlacing on MSU Deinterlacer Benchmark

Video-Based Person Re-Identification Video Deinterlacing

Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval

no code implementations • 29 Mar 2021 • Rui Zhao, Kecheng Zheng, Zheng-Jun Zha, Hongtao Xie, Jiebo Luo

The cross-modal memory module is employed to record the instance embeddings of all the datasets for global negative mining.

Retrieval Text Retrieval +1

Rethinking Graph Neural Architecture Search from Message-passing

1 code implementation • CVPR 2021 • Shaofei Cai, Liang Li, Jincan Deng, Beichen Zhang, Zheng-Jun Zha, Li Su, Qingming Huang

Inspired by the strong searching capability of neural architecture search (NAS) for CNNs, this paper proposes Graph Neural Architecture Search (GNAS) with a newly designed search space.

feature selection Neural Architecture Search

Group-aware Label Transfer for Domain Adaptive Person Re-identification

1 code implementation • CVPR 2021 • Kecheng Zheng, Wu Liu, Lingxiao He, Tao Mei, Jiebo Luo, Zheng-Jun Zha

In this paper, we propose a Group-aware Label Transfer (GLT) algorithm, which enables the online interaction and mutual promotion of pseudo-label prediction and representation learning.

Attribute Clustering +5

Synergy Between Semantic Segmentation and Image Denoising via Alternate Boosting

no code implementations • 24 Feb 2021 • Shunxin Xu, Ke Sun, Dong Liu, Zhiwei Xiong, Zheng-Jun Zha

We observe that denoising helps combat the drop in segmentation accuracy due to noise, and that pixel-wise semantic information in turn boosts the capability of denoising.

Image Denoising Segmentation +1

General-Purpose Speech Representation Learning through a Self-Supervised Multi-Granularity Framework

no code implementations • 3 Feb 2021 • Yucheng Zhao, Dacheng Yin, Chong Luo, Zhiyuan Zhao, Chuanxin Tang, Wenjun Zeng, Zheng-Jun Zha

This paper presents a self-supervised learning framework, named MGF, for general-purpose speech representation learning.

Classification Emotion Classification +6

VAE^2: Preventing Posterior Collapse of Variational Video Predictions in the Wild

no code implementations • 28 Jan 2021 • Yizhou Zhou, Chong Luo, Xiaoyan Sun, Zheng-Jun Zha, Wenjun Zeng

We believe that VAE$^2$ is also applicable to other stochastic sequence prediction problems where the training data lack stochasticity.

Video Prediction

Improving De-Raining Generalization via Neural Reorganization

no code implementations • ICCV 2021 • Jie Xiao, Man Zhou, Xueyang Fu, Aiping Liu, Zheng-Jun Zha

Equipped with our NR algorithm, the deep model can be trained on a list of synthetic rainy datasets by overcoming catastrophic forgetting, making it a general-version de-raining network.

Knowledge Distillation

Cross-Patch Graph Convolutional Network for Image Denoising

no code implementations • ICCV 2021 • Yao Li, Xueyang Fu, Zheng-Jun Zha

However, real noisy images in practice are mostly of high resolution rather than small cropped patches, and vanilla training strategies ignore the cross-patch contextual dependency in the whole image.

Image Denoising

Learning Dual Priors for JPEG Compression Artifacts Removal

no code implementations • ICCV 2021 • Xueyang Fu, Xi Wang, Aiping Liu, Junwei Han, Zheng-Jun Zha

Specifically, we design a variational model to formulate the image de-blocking problem and propose two prior terms for the image content and gradient, respectively.

Blocking

Attack-Guided Perceptual Data Generation for Real-World Re-Identification

no code implementations • ICCV 2021 • Yukun Huang, Xueyang Fu, Zheng-Jun Zha

In unconstrained real-world surveillance scenarios, person re-identification (Re-ID) models usually suffer from different low-level perceptual variations, e.g., cross-resolution and insufficient lighting.

Person Re-Identification Representation Learning

Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification

1 code implementation • 16 Dec 2020 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zheng-Jun Zha

Based on this finding, we propose to exploit the uncertainty (measured by consistency levels) to evaluate the reliability of the pseudo-label of a sample and incorporate the uncertainty to re-weight its contribution within various ReID losses, including the identity (ID) classification loss per sample, the triplet loss, and the contrastive loss.

Clustering Domain Adaptive Person Re-Identification +3

Learning Semantic-aware Normalization for Generative Adversarial Networks

1 code implementation • NeurIPS 2020 • Heliang Zheng, Jianlong Fu, Yanhong Zeng, Jiebo Luo, Zheng-Jun Zha

Such a model disentangles latent factors according to the semantics of feature channels by channel-/group-wise fusion of latent codes and feature channels.

Image Inpainting Unconditional Image Generation

Hierarchical Granularity Transfer Learning

no code implementations • NeurIPS 2020 • Shaobo Min, Hongtao Xie, Hantao Yao, Xuran Deng, Zheng-Jun Zha, Yongdong Zhang

In this paper, we introduce a new task, named Hierarchical Granularity Transfer Learning (HGTL), to recognize sub-level categories with basic-level annotations and semantic descriptions for hierarchical categories.

Transfer Learning

Hierarchical Gumbel Attention Network for Text-based Person Search

no code implementations • 10 Oct 2020 • Kecheng Zheng, Wu Liu, Jiawei Liu, Zheng-Jun Zha, Tao Mei

This hard selection strategy is able to fuse the strongly relevant multi-modality features to alleviate the problem of matching redundancy.

Ranked #15 on Text based Person Retrieval on CUHK-PEDES

Image Retrieval Image-to-Text Retrieval +6

Paper
Add Code

Temporal Attribute-Appearance Learning Network for Video-based Person Re-Identification

no code implementations • 9 Sep 2020 • Jiawei Liu, Xierong Zhu, Zheng-Jun Zha

TALNet simultaneously exploits human attributes and appearance to learn comprehensive and effective pedestrian representations from videos.

Attribute Multi-Task Learning +1

Paper
Add Code

DeepFacePencil: Creating Face Images from Freehand Sketches

1 code implementation • 31 Aug 2020 • Yuhang Li, Xuejin Chen, Binxin Yang, Zihan Chen, Zhihua Cheng, Zheng-Jun Zha

In this paper, we explore the task of generating photo-realistic face images from hand-drawn sketches.

Image-to-Image Translation Translation

Paper
Code

Nighttime Dehazing with a Synthetic Benchmark

1 code implementation • 10 Aug 2020 • Jing Zhang, Yang Cao, Zheng-Jun Zha, DaCheng Tao

To address this issue, we propose a novel synthetic method called 3R to simulate nighttime hazy images from daytime clear images, which first reconstructs the scene geometry, then simulates the light rays and object reflectance, and finally renders the haze effects.

Paper
Code

Learning to Discretely Compose Reasoning Module Networks for Video Captioning

1 code implementation • 17 Jul 2020 • Ganchao Tan, Daqing Liu, Meng Wang, Zheng-Jun Zha

However, existing visual reasoning methods designed for visual question answering are not appropriate to video captioning, for it requires more complex visual reasoning on videos over both space and time, and dynamic module composition along the generation process.

Question Answering Sentence +3

Paper
Code

Memory-Augmented Relation Network for Few-Shot Learning

no code implementations • 9 May 2020 • Jun He, Richang Hong, Xueliang Liu, Mingliang Xu, Zheng-Jun Zha, Meng Wang

Metric-based few-shot learning methods concentrate on learning a transferable feature embedding that generalizes well from seen categories to unseen categories under the supervision of a limited number of labelled instances.

Few-Shot Learning Metric Learning +2

Paper
Add Code

Self-Supervised Tuning for Few-Shot Segmentation

no code implementations • 12 Apr 2020 • Kai Zhu, Wei Zhai, Zheng-Jun Zha, Yang Cao

Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples.

Meta-Learning Segmentation

Paper
Add Code

ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection

1 code implementation • CVPR 2020 • Yuxin Wang, Hongtao Xie, Zheng-Jun Zha, Mengting Xing, Zilong Fu, Yongdong Zhang

Then a novel Local Orthogonal Texture-aware Module (LOTM) models the local texture information of proposal features in two orthogonal directions and represents text region with a set of contour points.

Region Proposal Scene Text Detection +1

226

Paper
Code

Parsing-based View-aware Embedding Network for Vehicle Re-Identification

1 code implementation • CVPR 2020 • Dechao Meng, Liang Li, Xuejing Liu, Yadong Li, Shijie Yang, Zheng-Jun Zha, Xingyu Gao, Shuhui Wang, Qingming Huang

Vehicle Re-Identification is to find images of the same vehicle from various views in the cross-camera scenario.

Vehicle Re-Identification

101

Paper
Code

Real-world Person Re-Identification via Degradation Invariance Learning

no code implementations • CVPR 2020 • Yukun Huang, Zheng-Jun Zha, Xueyang Fu, Richang Hong, Liang Li

Person re-identification (Re-ID) in real-world scenarios usually suffers from various degradation factors, e.g., low resolution, weak illumination, blurring and adverse weather.

Image Restoration Person Re-Identification +2

Paper
Add Code

Co-Saliency Spatio-Temporal Interaction Network for Person Re-Identification in Videos

no code implementations • 10 Apr 2020 • Jiawei Liu, Zheng-Jun Zha, Xierong Zhu, Na Jiang

Person re-identification aims at identifying a certain pedestrian across non-overlapping camera networks.

Person Re-Identification

Paper
Add Code

Stacked Convolutional Deep Encoding Network for Video-Text Retrieval

no code implementations • 10 Apr 2020 • Rui Zhao, Kecheng Zheng, Zheng-Jun Zha

The dominant existing approaches to the cross-modal video-text retrieval task learn a joint embedding space to measure the cross-modal similarity.

Language Modelling Retrieval +2

Paper
Add Code

State-Relabeling Adversarial Active Learning

1 code implementation • CVPR 2020 • Beichen Zhang, Liang Li, Shijie Yang, Shuhui Wang, Zheng-Jun Zha, Qingming Huang

In this paper, we propose a state relabeling adversarial active learning model (SRAAL), that leverages both the annotation and the labeled/unlabeled state information for deriving the most informative unlabeled samples.

Active Learning

Paper
Code

Spatiotemporal Fusion in 3D CNNs: A Probabilistic View

no code implementations • CVPR 2020 • Yizhou Zhou, Xiaoyan Sun, Chong Luo, Zheng-Jun Zha, Wen-Jun Zeng

Based on the probability space, we further generate new fusion strategies which achieve the state-of-the-art performance on four well-known action recognition datasets.

Action Recognition In Videos Temporal Action Localization

Paper
Add Code

Iterative Context-Aware Graph Inference for Visual Dialog

1 code implementation • CVPR 2020 • Dan Guo, Hui Wang, Hanwang Zhang, Zheng-Jun Zha, Meng Wang

Visual dialog is a challenging task that requires the comprehension of the semantic dependencies among implicit visual and textual contexts.

Ranked #12 on Visual Dialog on VisDial v0.9 val

Graph Attention Graph Embedding +2

Paper
Code

Domain-aware Visual Bias Eliminating for Generalized Zero-Shot Learning

1 code implementation • CVPR 2020 • Shaobo Min, Hantao Yao, Hongtao Xie, Chaoqun Wang, Zheng-Jun Zha, Yongdong Zhang

Recent methods focus on learning a unified semantic-aligned visual representation to transfer knowledge between two domains, while ignoring the effect of semantic-free visual representation in alleviating the biased recognition problem.

Generalized Zero-Shot Learning

Paper
Code

Multi-Objective Matrix Normalization for Fine-grained Visual Recognition

1 code implementation • 30 Mar 2020 • Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang

In this paper, we propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation in terms of square-root, low-rank, and sparsity.

Fine-Grained Visual Recognition

Paper
Code

Object Relational Graph with Teacher-Recommended Learning for Video Captioning

no code implementations • CVPR 2020 • Ziqi Zhang, Yaya Shi, Chunfeng Yuan, Bing Li, Peijin Wang, Weiming Hu, Zheng-Jun Zha

In this paper, we propose a complete video captioning system including both a novel model and an effective training strategy.

Ranked #9 on Video Captioning on VATEX (using extra training data)

Language Modelling Video Captioning

Paper
Add Code

Convolutional Dictionary Pair Learning Network for Image Representation Learning

no code implementations • 17 Dec 2019 • Zhao Zhang, Yulin Sun, Yang Wang, Zheng-Jun Zha, Shuicheng Yan, Meng Wang

To address this issue, we propose a novel generalized end-to-end representation learning architecture, dubbed Convolutional Dictionary Pair Learning Network (CDPL-Net) in this paper, which integrates the learning schemes of the CNN and dictionary pair learning into a unified framework.

Dictionary Learning Representation Learning

Paper
Add Code

Identity Preserve Transform: Understand What Activity Classification Models Have Learnt

no code implementations • 13 Dec 2019 • Jialing Lyu, Weichao Qiu, Xinyue Wei, Yi Zhang, Alan Yuille, Zheng-Jun Zha

This can explain why an activity classification model usually fails to generalize to datasets it is not trained on.

Classification General Classification

Paper
Add Code

Deep Self-representative Concept Factorization Network for Representation Learning

no code implementations • 13 Dec 2019 • Yan Zhang, Zhao Zhang, Zheng Zhang, Mingbo Zhao, Li Zhang, Zheng-Jun Zha, Meng Wang

In this paper, we investigate the unsupervised deep representation learning issue and technically propose a novel framework called Deep Self-representative Concept Factorization Network (DSCF-Net), for clustering deep features.

Clustering Representation Learning

Paper
Add Code

Abstract Reasoning with Distracting Features

1 code implementation • NeurIPS 2019 • Kecheng Zheng, Zheng-Jun Zha, Wei Wei

Abstraction reasoning is a long-standing challenge in artificial intelligence.

Paper
Code

Progressive Retinex: Mutually Reinforced Illumination-Noise Perception Network for Low Light Image Enhancement

no code implementations • 26 Nov 2019 • Yang Wang, Yang Cao, Zheng-Jun Zha, Jing Zhang, Zhiwei Xiong, Wei Zhang, Feng Wu

Contrast enhancement and noise removal are coupled problems for low-light image enhancement.

Computational Efficiency Image Generation +1

Paper
Add Code

Learning Deep Bilinear Transformation for Fine-grained Image Representation

1 code implementation • NeurIPS 2019 • Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo

However, the computational cost of learning pairwise interactions between deep feature channels is prohibitively expensive, which restricts the use of this powerful transformation in deep neural networks.

Fine-Grained Image Recognition

105

Paper
Code

LinesToFacePhoto: Face Photo Generation from Lines with Conditional Self-Attention Generative Adversarial Network

no code implementations • 20 Oct 2019 • Yuhang Li, Xuejin Chen, Feng Wu, Zheng-Jun Zha

The large-scale discriminator enforces the completeness of global structures and the small-scale discriminator encourages fine details, thereby enhancing the realism of generated face images.

Generative Adversarial Network

Paper
Add Code

Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation • 5 Sep 2019 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Li Su, Qingming Huang

Weakly supervised referring expression grounding (REG) aims at localizing the referential entity in an image according to a linguistic query, where the mapping between the image region (proposal) and the query is unknown in the training stage.

Object Referring Expression +2

Paper
Code

Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation • ICCV 2019 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, Qingming Huang

It builds the correspondence between image region proposal and query in an adaptive manner: adaptive grounding and collaborative reconstruction.

Attribute Referring Expression +1

Paper
Code

Adaptive Structure-constrained Robust Latent Low-Rank Coding for Image Recovery

no code implementations • 21 Aug 2019 • Zhao Zhang, Lei Wang, Sheng Li, Yang Wang, Zheng Zhang, Zheng-Jun Zha, Meng Wang

Specifically, AS-LRC performs the latent decomposition of given data into a low-rank reconstruction by a block-diagonal codes matrix, a group sparse locality-adaptive salient feature part and a sparse error part.

Representation Learning

Paper
Add Code

Domain-Specific Embedding Network for Zero-Shot Recognition

1 code implementation • 12 Aug 2019 • Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang

In contrast to previous methods, the DSEN decomposes the domain-shared projection function into one domain-invariant and two domain-specific sub-functions to explore the similarities and differences between two domains.

Zero-Shot Learning

Paper
Code

Robust Subspace Discovery by Block-diagonal Adaptive Locality-constrained Representation

no code implementations • 4 Aug 2019 • Zhao Zhang, Jiahuan Ren, Sheng Li, Richang Hong, Zheng-Jun Zha, Meng Wang

Leveraging the Frobenius-norm-based latent low-rank representation model, rBDLR jointly learns the coding coefficients and salient features, and improves the results by enhancing the robustness to outliers and errors in the given data, preserving local information of salient features adaptively, and ensuring the block-diagonal structure of the coefficients.

Representation Learning

Paper
Add Code

Structure-Aware Residual Pyramid Network for Monocular Depth Estimation

1 code implementation • 13 Jul 2019 • Xiaotian Chen, Xuejin Chen, Zheng-Jun Zha

We propose a Residual Pyramid Decoder (RPD) which expresses global scene structure in upper levels to represent layouts, and local structure in lower levels to present shape details.

Ranked #55 on Monocular Depth Estimation on NYU-Depth V2

Depth Prediction Monocular Depth Estimation +1

Paper
Code

Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text

no code implementations • ACL 2019 • Jianxing Yu, Zheng-Jun Zha, Jian Yin

This paper focuses on the topic of inferential machine comprehension, which aims to fully understand the meaning of a given text in order to answer generic questions, especially those that require reasoning skills.

Reading Comprehension

Paper
Add Code

Posterior-Guided Neural Architecture Search

1 code implementation • 23 Jun 2019 • Yizhou Zhou, Xiaoyan Sun, Chong Luo, Zheng-Jun Zha, Wen-Jun Zeng

Accordingly, a hybrid network representation is presented which enables us to leverage the Variational Dropout so that the approximation of the posterior distribution becomes fully gradient-based and highly efficient.

Image Classification Neural Architecture Search

Paper
Code

Joint Visual Grounding with Language Scene Graphs

no code implementations • 9 Jun 2019 • Daqing Liu, Hanwang Zhang, Zheng-Jun Zha, Meng Wang, Qianru Sun

In this paper, we alleviate the missing-annotation problem and enable the joint reasoning by leveraging the language scene graph which covers both labeled referent and unlabeled contexts (other objects, attributes, and relationships).

Referring Expression Visual Grounding

Paper
Add Code

Context-Aware Visual Policy Network for Fine-Grained Image Captioning

1 code implementation • 6 Jun 2019 • Zheng-Jun Zha, Daqing Liu, Hanwang Zhang, Yongdong Zhang, Feng Wu

With the maturity of visual detection techniques, we are more ambitious in describing visual content with open-vocabulary, fine-grained and free-form language, i.e., the task of image captioning.

Image Captioning Image Paragraph Captioning +2

Paper
Code

One-Shot Texture Retrieval with Global Context Metric

no code implementations • 16 May 2019 • Kai Zhu, Wei Zhai, Zheng-Jun Zha, Yang Cao

In this paper, we tackle one-shot texture retrieval: given an example of a new reference texture, detect and segment all the pixels of the same texture category within an arbitrary image.

Relation Relation Network +2

Paper
Add Code

Multimodal Semantic Attention Network for Video Captioning

no code implementations • 8 May 2019 • Liang Sun, Bing Li, Chunfeng Yuan, Zheng-Jun Zha, Weiming Hu

Inspired by the fact that different modalities in videos carry complementary information, we propose a Multimodal Semantic Attention Network (MSAN), which is a new encoder-decoder framework incorporating multimodal semantic attributes for video captioning.

Attribute General Classification +2

Paper
Add Code

Camera Lens Super-Resolution

1 code implementation • CVPR 2019 • Chang Chen, Zhiwei Xiong, Xinmei Tian, Zheng-Jun Zha, Feng Wu

Existing methods for single image super-resolution (SR) are typically evaluated with synthetic degradation models such as bicubic or Gaussian downsampling.

Image Super-Resolution

170

Paper
Code

Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-grained Image Recognition

1 code implementation • CVPR 2019 • Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo

Learning subtle yet discriminative features (e.g., beak and eyes for a bird) plays a significant role in fine-grained image recognition.

Ranked #1 on Fine-Grained Image Classification on iNaturalist

Fine-Grained Image Classification Fine-Grained Image Recognition

219

Paper
Code

Making History Matter: History-Advantage Sequence Training for Visual Dialog

no code implementations • ICCV 2019 • Tianhao Yang, Zheng-Jun Zha, Hanwang Zhang

We study the multi-round response generation in visual dialog, where a response is generated according to a visually grounded conversational history.

Ranked #10 on Visual Dialog on VisDial v0.9 val

Answer Generation Response Generation +2

Paper
Add Code

Learning to Assemble Neural Module Tree Networks for Visual Grounding

no code implementations • ICCV 2019 • Daqing Liu, Hanwang Zhang, Feng Wu, Zheng-Jun Zha

In particular, we develop a novel modular network called Neural Module Tree network (NMTree) that regularizes the visual grounding along the dependency parsing tree of the sentence, where each node is a neural module that calculates visual attention according to its linguistic feature, and the grounding score is accumulated in a bottom-up direction as needed.

Dependency Parsing Natural Language Visual Grounding +5

Paper
Add Code

CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification

no code implementations • 19 Nov 2018 • Jiawei Liu, Zheng-Jun Zha, Hongtao Xie, Zhiwei Xiong, Yongdong Zhang

An appearance network is developed to learn appearance features from the full body, horizontal and vertical body parts of pedestrians with spatial dependencies among body parts.

Attribute Multi-Task Learning +1

Paper
Add Code

Towards Human-Level License Plate Recognition

no code implementations • ECCV 2018 • Jiafan Zhuang, Saihui Hou, Zilei Wang, Zheng-Jun Zha

License plate recognition (LPR) is a fundamental component of various intelligent transport systems, which is always expected to be accurate and efficient enough.

License Plate Recognition Semantic Segmentation

Paper
Add Code

Context-Aware Visual Policy Network for Sequence-Level Image Captioning

1 code implementation • 16 Aug 2018 • Daqing Liu, Zheng-Jun Zha, Hanwang Zhang, Yongdong Zhang, Feng Wu

To fill the gap, we propose a Context-Aware Visual Policy network (CAVP) for sequence-level image captioning.

Image Captioning Reinforcement Learning (RL)

Paper
Code

A Two-Stream Mutual Attention Network for Semi-supervised Biomedical Segmentation with Noisy Labels

no code implementations • 31 Jul 2018 • Shaobo Min, Xuejin Chen, Zheng-Jun Zha, Feng Wu, Yongdong Zhang

Learning-based methods suffer from a deficiency of clean annotations, especially in biomedical segmentation.

Paper
Add Code

MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition

no code implementations • CVPR 2018 • Yizhou Zhou, Xiaoyan Sun, Zheng-Jun Zha, Wen-Jun Zeng

Recent attempts use 3D convolutional neural networks (CNNs) to explore spatio-temporal information for human action recognition.

Action Recognition Temporal Action Localization

Paper
Add Code

Frank-Wolfe Network: An Interpretable Deep Structure for Non-Sparse Coding

1 code implementation • 28 Feb 2018 • Dong Liu, Ke Sun, Zhangyang Wang, Runsheng Liu, Zheng-Jun Zha

We propose an interpretable deep structure namely Frank-Wolfe Network (F-W Net), whose architecture is inspired by unrolling and truncating the Frank-Wolfe algorithm for solving an $L_p$-norm constrained problem with $p\geq 1$.

Handwritten Digit Recognition Image Denoising +2

Paper
Code

Learning Compact Appearance Representation for Video-based Person Re-Identification

no code implementations • 21 Feb 2017 • Wei Zhang, Shengnan Hu, Kan Liu, Zheng-Jun Zha

This paper presents a novel approach for video-based person re-identification using multiple Convolutional Neural Networks (CNNs).

Video-Based Person Re-Identification

Paper
Add Code

Comparative Deep Learning of Hybrid Representations for Image Recommendations

no code implementations • CVPR 2016 • Chenyi Lei, Dong Liu, Weiping Li, Zheng-Jun Zha, Houqiang Li

In many image-related tasks, learning expressive and discriminative representations of images is essential, and deep learning has been studied for automating the learning of such representations.

Paper
Add Code

Answering Opinion Questions on Products by Exploiting Hierarchical Organization of Consumer Reviews

no code implementations • EMNLP 2012 • Jianxing Yu, Zheng-Jun Zha, Tat-Seng Chua

Question Answering

Paper
Add Code
