Search Results for author: Wenjun Zeng

We also equip Inter-X with versatile annotations of more than 34K fine-grained human part-level textual descriptions, semantic interaction categories, interaction order, and the relationship and personality of the subjects.

RLLTE: Long-Term Evolution Project of Reinforcement Learning

2 code implementations • 28 Sep 2023 • Mingqi Yuan, Zequn Zhang, Yang Xu, Shihao Luo, Bo Li, Xin Jin, Wenjun Zeng

We present RLLTE: a long-term evolution, extremely modular, and open-source framework for reinforcement learning (RL) research and application.

Language Modelling Large Language Model +2

435

Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey

1 code implementation • 18 Aug 2023 • Xin Li, Yulin Ren, Xin Jin, Cuiling Lan, Xingrui Wang, Wenjun Zeng, Xinchao Wang, Zhibo Chen

Image restoration (IR) has been an indispensable and challenging task in the low-level vision field, which strives to improve the subjective quality of images distorted by various forms of degradation.

Deblurring Image Restoration +2

454

One at a Time: Progressive Multi-step Volumetric Probability Learning for Reliable 3D Scene Perception

no code implementations • 22 Jun 2023 • Bohan Li, Yasheng Sun, Jingxin Dong, Zheng Zhu, Jinming Liu, Xin Jin, Wenjun Zeng

Numerous studies have investigated the pivotal role of reliable 3D volume representation in scene perception tasks, such as multi-view stereo (MVS) and semantic scene completion (SSC).

Depth Estimation Representation Learning

Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning

no code implementations • 24 May 2023 • Qi Wang, Junming Yang, Yunbo Wang, Xin Jin, Wenjun Zeng, Xiaokang Yang

Training offline reinforcement learning (RL) models using visual inputs poses two significant challenges, i.e., the overfitting problem in representation learning and the overestimation bias for expected future rewards.

Offline RL Reinforcement Learning (RL) +2

NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation

1 code implementation • ICCV 2023 • Baao Xie, Bohan Li, Zequn Zhang, Junting Dong, Xin Jin, Jingyu Yang, Wenjun Zeng

They are complementary -- the outer navigation identifies global-view semantic directions, and the inner refinement is dedicated to fine-grained attributes.

Disentanglement

[CLS] Token is All You Need for Zero-Shot Semantic Segmentation

no code implementations • 13 Apr 2023 • Letian Wu, Wenyao Zhang, Tengping Jiang, Wankou Yang, Xin Jin, Wenjun Zeng

Based on that, we build upon the CLIP model as a backbone which we extend with a One-Way [CLS] token navigation from text to the visual branch that enables zero-shot dense prediction, dubbed ClsCLIP.

Few-Shot Semantic Segmentation Language Modelling +4

Inpaint Anything: Segment Anything Meets Image Inpainting

1 code implementation • 13 Apr 2023 • Tao Yu, Runseng Feng, Ruoyu Feng, Jinming Liu, Xin Jin, Wenjun Zeng, Zhibo Chen

We are also very willing to help everyone share and promote new projects based on our Inpaint Anything (IA).

Image Inpainting

5,572

Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion

1 code implementation • 24 Mar 2023 • Bohan Li, Yasheng Sun, Zhujin Liang, Dalong Du, Zhuanghui Zhang, XiaoFeng Wang, Yunnan Wang, Xin Jin, Wenjun Zeng

However, due to the inherent representation gap between stereo geometry and BEV features, it is non-trivial to bridge them for the dense prediction task of SSC.

3D Semantic Scene Completion Hallucination +2

Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning

1 code implementation • 26 Jan 2023 • Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

We present AIRS: Automatic Intrinsic Reward Shaping that intelligently and adaptively provides high-quality intrinsic rewards to enhance exploration in reinforcement learning (RL).

Benchmarking reinforcement-learning +1

977

Tackling Visual Control via Multi-View Exploration Maximization

no code implementations • 28 Nov 2022 • Mingqi Yuan, Xin Jin, Bo Li, Wenjun Zeng

We present MEM: Multi-view Exploration Maximization for tackling complex visual control tasks.

Benchmarking Reinforcement Learning (RL) +1

Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning

no code implementations • 19 Sep 2022 • Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

Exploration is critical for deep reinforcement learning in complex environments with high-dimensional observations and sparse rewards.

Atari Games Benchmarking +3

A Nonparametric Contextual Bandit with Arm-level Eligibility Control for Customer Service Routing

no code implementations • 8 Sep 2022 • Ruofeng Wen, Wenjun Zeng, Yi Liu

Routing contacts to eligible SMEs turns out to be a non-trivial problem because SMEs' domain eligibility is subject to training quality and can change over time.

Thompson Sampling

Robust Multi-Object Tracking by Marginal Inference

no code implementations • 7 Aug 2022 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu

To address the problem, we present an efficient approach to compute a marginal probability for each pair of objects in real time.

Multi-Object Tracking Object

VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data

1 code implementation • 20 Jul 2022 • Jiajun Su, Chunyu Wang, Xiaoxuan Ma, Wenjun Zeng, Yizhou Wang

While monocular 3D pose estimation seems to have achieved very accurate results on the public datasets, their generalization ability is largely overlooked.

Ranked #5 on 3D Multi-Person Pose Estimation (absolute) on MuPoTS-3D

3D Multi-Person Pose Estimation (absolute) 3D Pose Estimation

ReSTR: Convolution-free Referring Image Segmentation Using Transformers

no code implementations • CVPR 2022 • Namyup Kim, Dongwon Kim, Cuiling Lan, Wenjun Zeng, Suha Kwak

Most existing methods for this task rely heavily on convolutional neural networks, which, however, have trouble capturing long-range dependencies between entities in the language expression and are not flexible enough for modeling interactions between the two different modalities.

Ranked #12 on Referring Expression Segmentation on RefCoCo val

Image Segmentation Referring Expression Segmentation +2

ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation

no code implementations • ICCV 2023 • Liang Xu, Ziyang Song, Dongliang Wang, Jing Su, Zhicheng Fang, Chenjing Ding, Weihao Gan, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng, Wei Wu

We present a GAN-based Transformer for general action-conditioned 3D human motion generation, including not only single-person actions but also multi-person interactive actions.

Correlation-Aware Deep Tracking

1 code implementation • CVPR 2022 • Fei Xie, Chunyu Wang, Guangting Wang, Yue Cao, Wankou Yang, Wenjun Zeng

In contrast to the Siamese-like feature extraction, our network deeply embeds cross-image feature correlation in multiple layers of the feature network.

Feature Correlation Visual Object Tracking

Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph

2 code implementations • ICLR 2022 • Dacheng Yin, Xuanchi Ren, Chong Luo, Yuwang Wang, Zhiwei Xiong, Wenjun Zeng

Last, an innovative link attention module serves as the decoder to reconstruct data from the decomposed content and style, with the help of the linking keys.

Decoder Quantization +2

When Shift Operation Meets Vision Transformer: An Extremely Simple Alternative to Attention Mechanism

2 code implementations • 26 Jan 2022 • Guangting Wang, Yucheng Zhao, Chuanxin Tang, Chong Luo, Wenjun Zeng

It can even be replaced by a zero-parameter operation.

Ranked #67 on Object Detection on COCO minival (APM metric)

Image Classification Object Detection +1

2,650

Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation

no code implementations • CVPR 2022 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Zheng-Jun Zha

In this paper, to address more practical scenarios, we propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.

Domain Adaptive Person Re-Identification Knowledge Distillation +4

SelectAugment: Hierarchical Deterministic Sample Selection for Data Augmentation

no code implementations • 6 Dec 2021 • Shiqi Lin, Zhizheng Zhang, Xin Li, Wenjun Zeng, Zhibo Chen

Data augmentation (DA) has been widely investigated to facilitate model optimization in many tasks.

Data Augmentation Fine-Grained Image Recognition +3

Learning Tracking Representations via Dual-Branch Fully Transformer Networks

1 code implementation • 5 Dec 2021 • Fei Xie, Chunyu Wang, Guangting Wang, Wankou Yang, Wenjun Zeng

We present a Siamese-like Dual-branch network based on solely Transformers for tracking.

Object Tracking

Confounder Identification-free Causal Visual Feature Learning

no code implementations • 26 Nov 2021 • Xin Li, Zhizheng Zhang, Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Xin Jin, Zhibo Chen

In this paper, we propose a novel Confounder Identification-free Causal Visual Feature Learning (CICF) method, which obviates the need for identifying confounders.

Domain Generalization Meta-Learning

Multi-Scale Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition

no code implementations • 7 Nov 2021 • Pengfei Zhang, Cuiling Lan, Wenjun Zeng, Junliang Xing, Jianru Xue, Nanning Zheng

Skeleton data is of low dimension.

Action Recognition Skeleton Based Action Recognition +1

Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition

no code implementations • 28 Oct 2021 • Liang Xu, Cuiling Lan, Wenjun Zeng, Cewu Lu

Skeleton data carries valuable motion information and is widely explored in human action recognition.

Action Recognition Object +2

WEDGE: Web-Image Assisted Domain Generalization for Semantic Segmentation

no code implementations • 29 Sep 2021 • Namyup Kim, Taeyoung Son, Jaehyun Pahk, Cuiling Lan, Wenjun Zeng, Suha Kwak

We also present a method which injects styles of the web-crawled images into training images on-the-fly during training, which enables the network to experience images of diverse styles with reliable labels for effective training.

Domain Generalization Segmentation +1

Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?

2 code implementations • 12 Sep 2021 • Chuanxin Tang, Yucheng Zhao, Guangting Wang, Chong Luo, Wenxuan Xie, Wenjun Zeng

Specifically, we replace the MLP module in the token-mixing step with a novel sparse MLP (sMLP) module.

Ranked #394 on Image Classification on ImageNet

Image Classification

192

Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

1 code implementation • 12 Sep 2021 • Chuanxin Tang, Chong Luo, Zhiyuan Zhao, Dacheng Yin, Yucheng Zhao, Wenjun Zeng

Given a piece of speech and its transcript text, text-based speech editing aims to generate speech that can be seamlessly inserted into the given speech by editing the transcript.

Decoder Voice Conversion

A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP

1 code implementation • 30 Aug 2021 • Yucheng Zhao, Guangting Wang, Chuanxin Tang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha

Convolutional neural networks (CNN) are the dominant deep neural network (DNN) architecture for computer vision.

192

Self-Supervised Visual Representations Learning by Contrastive Mask Prediction

no code implementations • ICCV 2021 • Yucheng Zhao, Guangting Wang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha

In this paper, we propose a novel contrastive mask prediction (CMP) task for visual representation learning and design a mask contrast (MaskCo) framework to implement the idea.

Representation Learning Self-Supervised Learning

VoxelTrack: Multi-Person 3D Human Pose Estimation and Tracking in the Wild

no code implementations • 5 Aug 2021 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenyu Liu, Wenjun Zeng

We estimate 3D poses from the voxel representation by predicting whether each voxel contains a particular body joint.

Ranked #7 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)

3D Multi-Person Pose Estimation 3D Pose Estimation

Pose-Guided Feature Learning with Knowledge Distillation for Occluded Person Re-Identification

no code implementations • 31 Jul 2021 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Jiawei Liu, Zhizheng Zhang, Zheng-Jun Zha

Occluded person re-identification (ReID) aims to match person images with occlusion.

Knowledge Distillation Person Re-Identification

Markov Decision Process modeled with Bandits for Sequential Decision Making in Linear-flow

no code implementations • 1 Jul 2021 • Wenjun Zeng, Yi Liu

For marketing, we sometimes need to recommend content for multiple pages in sequence.

Decision Making Marketing +2

ToAlign: Task-oriented Alignment for Unsupervised Domain Adaptation

1 code implementation • NeurIPS 2021 • Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zhibo Chen

Unsupervised domain adaptive classification intends to improve the classification performance on the unlabeled target domain.

Unsupervised Domain Adaptation

PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning

2 code implementations • NeurIPS 2021 • Tao Yu, Cuiling Lan, Wenjun Zeng, Mingxiao Feng, Zhizheng Zhang, Zhibo Chen

In this work, we propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency for RL feature representation learning.

Ranked #1 on Continuous Control (100k environment steps) on DeepMind Finger Spin (Images)

Continuous Control (100k environment steps) Continuous Control (500k environment steps) +3

Understanding Mobile GUI: from Pixel-Words to Screen-Sentences

no code implementations • 25 May 2021 • Jingwen Fu, Xiaoyi Zhang, Yuwang Wang, Wenjun Zeng, Sam Yang, Grayson Hilliard

A dataset, RICO-PW, of screenshots with Pixel-Words annotations is built based on the public RICO dataset, which will be released to help to address the lack of high-quality training data in this area.

Retrieval Sentence

Unsupervised Visual Representation Learning by Tracking Patches in Video

1 code implementation • CVPR 2021 • Guangting Wang, Yizhou Zhou, Chong Luo, Wenxuan Xie, Wenjun Zeng, Zhiwei Xiong

The proxy task is to estimate the position and size of the image patch in a sequence of video frames, given only the target bounding box in the first frame.

Action Classification Action Recognition +1

S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation

4 code implementations • CVPR 2021 • Xiaotian Chen, Yuwang Wang, Xuejin Chen, Wenjun Zeng

S2R-DepthNet consists of: a) a Structure Extraction (STE) module, which extracts a domain-invariant structural representation from an image by disentangling the image into domain-invariant structure and domain-specific style components, b) a Depth-specific Attention (DSA) module, which learns task-specific knowledge to suppress depth-irrelevant structures for better depth estimation and generalization, and c) a depth prediction (DP) module, which predicts depth from the depth-specific representation.

Depth Prediction Domain Generalization +2

166

Disentanglement-based Cross-Domain Feature Augmentation for Effective Unsupervised Domain Adaptive Person Re-identification

no code implementations • 25 Mar 2021 • Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Quanzeng You, Zicheng Liu, Kecheng Zheng, Zhibo Chen

Each recomposed feature, obtained based on the domain-invariant feature (which enables a reliable inheritance of identity) and an enhancement from a domain-specific feature (which enables the approximation of real distributions), is thus an "ideal" augmentation.

Disentanglement Domain Adaptive Person Re-Identification +2

MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation

1 code implementation • CVPR 2021 • Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhibo Chen

For unsupervised domain adaptation (UDA), to alleviate the effect of domain shift, many approaches align the source and target domains in the feature space by adversarial learning or by explicitly aligning their statistics.

Classification General Classification +5

Re-energizing Domain Discriminator with Sample Relabeling for Adversarial Domain Adaptation

no code implementations • ICCV 2021 • Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen

Many unsupervised domain adaptation (UDA) methods exploit domain adversarial training to align the features and reduce the domain gap, where a feature extractor is trained to fool a domain discriminator in order to have aligned feature distributions.

Unsupervised Domain Adaptation

Generalizing to Unseen Domains: A Survey on Domain Generalization

1 code implementation • 2 Mar 2021 • Jindong Wang, Cuiling Lan, Chang Liu, Yidong Ouyang, Tao Qin, Wang Lu, Yiqiang Chen, Wenjun Zeng, Philip S. Yu

Domain generalization deals with a challenging setting where one or several different but related domain(s) are given, and the goal is to learn a model that can generalize to an unseen test domain.

Domain Generalization Out-of-Distribution Generalization +1

12,910

Rethinking Content and Style: Exploring Bias for Unsupervised Disentanglement

1 code implementation • 21 Feb 2021 • Xuanchi Ren, Tao Yang, Yuwang Wang, Wenjun Zeng

From the unsupervised disentanglement perspective, we rethink content and style and propose a formulation for unsupervised C-S disentanglement based on our assumption that different factors are of different importance and popularity for image reconstruction, which serves as a data bias.

3D Reconstruction Disentanglement +4

Learning Disentangled Representation by Exploiting Pretrained Generative Models: A Contrastive Learning View

2 code implementations • ICLR 2022 • Xuanchi Ren, Tao Yang, Yuwang Wang, Wenjun Zeng

Based on this observation, we argue that it is possible to mitigate the trade-off by (i) leveraging the pretrained generative models with high generation quality, and (ii) focusing on discovering the traversal directions as factors for disentangled representation learning.

Contrastive Learning Disentanglement

131

Towards Building A Group-based Unsupervised Representation Disentanglement Framework

1 code implementation • ICLR 2022 • Tao Yang, Xuanchi Ren, Yuwang Wang, Wenjun Zeng, Nanning Zheng

We then propose a model, based on existing VAE-based methods, to tackle the unsupervised learning problem of the framework.

Disentanglement

AttributeNet: Attribute Enhanced Vehicle Re-Identification

no code implementations • 7 Feb 2021 • Rodolfo Quispe, Cuiling Lan, Wenjun Zeng, Helio Pedrini

Vehicle Re-Identification (V-ReID) is a critical task that associates the same vehicle across images from different camera viewpoints.

Ranked #1 on Vehicle Re-Identification on VeRi-Wild Large

Attribute Vehicle Re-Identification

General-Purpose Speech Representation Learning through a Self-Supervised Multi-Granularity Framework

no code implementations • 3 Feb 2021 • Yucheng Zhao, Dacheng Yin, Chong Luo, Zhiyuan Zhao, Chuanxin Tang, Wenjun Zeng, Zheng-Jun Zha

This paper presents a self-supervised learning framework, named MGF, for general-purpose speech representation learning.

Classification Emotion Classification +6

VAE^2: Preventing Posterior Collapse of Variational Video Predictions in the Wild

no code implementations • 28 Jan 2021 • Yizhou Zhou, Chong Luo, Xiaoyan Sun, Zheng-Jun Zha, Wenjun Zeng

We believe that VAE^2 is also applicable to other stochastic sequence prediction problems where training data lack stochasticity.

Video Prediction

Style Normalization and Restitution for Domain Generalization and Adaptation

1 code implementation • 3 Jan 2021 • Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen

In this paper, we design a novel Style Normalization and Restitution module (SNR) to simultaneously ensure both high generalization and discrimination capability of the networks.

Disentanglement Domain Generalization +4

Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification

1 code implementation • 16 Dec 2020 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zheng-Jun Zha

Based on this finding, we propose to exploit the uncertainty (measured by consistency levels) to evaluate the reliability of the pseudo-label of a sample and incorporate the uncertainty to re-weight its contribution within various ReID losses, including the identity (ID) classification loss per sample, the triplet loss, and the contrastive loss.

Clustering Domain Adaptive Person Re-Identification +3

141

An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation

1 code implementation • ICCV 2021 • Rongchang Xie, Chunyu Wang, Wenjun Zeng, Yizhou Wang

The state-of-the-art methods are consistency-based: they learn about unlabeled images by encouraging the model to give consistent predictions for images under different augmentations.

Pose Estimation Semi-Supervised Human Pose Estimation

Re-identification = Retrieval + Verification: Back to Essence and Forward with a New Metric

1 code implementation • 23 Nov 2020 • Zheng Wang, Xin Yuan, Toshihiko Yamasaki, Yutian Lin, Xin Xu, Wenjun Zeng

In essence, current re-ID overemphasizes the importance of retrieval but underemphasizes that of verification, i.e., all returned images are considered as the target.

Image Retrieval Retrieval

108

AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild

2 code implementations • 26 Oct 2020 • Zhe Zhang, Chunyu Wang, Weichao Qiu, Wenhu Qin, Wenjun Zeng

To make the task truly unconstrained, we present AdaFuse, an adaptive multiview fusion method, which can enhance the features in occluded views by leveraging those in visible views.

Ranked #1 on 3D Human Pose Estimation on Total Capture

3D Human Pose Estimation

Uncertainty-Aware Few-Shot Image Classification

no code implementations • 9 Oct 2020 • Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Zhibo Chen, Shih-Fu Chang

In this work, we propose an Uncertainty-Aware Few-Shot framework for image classification by modeling the uncertainty of the similarities of query-support pairs and performing uncertainty-aware optimization.

Classification Few-Shot Image Classification +3

FPCR-Net: Feature Pyramidal Correlation and Residual Reconstruction for Optical Flow Estimation

no code implementations • 17 Jan 2020 • Xiaolin Song, Yuyang Zhao, Jingyu Yang, Cuiling Lan, Wenjun Zeng

To exploit such flexible and comprehensive information, we propose a semi-supervised Feature Pyramidal Correlation and Residual Reconstruction Network (FPCR-Net) for optical flow estimation from frame pairs.

Optical Flow Estimation
