no code implementations • 18 Mar 2024 • Liang Xu, Yizhou Zhou, Yichao Yan, Xin Jin, Wenhan Zhu, Fengyun Rao, Xiaokang Yang, Wenjun Zeng
Humans constantly interact with their surrounding environments.
no code implementations • 4 Feb 2024 • Xin Jin, Bohan Li, Baao Xie, Wenyao Zhang, Jinming Liu, Ziqiang Li, Tao Yang, Wenjun Zeng
Representation disentanglement may help AI fundamentally understand the real world and thus benefit both discrimination and generation tasks.
1 code implementation • 23 Jan 2024 • Fei Xie, Wankou Yang, Chunyu Wang, Lei Chu, Yue Cao, Chao Ma, Wenjun Zeng
Thus, we reformulate the two-branch Siamese tracking as a conceptually simple, fully transformer-based Single-Branch Tracking pipeline, dubbed SBT.
no code implementations • 26 Dec 2023 • Liang Xu, Xintao Lv, Yichao Yan, Xin Jin, Shuwen Wu, Congsheng Xu, Yifan Liu, Yizhou Zhou, Fengyun Rao, Xingdong Sheng, Yunhui Liu, Wenjun Zeng, Xiaokang Yang
We also equip Inter-X with versatile annotations of more than 34K fine-grained human part-level textual descriptions, semantic interaction categories, interaction order, and the relationship and personality of the subjects.
2 code implementations • 28 Sep 2023 • Mingqi Yuan, Zequn Zhang, Yang Xu, Shihao Luo, Bo Li, Xin Jin, Wenjun Zeng
We present RLLTE: a long-term evolution, extremely modular, and open-source framework for reinforcement learning (RL) research and application.
1 code implementation • 18 Aug 2023 • Xin Li, Yulin Ren, Xin Jin, Cuiling Lan, Xingrui Wang, Wenjun Zeng, Xinchao Wang, Zhibo Chen
Image restoration (IR) has been an indispensable and challenging task in the low-level vision field, which strives to improve the subjective quality of images distorted by various forms of degradation.
no code implementations • 22 Jun 2023 • Bohan Li, Yasheng Sun, Jingxin Dong, Zheng Zhu, Jinming Liu, Xin Jin, Wenjun Zeng
Numerous studies have investigated the pivotal role of reliable 3D volume representation in scene perception tasks, such as multi-view stereo (MVS) and semantic scene completion (SSC).
no code implementations • 24 May 2023 • Qi Wang, Junming Yang, Yunbo Wang, Xin Jin, Wenjun Zeng, Xiaokang Yang
Training offline reinforcement learning (RL) models using visual inputs poses two significant challenges, i.e., the overfitting problem in representation learning and the overestimation bias for expected future rewards.
1 code implementation • ICCV 2023 • Baao Xie, Bohan Li, Zequn Zhang, Junting Dong, Xin Jin, Jingyu Yang, Wenjun Zeng
They are complementary: the outer navigation identifies global-view semantic directions, and the inner refinement is dedicated to fine-grained attributes.
no code implementations • 13 Apr 2023 • Letian Wu, Wenyao Zhang, Tengping Jiang, Wankou Yang, Xin Jin, Wenjun Zeng
Based on that, we build upon the CLIP model as a backbone which we extend with a One-Way [CLS] token navigation from text to the visual branch that enables zero-shot dense prediction, dubbed ClsCLIP.
1 code implementation • 13 Apr 2023 • Tao Yu, Runseng Feng, Ruoyu Feng, Jinming Liu, Xin Jin, Wenjun Zeng, Zhibo Chen
We are also very willing to help everyone share and promote new projects based on our Inpaint Anything (IA).
1 code implementation • 24 Mar 2023 • Bohan Li, Yasheng Sun, Zhujin Liang, Dalong Du, Zhuanghui Zhang, XiaoFeng Wang, Yunnan Wang, Xin Jin, Wenjun Zeng
However, due to the inherent representation gap between stereo geometry and BEV features, it is non-trivial to bridge them for the dense prediction task of SSC.
1 code implementation • 26 Jan 2023 • Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng
We present AIRS: Automatic Intrinsic Reward Shaping that intelligently and adaptively provides high-quality intrinsic rewards to enhance exploration in reinforcement learning (RL).
no code implementations • 28 Nov 2022 • Mingqi Yuan, Xin Jin, Bo Li, Wenjun Zeng
We present MEM: Multi-view Exploration Maximization for tackling complex visual control tasks.
no code implementations • 19 Sep 2022 • Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng
Exploration is critical for deep reinforcement learning in complex environments with high-dimensional observations and sparse rewards.
no code implementations • 8 Sep 2022 • Ruofeng Wen, Wenjun Zeng, Yi Liu
Routing contacts to eligible SMEs turns out to be a non-trivial problem because SMEs' domain eligibility is subject to training quality and can change over time.
no code implementations • 7 Aug 2022 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu
To address the problem, we present an efficient approach to compute a marginal probability for each pair of objects in real time.
1 code implementation • 20 Jul 2022 • Jiajun Su, Chunyu Wang, Xiaoxuan Ma, Wenjun Zeng, Yizhou Wang
While monocular 3D pose estimation seems to have achieved very accurate results on the public datasets, the generalization ability of these methods is largely overlooked.
3D Multi-Person Pose Estimation (absolute), 3D Pose Estimation
no code implementations • CVPR 2022 • Namyup Kim, Dongwon Kim, Cuiling Lan, Wenjun Zeng, Suha Kwak
Most existing methods for this task rely heavily on convolutional neural networks, which, however, have trouble capturing long-range dependencies between entities in the language expression and are not flexible enough for modeling interactions between the two different modalities.
Ranked #12 on Referring Expression Segmentation on RefCoCo val
no code implementations • ICCV 2023 • Liang Xu, Ziyang Song, Dongliang Wang, Jing Su, Zhicheng Fang, Chenjing Ding, Weihao Gan, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng, Wei Wu
We present a GAN-based Transformer for general action-conditioned 3D human motion generation, including not only single-person actions but also multi-person interactive actions.
1 code implementation • CVPR 2022 • Fei Xie, Chunyu Wang, Guangting Wang, Yue Cao, Wankou Yang, Wenjun Zeng
In contrast to the Siamese-like feature extraction, our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
2 code implementations • ICLR 2022 • Dacheng Yin, Xuanchi Ren, Chong Luo, Yuwang Wang, Zhiwei Xiong, Wenjun Zeng
Last, an innovative link attention module serves as the decoder to reconstruct data from the decomposed content and style, with the help of the linking keys.
2 code implementations • 26 Jan 2022 • Guangting Wang, Yucheng Zhao, Chuanxin Tang, Chong Luo, Wenjun Zeng
It can even be replaced by a zero-parameter operation.
Ranked #67 on Object Detection on COCO minival (APM metric)
no code implementations • CVPR 2022 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Zheng-Jun Zha
In this paper, to address more practical scenarios, we propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.
Domain Adaptive Person Re-Identification, Knowledge Distillation
no code implementations • 6 Dec 2021 • Shiqi Lin, Zhizheng Zhang, Xin Li, Wenjun Zeng, Zhibo Chen
Data augmentation (DA) has been widely investigated to facilitate model optimization in many tasks.
1 code implementation • 5 Dec 2021 • Fei Xie, Chunyu Wang, Guangting Wang, Wankou Yang, Wenjun Zeng
We present a Siamese-like Dual-branch network based solely on Transformers for tracking.
no code implementations • 26 Nov 2021 • Xin Li, Zhizheng Zhang, Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Xin Jin, Zhibo Chen
In this paper, we propose a novel Confounder Identification-free Causal Visual Feature Learning (CICF) method, which obviates the need for identifying confounders.
no code implementations • 7 Nov 2021 • Pengfei Zhang, Cuiling Lan, Wenjun Zeng, Junliang Xing, Jianru Xue, Nanning Zheng
Skeleton data is of low dimension.
no code implementations • 28 Oct 2021 • Liang Xu, Cuiling Lan, Wenjun Zeng, Cewu Lu
Skeleton data carries valuable motion information and is widely explored in human action recognition.
no code implementations • 29 Sep 2021 • Namyup Kim, Taeyoung Son, Jaehyun Pahk, Cuiling Lan, Wenjun Zeng, Suha Kwak
We also present a method which injects styles of the web-crawled images into training images on-the-fly during training, which enables the network to experience images of diverse styles with reliable labels for effective training.
2 code implementations • 12 Sep 2021 • Chuanxin Tang, Yucheng Zhao, Guangting Wang, Chong Luo, Wenxuan Xie, Wenjun Zeng
Specifically, we replace the MLP module in the token-mixing step with a novel sparse MLP (sMLP) module.
Ranked #394 on Image Classification on ImageNet
1 code implementation • 12 Sep 2021 • Chuanxin Tang, Chong Luo, Zhiyuan Zhao, Dacheng Yin, Yucheng Zhao, Wenjun Zeng
Given a piece of speech and its transcript text, text-based speech editing aims to generate speech that can be seamlessly inserted into the given speech by editing the transcript.
1 code implementation • 30 Aug 2021 • Yucheng Zhao, Guangting Wang, Chuanxin Tang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha
Convolutional neural networks (CNN) are the dominant deep neural network (DNN) architecture for computer vision.
no code implementations • ICCV 2021 • Yucheng Zhao, Guangting Wang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha
In this paper, we propose a novel contrastive mask prediction (CMP) task for visual representation learning and design a mask contrast (MaskCo) framework to implement the idea.
no code implementations • 5 Aug 2021 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenyu Liu, Wenjun Zeng
We estimate 3D poses from the voxel representation by predicting whether each voxel contains a particular body joint.
Ranked #7 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)
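The voxel-based readout described in the entry above can be illustrated with a minimal sketch. It assumes the network has already produced a per-voxel probability volume for one joint, and it decodes the joint position by taking the centre of the highest-scoring voxel. The argmax decoding rule and the names `decode_joint`, `origin`, and `voxel_size` are hypothetical choices for illustration and are not confirmed by the listing as the authors' method.

```python
from typing import List, Tuple

# Probability volume for a single joint, indexed as volume[x][y][z].
Volume = List[List[List[float]]]


def decode_joint(
    volume: Volume,
    origin: Tuple[float, float, float],
    voxel_size: float,
) -> Tuple[float, float, float]:
    """Return the world-space centre of the most probable voxel.

    Assumption: argmax decoding; a real system might use a soft
    (expectation-based) readout instead.
    """
    best_score = float("-inf")
    best_index = (0, 0, 0)
    for x, plane in enumerate(volume):
        for y, row in enumerate(plane):
            for z, score in enumerate(row):
                if score > best_score:
                    best_score = score
                    best_index = (x, y, z)
    # Shift by half a voxel so the result is the voxel centre, not its corner.
    return tuple(o + (i + 0.5) * voxel_size for o, i in zip(origin, best_index))
```

Calling `decode_joint` once per joint volume would yield one 3D point per body joint; how multiple people are separated is outside the scope of this sketch.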
no code implementations • 31 Jul 2021 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Jiawei Liu, Zhizheng Zhang, Zheng-Jun Zha
Occluded person re-identification (ReID) aims to match person images with occlusion.
no code implementations • 1 Jul 2021 • Wenjun Zeng, Yi Liu
For marketing, we sometimes need to recommend content for multiple pages in sequence.
1 code implementation • NeurIPS 2021 • Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zhibo Chen
Unsupervised domain adaptive classification intends to improve the classification performance on the unlabeled target domain.
2 code implementations • NeurIPS 2021 • Tao Yu, Cuiling Lan, Wenjun Zeng, Mingxiao Feng, Zhizheng Zhang, Zhibo Chen
In this work, we propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency for RL feature representation learning.
Continuous Control (100k environment steps), Continuous Control (500k environment steps)
no code implementations • 25 May 2021 • Jingwen Fu, Xiaoyi Zhang, Yuwang Wang, Wenjun Zeng, Sam Yang, Grayson Hilliard
A dataset, RICO-PW, of screenshots with Pixel-Words annotations is built based on the public RICO dataset, which will be released to help address the lack of high-quality training data in this area.
1 code implementation • CVPR 2021 • Guangting Wang, Yizhou Zhou, Chong Luo, Wenxuan Xie, Wenjun Zeng, Zhiwei Xiong
The proxy task is to estimate the position and size of the image patch in a sequence of video frames, given only the target bounding box in the first frame.
4 code implementations • CVPR 2021 • Xiaotian Chen, Yuwang Wang, Xuejin Chen, Wenjun Zeng
S2R-DepthNet consists of: a) a Structure Extraction (STE) module which extracts a domain-invariant structural representation from an image by disentangling the image into domain-invariant structure and domain-specific style components, b) a Depth-specific Attention (DSA) module, which learns task-specific knowledge to suppress depth-irrelevant structures for better depth estimation and generalization, and c) a depth prediction module (DP) to predict depth from the depth-specific representation.
no code implementations • 25 Mar 2021 • Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Quanzeng You, Zicheng Liu, Kecheng Zheng, Zhibo Chen
Each recomposed feature, obtained based on the domain-invariant feature (which enables a reliable inheritance of identity) and an enhancement from a domain-specific feature (which enables the approximation of real distributions), is thus an "ideal" augmentation.
1 code implementation • CVPR 2021 • Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhibo Chen
For unsupervised domain adaptation (UDA), to alleviate the effect of domain shift, many approaches align the source and target domains in the feature space by adversarial learning or by explicitly aligning their statistics.
no code implementations • ICCV 2021 • Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen
Many unsupervised domain adaptation (UDA) methods exploit domain adversarial training to align the features to reduce domain gap, where a feature extractor is trained to fool a domain discriminator in order to have aligned feature distributions.
1 code implementation • 2 Mar 2021 • Jindong Wang, Cuiling Lan, Chang Liu, Yidong Ouyang, Tao Qin, Wang Lu, Yiqiang Chen, Wenjun Zeng, Philip S. Yu
Domain generalization deals with a challenging setting where one or several different but related domain(s) are given, and the goal is to learn a model that can generalize to an unseen test domain.
1 code implementation • 21 Feb 2021 • Xuanchi Ren, Tao Yang, Yuwang Wang, Wenjun Zeng
From the unsupervised disentanglement perspective, we rethink content and style and propose a formulation for unsupervised C-S disentanglement based on our assumption that different factors are of different importance and popularity for image reconstruction, which serves as a data bias.
2 code implementations • ICLR 2022 • Xuanchi Ren, Tao Yang, Yuwang Wang, Wenjun Zeng
Based on this observation, we argue that it is possible to mitigate the trade-off by $(i)$ leveraging the pretrained generative models with high generation quality, $(ii)$ focusing on discovering the traversal directions as factors for disentangled representation learning.
1 code implementation • ICLR 2022 • Tao Yang, Xuanchi Ren, Yuwang Wang, Wenjun Zeng, Nanning Zheng
We then propose a model, based on existing VAE-based methods, to tackle the unsupervised learning problem of the framework.
no code implementations • 7 Feb 2021 • Rodolfo Quispe, Cuiling Lan, Wenjun Zeng, Helio Pedrini
Vehicle Re-Identification (V-ReID) is a critical task that associates the same vehicle across images from different camera viewpoints.
Ranked #1 on Vehicle Re-Identification on VeRi-Wild Large
no code implementations • 3 Feb 2021 • Yucheng Zhao, Dacheng Yin, Chong Luo, Zhiyuan Zhao, Chuanxin Tang, Wenjun Zeng, Zheng-Jun Zha
This paper presents a self-supervised learning framework, named MGF, for general-purpose speech representation learning.
no code implementations • 28 Jan 2021 • Yizhou Zhou, Chong Luo, Xiaoyan Sun, Zheng-Jun Zha, Wenjun Zeng
We believe that VAE$^2$ is also applicable to other stochastic sequence prediction problems where the training data lack stochasticity.
1 code implementation • 3 Jan 2021 • Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen
In this paper, we design a novel Style Normalization and Restitution module (SNR) to simultaneously ensure both high generalization and discrimination capability of the networks.
1 code implementation • 16 Dec 2020 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zheng-Jun Zha
Based on this finding, we propose to exploit the uncertainty (measured by consistency levels) to evaluate the reliability of the pseudo-label of a sample and incorporate the uncertainty to re-weight its contribution within various ReID losses, including the identity (ID) classification loss per sample, the triplet loss, and the contrastive loss.
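The re-weighting idea in the entry above can be sketched for the per-sample identity classification loss. This is only an illustration under stated assumptions: the mapping from uncertainty to weight (`exp(-u)`), the normalisation by the weight sum, and the function names are hypothetical, since the listing does not specify how the uncertainty enters each loss.

```python
import math
from typing import List, Sequence


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """Negative log-likelihood of the pseudo-label under predicted probabilities."""
    return -math.log(probs[label])


def reweighted_id_loss(
    batch_probs: Sequence[Sequence[float]],
    pseudo_labels: Sequence[int],
    uncertainties: Sequence[float],
) -> float:
    """Average the per-sample loss, down-weighting uncertain pseudo-labels.

    Assumption: weight = exp(-uncertainty), so zero uncertainty gives
    weight 1 and larger uncertainty shrinks the sample's contribution.
    """
    weights: List[float] = [math.exp(-u) for u in uncertainties]
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, pseudo_labels)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)
```

With all uncertainties set to zero the function reduces to the ordinary mean cross-entropy, which makes it easy to compare against an unweighted baseline.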
1 code implementation • ICCV 2021 • Rongchang Xie, Chunyu Wang, Wenjun Zeng, Yizhou Wang
The state-of-the-art methods are consistency-based: they learn about unlabeled images by encouraging the model to give consistent predictions for images under different augmentations.
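A consistency objective of the kind mentioned in the entry above can be sketched as a mean squared difference between the model's predictions for two augmented views of the same unlabeled image. The choice of squared error, the flattened-list representation of a prediction, and the name `consistency_loss` are assumptions for illustration; the listing does not state which distance the methods use.

```python
from typing import Sequence


def consistency_loss(pred_a: Sequence[float], pred_b: Sequence[float]) -> float:
    """Mean squared difference between two predictions for the same image.

    pred_a and pred_b are flattened model outputs (e.g. heatmap values)
    obtained under two different augmentations. Assumption: both have been
    mapped back to a common coordinate frame before comparison.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must have the same length")
    if not pred_a:
        raise ValueError("predictions must be non-empty")
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)
```

The loss is zero when the two predictions agree exactly and grows as they diverge, so minimising it on unlabeled images pushes the model toward augmentation-invariant outputs.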
1 code implementation • 23 Nov 2020 • Zheng Wang, Xin Yuan, Toshihiko Yamasaki, Yutian Lin, Xin Xu, Wenjun Zeng
In essence, current re-ID overemphasizes the importance of retrieval but underemphasizes that of verification, i.e., all returned images are considered as the target.
2 code implementations • 26 Oct 2020 • Zhe Zhang, Chunyu Wang, Weichao Qiu, Wenhu Qin, Wenjun Zeng
To make the task truly unconstrained, we present AdaFuse, an adaptive multiview fusion method, which can enhance the features in occluded views by leveraging those in visible views.
Ranked #1 on 3D Human Pose Estimation on Total Capture
no code implementations • 9 Oct 2020 • Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Zhibo Chen, Shih-Fu Chang
In this work, we propose an Uncertainty-Aware Few-Shot framework for image classification by modeling the uncertainty of the similarities of query-support pairs and performing uncertainty-aware optimization.
no code implementations • 17 Jan 2020 • Xiaolin Song, Yuyang Zhao, Jingyu Yang, Cuiling Lan, Wenjun Zeng
To exploit such flexible and comprehensive information, we propose a semi-supervised Feature Pyramidal Correlation and Residual Reconstruction Network (FPCR-Net) for optical flow estimation from frame pairs.