no code implementations • 9 May 2024 • Zixuan Tang, Youjun Zhao, Yuhang Wen, Mengyuan Liu
Action recognition is a key technology in building interactive metaverses.
no code implementations • 9 May 2024 • Sheng Yan, Xin Du, Zongying Li, Yi Wang, Hongcang Jin, Mengyuan Liu
Temporal grounding is crucial in multimodal learning, but it poses challenges when applied to animal behavior data due to the sparsity and uniform distribution of moments.
1 code implementation • 2 May 2024 • Wenshuo Chen, Hongru Xiao, Erhang Zhang, Lijie Hu, Lei Wang, Mengyuan Liu, Chen Chen
We present a methodology for constructing an SATO that satisfies the stability of attention and prediction.
1 code implementation • 30 Apr 2024 • Yue Li, Baiqiao Yin, Jinfu Liu, Jiajun Wen, Jiaying Lin, Mengyuan Liu
In recent years, Event Sound Source Localization has been widely applied in various fields.
1 code implementation • 25 Apr 2024 • Jiaying Lin, Jiajun Wen, Mengyuan Liu, Jinfu Liu, Baiqiao Yin, Yue Li
Spatiotemporal action localization in chaotic scenes is a challenging task toward advanced video understanding.
1 code implementation • 24 Apr 2024 • Jinfu Liu, Baiqiao Yin, Jiaying Lin, Jiajun Wen, Yue Li, Mengyuan Liu
Skeleton-based action recognition has gained considerable traction thanks to its utilization of succinct and robust skeletal representations.
1 code implementation • 21 Apr 2024 • Sheng Yan, Mengyuan Liu, Yong Wang, Yang Liu, Chen Chen, Hong Liu
In this paper, we address the unexplored question of temporal sentence localization in human motions (TSLM), aiming to locate a target moment from a 3D human motion that semantically corresponds to a text query.
1 code implementation • 18 Apr 2024 • Mengyuan Liu, Zhongbin Fang, Xia Li, Joachim M. Buhmann, Xiangtai Li, Chen Change Loy
With the emergence of large-scale models trained on diverse datasets, in-context learning has emerged as a promising paradigm for multitasking, notably in natural language processing and image processing.
1 code implementation • 17 Apr 2024 • Zhichao Deng, Xiangtai Li, Xia Li, Yunhai Tong, Shen Zhao, Mengyuan Liu
By transferring the knowledge of the VLM to the 4D encoder and combining the VLM, our VG4D achieves improved recognition performance.
no code implementations • 13 Mar 2024 • Peini Guo, Mengyuan Liu, Hong Liu, Ruijia Fan, Guoquan Wang, Bin He
In addition, a Multi-scale Constraint Block (MCB) is designed, which extracts fine-grained identity-related features and effectively transfers cloth-irrelevant knowledge.
no code implementations • 4 Feb 2024 • Ti Wang, Mengyuan Liu, Hong Liu, Bin Ren, Yingxuan You, Wenhao Li, Nicu Sebe, Xia Li
We observe that previous optimization-based methods commonly rely on projection constraint, which only ensures alignment in 2D space, potentially leading to the overfitting problem.
no code implementations • 4 Feb 2024 • Mengyuan Liu, Chen Chen, Songtao Wu, Fanyang Meng, Hong Liu
Recognizing interactive actions, including hand-to-hand interaction and human-to-human interaction, has attracted increasing attention for various applications in the field of video analysis and human-robot interaction.
no code implementations • 4 Feb 2024 • Bin Ren, Yawei Li, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Rita Cucchiara, Luc van Gool, Nicu Sebe
While it is crucial to capture global information for effective image restoration (IR), integrating such cues into transformer-based methods becomes computationally expensive, especially with high input resolution.
1 code implementation • 18 Jan 2024 • Xuan Wang, Mengyuan Liu
Recent advances in single-image 3D face reconstruction have shown remarkable progress in various applications.
1 code implementation • 16 Jan 2024 • Zhongbin Fang, Xia Li, Xiangtai Li, Shen Zhao, Mengyuan Liu
Through extensive experiments, we demonstrate that our PointMLS achieves state-of-the-art results on ModelNet-O and competitive results on regular datasets, and it is robust and effective.
1 code implementation • 10 Jan 2024 • Hongbo Kang, Yong Wang, Mengyuan Liu, Doudou Wu, Peng Liu, Xinlin Yuan, Wenming Yang
To address these two challenges, we propose a diffusion-based refinement framework called DRPose, which refines the output of deterministic models by reverse diffusion and achieves more suitable multi-hypothesis prediction for the current pose benchmark by multi-step refinement with multiple noises.
1 code implementation • CAAI Transactions on Intelligence Technology 2023 • Jinfu Liu, Runwei Ding, Yuhang Wen, Nan Dai, Fanyang Meng, Shen Zhao, Mengyuan Liu
Multimodal-based action recognition methods have achieved high success using pose and RGB modalities.
Ranked #4 on Action Recognition on NTU RGB+D 120
no code implementations • 4 Jan 2024 • Yiheng Liu, Hao He, Tianle Han, Xu Zhang, Mengyuan Liu, Jiaming Tian, Yutong Zhang, Jiaqi Wang, Xiaohui Gao, Tianyang Zhong, Yi Pan, Shaochen Xu, Zihao Wu, Zhengliang Liu, Xin Zhang, Shu Zhang, Xintao Hu, Tuo Zhang, Ning Qiang, Tianming Liu, Bao Ge
Low-cost training and deployment of LLMs represent the future development trend.
1 code implementation • 19 Dec 2023 • Pengxiang Ding, Qiongjie Cui, Min Zhang, Mengyuan Liu, Haofan Wang, Donglin Wang
Human motion forecasting, with the goal of estimating future human behavior over a period of time, is a fundamental task in many real-world applications.
1 code implementation • 19 Dec 2023 • Xinshun Wang, Qiongjie Cui, Chen Chen, Mengyuan Liu
The past few years have witnessed the dominance of Graph Convolutional Networks (GCNs) over human motion prediction. Various styles of graph convolutions have been proposed, with each one meticulously designed and incorporated into a carefully crafted network architecture.
1 code implementation • 6 Dec 2023 • Xinshun Wang, Zhongbin Fang, Xia Li, Xiangtai Li, Chen Chen, Mengyuan Liu
Under this setting, the model can perceive tasks from prompts and accomplish them without any extra task-specific head predictions or model fine-tuning.
no code implementations • 29 Nov 2023 • Xinshun Wang, Wanying Zhang, Can Wang, Yuan Gao, Mengyuan Liu
Graph Convolutional Networks (GCNs), which typically follow a neural message passing framework to model dependencies among skeletal joints, have achieved high success in the skeleton-based human motion prediction task.
1 code implementation • 23 Nov 2023 • Wanying Zhang, Shen Zhao, Fanyang Meng, Songtao Wu, Mengyuan Liu
With potential applications in fields including intelligent surveillance and human-robot interaction, the human motion prediction task has become a hot research topic and also has achieved high success, especially using the recent Graph Convolutional Network (GCN).
1 code implementation • 20 Nov 2023 • Wenhao Li, Mengyuan Liu, Hong Liu, Pichao Wang, Jialun Cai, Nicu Sebe
Transformers have been successfully applied in the field of video-based 3D human pose estimation.
1 code implementation • 25 Sep 2023 • Yang Liu, Chen Chen, Can Wang, Xulin King, Mengyuan Liu
The proposed method decouples functions between the decoder and the encoder by introducing a mask regressor, which predicts the masked patch representation from the visible patch representation encoded by the encoder; the decoder then reconstructs the target from the predicted masked patch representation.
Ranked #3 on Few-Shot 3D Point Cloud Classification on ModelNet40 10-way (20-shot) (using extra training data)
1 code implementation • ICCV 2023 • Lijun Li, Linrui Tian, Xindi Zhang, Qi Wang, Bang Zhang, Mengyuan Liu, Chen Chen
The current interacting hand (IH) datasets are relatively simplistic in terms of background and texture, with hand joints being annotated by a machine annotator, which may result in inaccuracies, and the diversity of pose distribution is limited.
1 code implementation • 10 Aug 2023 • Hongbo Kang, Yong Wang, Mengyuan Liu, Doudou Wu, Peng Liu, Wenming Yang
Notably, our model achieves state-of-the-art performance on all action categories in the Human3.6M dataset using detected 2D poses from CPN, and our code is available at: https://github.com/KHB1698/DC-GCT.
Ranked #44 on 3D Human Pose Estimation on Human3.6M
1 code implementation • 8 Aug 2023 • Yi Zhang, Youjun Zhao, Yuhang Wen, Zixuan Tang, Xinhua Xu, Mengyuan Liu
To solve this problem, this paper tries to formulate a new task called micro-expression generation and then presents a strong baseline which combines the first order motion model with facial prior knowledge.
no code implementations • 26 Jul 2023 • Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu
Existing Graph Convolutional Networks for human motion prediction largely adopt a one-step scheme, which outputs the prediction straight from the history input, failing to exploit human motion patterns.
1 code implementation • 16 Jul 2023 • Runwei Ding, Yuhang Wen, Jinfu Liu, Nan Dai, Fanyang Meng, Mengyuan Liu
We propose an Integrating Human Parsing and Pose Network (IPP-Net) for action recognition, which is the first to leverage both skeletons and human parsing feature maps in a dual-branch approach.
Ranked #7 on Action Recognition on NTU RGB+D 120
1 code implementation • 15 Jul 2023 • Tianyu Guo, Mengyuan Liu, Hong Liu, Wenhao Li, Jingwen Guo, Tao Wang, Yidi Li
Considering the instance-level discriminative ability, contrastive learning methods, including MoCo and SimCLR, have been adapted from the original image representation learning task to solve the self-supervised skeleton-based action recognition task.
1 code implementation • 14 Jul 2023 • Yuhang Wen, Zixuan Tang, Yunsheng Pang, Beichen Ding, Mengyuan Liu
To address these problems, we propose an Interactive Spatiotemporal Token Attention Network (ISTA-Net), which simultaneously models spatial, temporal, and interactive relations.
Human Interaction Recognition • Skeleton Based Action Recognition
1 code implementation • 25 Jun 2023 • Linhui Dai, Hong Liu, Pinhao Song, Mengyuan Liu
Firstly, a real-time UIE method is employed to generate enhanced images, which can improve the visibility of objects in low-contrast areas.
2 code implementations • NeurIPS 2023 • Zhongbin Fang, Xiangtai Li, Xia Li, Joachim M. Buhmann, Chen Change Loy, Mengyuan Liu
With the rise of large-scale models trained on broad data, in-context learning has become a new learning paradigm that has demonstrated significant potential in natural language processing and computer vision tasks.
1 code implementation • 7 May 2023 • Sheng Yan, Yang Liu, Haoqiang Wang, Xin Du, Mengyuan Liu, Hong Liu
On the latest HumanML3D dataset, we achieve a recall of 62.9% for motion retrieval and 71.5% for text retrieval (both based on R@10).
1 code implementation • 1 May 2023 • Yilei Hua, Wenhan Wu, Ce Zheng, Aidong Lu, Mengyuan Liu, Chen Chen, Shiqian Wu
This paper proposes an attention-based contrastive learning framework for skeleton representation learning, called SkeAttnCLR, which integrates local similarity and global features for skeleton-based action representations.
1 code implementation • IEEE Transactions on Multimedia 2023 • Jinfu Liu, Xinshun Wang, Can Wang, Yuan Gao, Mengyuan Liu
Then, channel-dependent and temporal-dependent adjacency matrices corresponding to different channels and frames are calculated to capture the spatiotemporal dependencies between skeleton joints.
no code implementations • 7 Apr 2023 • Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu
In recent years, Graph Convolutional Networks (GCNs) have been widely used in human motion prediction, but their performance remains unsatisfactory.
Ranked #3 on Human Pose Forecasting on Human3.6M
2 code implementations • CVPR 2023 • Qitao Zhao, Ce Zheng, Mengyuan Liu, Pichao Wang, Chen Chen
However, in real scenarios, the performance of PoseFormer and its follow-ups is limited by two factors: (a) The length of the input joint sequence; (b) The quality of 2D joint detection.
Ranked #8 on 3D Human Pose Estimation on MPI-INF-3DHP
no code implementations • 23 Mar 2023 • Ce Zheng, Xianpeng Liu, Mengyuan Liu, Tianfu Wu, Guo-Jun Qi, Chen Chen
While image-based HMR methods have achieved impressive results, they often struggle to recover humans in dynamic scenarios, leading to temporal inconsistencies and non-smooth 3D motion predictions due to the absence of human motion.
Ranked #56 on 3D Human Pose Estimation on 3DPW
no code implementations • 3 Mar 2023 • Tao Wang, Mengyuan Liu, Hong Liu, Wenhao Li, Miaoju Ban, Tuanyu Guo, Yidi Li
In this paper, different from most previous works that discard the occluded region, we propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.
no code implementations • 10 Jan 2023 • Mengyi Zhao, Mengyuan Liu, Bin Ren, Shuling Dai, Nicu Sebe
Diffusion-based generative models have recently emerged as powerful solutions for high-quality synthesis in multiple domains.
1 code implementation • 7 Dec 2021 • Tianyu Guo, Hong Liu, Zhan Chen, Mengyuan Liu, Tao Wang, Runwei Ding
In this paper, to make better use of the movement patterns introduced by extreme augmentations, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation (AimCLR) is proposed.
1 code implementation • 26 Mar 2021 • Wenhao Li, Hong Liu, Runwei Ding, Mengyuan Liu, Pichao Wang, Wenming Yang
The modified VTE is termed the Strided Transformer Encoder (STE), which is built upon the outputs of VTE.
Ranked #2 on 3D Human Pose Estimation on HumanEva-I
no code implementations • 14 Feb 2020 • Bin Ren, Mengyuan Liu, Runwei Ding, Hong Liu
To the best of our knowledge, this research represents the first comprehensive discussion of deep learning-based action recognition using 3D skeleton data.
1 code implementation • 20 Nov 2019 • Chen Chen, Mengyuan Liu, Xiandong Meng, Wanpeng Xiao, Qi Ju
Therefore, high-efficiency object detectors for CPU-only devices are urgently needed in industry.
1 code implementation • 6 Nov 2018 • Hanrong Ye, Xia Li, Hong Liu, Wei Shi, Mengyuan Liu, Qianru Sun
Rain removal aims to extract and remove rain streaks from images.
no code implementations • ECCV 2018 • Junwu Weng, Mengyuan Liu, Xudong Jiang, Junsong Yuan
This deformable convolution can better utilize contextual joints for action and gesture recognition and is more robust to noisy joints.
no code implementations • CVPR 2018 • Mengyuan Liu, Junsong Yuan
Specifically, the evolution of pose estimation maps can be decomposed as an evolution of heatmaps, e.g., probabilistic maps, and an evolution of estimated 2D human poses, which denote the changes of body shape and body pose, respectively.
Ranked #1 on Multimodal Activity Recognition on UTD-MHAD
no code implementations • 4 Dec 2017 • Mengyuan Liu, Hong Liu, Chen Chen
Then, motion and shape cues are jointly used to generate robust and distinctive spatial-temporal interest points (STIPs): motion-based STIPs and shape-based STIPs.
no code implementations • Pattern Recognition 2017 • Mengyuan Liu, Hong Liu, Chen Chen
First, a sequence-based view invariant transform is developed to eliminate the effect of view variations on spatio-temporal locations of skeleton joints.
Ranked #2 on Skeleton Based Action Recognition on UWA3D
no code implementations • 23 May 2017 • Hong Liu, Juanhui Tu, Mengyuan Liu
Extensive experiments on the SmartHome dataset and the large-scale NTU RGB-D dataset demonstrate that our method outperforms most RNN-based methods, which verifies the complementary property between spatial and temporal information and the robustness to noise.
Skeleton Based Action Recognition • Vocal Bursts Valence Prediction