Search Results for author: Ailing Zeng

Found 43 papers, 27 papers with code

AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation

no code implementations • 26 Mar 2024 • Qingping Sun, Yanjun Wang, Ailing Zeng, Wanqi Yin, Chen Wei, Wenjia Wang, Haiyi Mei, Chi Sing Leung, Ziwei Liu, Lei Yang, Zhongang Cai

Expressive human pose and shape estimation (a.k.a.

Human Detection


Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks

1 code implementation • 25 Jan 2024 • Tianhe Ren, Shilong Liu, Ailing Zeng, Jing Lin, Kunchang Li, He Cao, Jiayu Chen, Xinyu Huang, Yukang Chen, Feng Yan, Zhaoyang Zeng, Hao Zhang, Feng Li, Jie Yang, Hongyang Li, Qing Jiang, Lei Zhang

We introduce Grounded SAM, which uses Grounding DINO as an open-set object detector to combine with the segment anything model (SAM).

Segmentation

13,621


GPAvatar: Generalizable and Precise Head Avatar from Image(s)

1 code implementation • 18 Jan 2024 • Xuangeng Chu, Yu Li, Ailing Zeng, Tianyu Yang, Lijian Lin, Yunfei Liu, Tatsuya Harada

Head avatar reconstruction, crucial for applications in virtual reality, online meetings, gaming, and film industries, has garnered substantial attention within the computer vision community.

Neural Rendering Novel View Synthesis

230


DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation

no code implementations • 9 Jan 2024 • Junming Chen, Yunfei Liu, Jianan Wang, Ailing Zeng, Yu Li, Qifeng Chen

We propose DiffSHEG, a Diffusion-based approach for Speech-driven Holistic 3D Expression and Gesture generation with arbitrary length.

Computational Efficiency Gesture Generation


DPoser: Diffusion Model as Robust 3D Human Pose Prior

1 code implementation • 9 Dec 2023 • Junzhe Lu, Jing Lin, Hongkun Dou, Ailing Zeng, Yue Deng, Yulun Zhang, Haoqian Wang

Our approach demonstrates considerable enhancements over common uniform scheduling used in image domains, boasting improvements of 5.4%, 17.2%, and 3.8% across human mesh recovery, pose completion, and motion denoising, respectively.

Denoising Human Mesh Recovery +1


PhysHOI: Physics-Based Imitation of Dynamic Human-Object Interaction

no code implementations • 7 Dec 2023 • Yinhuai Wang, Jing Lin, Ailing Zeng, Zhengyi Luo, Jian Zhang, Lei Zhang

To make up for the lack of dynamic HOI scenarios in this area, we introduce the BallPlay dataset that contains eight whole-body basketball skills.

Human-Object Interaction Detection Object


HumanTOMATO: Text-aligned Whole-body Motion Generation

no code implementations • 19 Oct 2023 • Shunlin Lu, Ling-Hao Chen, Ailing Zeng, Jing Lin, Ruimao Zhang, Lei Zhang, Heung-Yeung Shum

This work targets a novel text-driven whole-body motion generation task, which takes a given textual description as input and aims at generating high-quality, diverse, and coherent facial expressions, hand gestures, and body motions simultaneously.


UniPose: Detecting Any Keypoints

1 code implementation • 12 Oct 2023 • Jie Yang, Ailing Zeng, Ruimao Zhang, Lei Zhang

This work proposes a unified framework called UniPose to detect keypoints of any articulated (e.g., human and animal), rigid, and soft objects via visual or textual prompts for fine-grained vision understanding and manipulation.

Ranked #1 on 2D Human Pose Estimation on Human-Art (using extra training data)

2D Human Pose Estimation 2D Pose Estimation +4

239


Bridging the Gap between Human Motion and Action Semantics via Kinematic Phrases

no code implementations • 6 Oct 2023 • Xinpeng Liu, Yong-Lu Li, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu

The goal of motion understanding is to establish a reliable mapping between motion and action semantics, which is a challenging many-to-many problem.


Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code

1 code implementation • 2 Oct 2023 • Xuan Ju, Ailing Zeng, Yuxuan Bian, Shaoteng Liu, Qiang Xu

Specifically, in the context of diffusion-based editing, where a source image is edited according to a target prompt, the process commences by acquiring a noisy latent vector corresponding to the source image via the diffusion model.

Ranked #3 on Text-based Image Editing on PIE-Bench

Image Generation Text-based Image Editing

189


FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions

1 code implementation • 10 Sep 2023 • Jiong Wang, Fengyu Yang, Wenbo Gou, Bingliang Li, Danqi Yan, Ailing Zeng, Yijun Gao, Junle Wang, Yanqing Jing, Ruimao Zhang

To facilitate the development of 3D pose estimation, we present FreeMan, the first large-scale, multi-view dataset collected under real-world conditions.

3D Human Pose Estimation 3D Pose Estimation +1


Neural Interactive Keypoint Detection

1 code implementation • ICCV 2023 • Jie Yang, Ailing Zeng, Feng Li, Shilong Liu, Ruimao Zhang, Lei Zhang

Click-Pose explores how user feedback can cooperate with a neural keypoint detector to correct the predicted keypoints in an interactive way for a faster and more effective annotation process.

Decoder Keypoint Detection


Effective Whole-body Pose Estimation with Two-stages Distillation

1 code implementation • 29 Jul 2023 • Zhendong Yang, Ailing Zeng, Chun Yuan, Yu Li

Different from the previous self-knowledge distillation, this stage finetunes the student's head with only 20% training time as a plug-and-play training strategy.

Ranked #1 on 2D Human Pose Estimation on COCO-WholeBody (using extra training data)

2D Human Pose Estimation Pose Estimation +1

1,910


FITS: Modeling Time Series with 10k Parameters

1 code implementation • 6 Jul 2023 • Zhijian Xu, Ailing Zeng, Qiang Xu

In this paper, we introduce FITS, a lightweight yet powerful model for time series analysis.

Anomaly Detection Time Series +1


Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset

1 code implementation • NeurIPS 2023 • Jing Lin, Ailing Zeng, Shunlin Lu, Yuanhao Cai, Ruimao Zhang, Haoqian Wang, Lei Zhang

In this paper, we present Motion-X, a large-scale 3D expressive whole-body motion dataset.

Human Mesh Recovery text annotation

445


detrex: Benchmarking Detection Transformers

1 code implementation • 12 Jun 2023 • Tianhe Ren, Shilong Liu, Feng Li, Hao Zhang, Ailing Zeng, Jie Yang, Xingyu Liao, Ding Jia, Hongyang Li, He Cao, Jianan Wang, Zhaoyang Zeng, Xianbiao Qi, Yuhui Yuan, Jianwei Yang, Lei Zhang

To address this issue, we develop a unified, highly modular, and lightweight codebase called detrex, which supports a majority of the mainstream DETR-based instance recognition algorithms, covering various fundamental tasks, including object detection, segmentation, and pose estimation.

Benchmarking object-detection +2

1,834


Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model

1 code implementation • 20 May 2023 • Jie Yang, Bingliang Li, Fengyu Yang, Ailing Zeng, Lei Zhang, Ruimao Zhang

Extensive experiments demonstrate that DiffHOI significantly outperforms the state-of-the-art in regular detection (i.e., 41.50 mAP) and zero-shot detection.

Ranked #2 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)

Human-Object Interaction Detection Zero-Shot Human-Object Interaction Detection


A Strong and Reproducible Object Detector with Only Public Datasets

3 code implementations • 25 Apr 2023 • Tianhe Ren, Jianwei Yang, Shilong Liu, Ailing Zeng, Feng Li, Hao Zhang, Hongyang Li, Zhaoyang Zeng, Lei Zhang

This work presents Focal-Stable-DINO, a strong and reproducible object detection model which achieves 64.6 AP on COCO val2017 and 64.8 AP on COCO test-dev using only 700M parameters without any test time augmentation.

Ranked #5 on Object Detection on COCO minival (using extra training data)

object-detection Object Detection

651


HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation

1 code implementation • ICCV 2023 • Xuan Ju, Ailing Zeng, Chenchen Zhao, Jianan Wang, Lei Zhang, Qiang Xu

While such a plug-and-play approach is appealing, the inevitable and uncertain conflicts between the original images produced from the frozen SD branch and the given condition incur significant challenges for the learnable branch, which essentially conducts image feature editing for condition enforcement.

Denoising Image Generation

250


One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer

1 code implementation • CVPR 2023 • Jing Lin, Ailing Zeng, Haoqian Wang, Lei Zhang, Yu Li

It is challenging to perform this task with a single network due to resolution issues, i.e., the face and hands are usually located in extremely small regions.

Ranked #3 on 3D Human Pose Estimation on UBody

3D Human Pose Estimation 3D Human Reconstruction +2

573


From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels

1 code implementation • ICCV 2023 • Zhendong Yang, Ailing Zeng, Zhe Li, Tianke Zhang, Chun Yuan, Yu Li

We decompose the KD loss and find that its non-target component forces the student's non-target logits to match the teacher's; however, the sums of the student's and the teacher's non-target logits differ, which prevents the two from being identical.

Self-Knowledge Distillation

192


Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR

1 code implementation • 13 Mar 2023 • Feng Li, Ailing Zeng, Shilong Liu, Hao Zhang, Hongyang Li, Lei Zhang, Lionel M. Ni

Recent DEtection TRansformer-based (DETR) models have obtained remarkable performance.

object-detection Object Detection

176


Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes

1 code implementation • CVPR 2023 • Xuan Ju, Ailing Zeng, Jianan Wang, Qiang Xu, Lei Zhang

Humans have long been recorded in a variety of forms since antiquity.

3D Human Pose Estimation Human Detection +1

193


Introducing Depth into Transformer-based 3D Object Detection

no code implementations • 25 Feb 2023 • Hao Zhang, Hongyang Li, Ailing Zeng, Feng Li, Shilong Liu, Xingyu Liao, Lei Zhang

To address the second issue, we introduce an auxiliary learning task called Depth-aware Negative Suppression loss.

3D Object Detection Auxiliary Learning +3


FrAug: Frequency Domain Augmentation for Time Series Forecasting

no code implementations • 18 Feb 2023 • Muxi Chen, Zhijian Xu, Ailing Zeng, Qiang Xu

In time series forecasting (TSF), we need to model the fine-grained temporal relationship within time series segments to generate accurate forecasting results given data in a look-back window.

Anomaly Detection Data Augmentation +3


Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation

3 code implementations • 3 Feb 2023 • Jie Yang, Ailing Zeng, Shilong Liu, Feng Li, Ruimao Zhang, Lei Zhang

This paper presents a novel end-to-end framework with Explicit box Detection for multi-person Pose estimation, called ED-Pose, which unifies the contextual learning between human-level (global) and keypoint-level (local) information.

Ranked #2 on 2D Human Pose Estimation on Human-Art

2D Human Pose Estimation Decoder +4

139


Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR

no code implementations • CVPR 2023 • Feng Li, Ailing Zeng, Shilong Liu, Hao Zhang, Hongyang Li, Lei Zhang, Lionel M. Ni

Recent DEtection TRansformer-based (DETR) models have obtained remarkable performance.

object-detection Object Detection


ViTKD: Practical Guidelines for ViT feature knowledge distillation

1 code implementation • 6 Sep 2022 • Zhendong Yang, Zhe Li, Ailing Zeng, Zexian Li, Chun Yuan, Yu Li

In this paper, we explore the way of feature-based distillation for ViT.

Image Classification Knowledge Distillation

192


Are Transformers Effective for Time Series Forecasting?

4 code implementations • 26 May 2022 • Ailing Zeng, Muxi Chen, Lei Zhang, Qiang Xu

Recently, there has been a surge of Transformer-based solutions for the long-term time series forecasting (LTSF) task.

Ranked #1 on Time Series Forecasting on ETTh1 (96) Univariate

Anomaly Detection Temporal Relation Extraction +2

1,801


HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling

no code implementations • 28 Apr 2022 • Zhongang Cai, Daxuan Ren, Ailing Zeng, Zhengyu Lin, Tao Yu, Wenjia Wang, Xiangyu Fan, Yang Gao, Yifan Yu, Liang Pan, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu

4D human sensing and modeling are fundamental tasks in vision and graphics with numerous applications.

Fine-grained Action Recognition Pose Estimation


DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation

1 code implementation • 16 Mar 2022 • Ailing Zeng, Xuan Ju, Lei Yang, Ruiyuan Gao, Xizhou Zhu, Bo Dai, Qiang Xu

This paper proposes DeciWatch, a simple baseline framework for video-based 2D/3D human pose estimation that can achieve a 10 times efficiency improvement over existing works without any performance degradation.

Ranked #1 on 2D Human Pose Estimation on JHMDB (2D poses only)

2D Human Pose Estimation 3D Human Pose Estimation +2

169


SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos

2 code implementations • 27 Dec 2021 • Ailing Zeng, Lei Yang, Xuan Ju, Jiefeng Li, Jianyi Wang, Qiang Xu

With a simple yet effective motion-aware fully-connected network, SmoothNet improves the temporal smoothness of existing pose estimators significantly and enhances the estimation accuracy of those challenging frames as a side-effect.

2D Human Pose Estimation 3D Human Pose Estimation +2

5,052


T-WaveNet: A Tree-Structured Wavelet Neural Network for Time Series Signal Analysis

no code implementations • ICLR 2022 • Minhao Liu, Ailing Zeng, Qiuxia Lai, Ruiyuan Gao, Min Li, Jing Qin, Qiang Xu

In this work, we propose a novel tree-structured wavelet neural network for time series signal analysis, namely T-WaveNet, by taking advantage of an inherent property of various types of signals, known as the dominant frequency range.

Activity Recognition Representation Learning +3


Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation

no code implementations • ICCV 2021 • Ailing Zeng, Xiao Sun, Lei Yang, Nanxuan Zhao, Minhao Liu, Qiang Xu

While the average prediction accuracy has been improved significantly over the years, the performance on hard poses with depth ambiguity, self-occlusion, and complex or rare poses is still far from satisfactory.

Ranked #23 on Skeleton Based Action Recognition on NTU RGB+D 120

3D Human Pose Estimation 3D Pose Estimation +3


Information Bottleneck Approach to Spatial Attention Learning

1 code implementation • 7 Aug 2021 • Qiuxia Lai, Yu Li, Ailing Zeng, Minhao Liu, Hanqiu Sun, Qiang Xu

Extensive experiments show that the proposed IB-inspired spatial attention mechanism can yield attention maps that neatly highlight the regions of interest while suppressing backgrounds, and bootstrap standard DNN structures for visual recognition tasks (e.g., image classification, fine-grained recognition, cross-domain classification).

Decision Making domain classification +1


Human Pose Regression with Residual Log-likelihood Estimation

3 code implementations • ICCV 2021 • Jiefeng Li, Siyuan Bian, Ailing Zeng, Can Wang, Bo Pang, Wentao Liu, Cewu Lu

In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution.

Ranked #59 on 3D Human Pose Estimation on Human3.6M

3D Human Pose Estimation Multi-Person Pose Estimation +1

5,052


SCINet: Time Series Modeling and Forecasting with Sample Convolution and Interaction

4 code implementations • 17 Jun 2021 • Minhao Liu, Ailing Zeng, Muxi Chen, Zhijian Xu, Qiuxia Lai, Lingna Ma, Qiang Xu

One unique property of time series is that the temporal relations are largely preserved after downsampling into two sub-sequences.

Ranked #1 on Time Series Forecasting on ETTh1 (24) Multivariate (using extra training data)

Time Series Traffic Prediction +1

692


Relational Graph Neural Network Design via Progressive Neural Architecture Search

no code implementations • 30 May 2021 • Ailing Zeng, Minhao Liu, Zhiwei Liu, Ruiyuan Gao, Jing Qin, Qiang Xu

We propose a novel solution to a long-standing dilemma in the representation learning of graph neural networks (GNNs): how to effectively capture and represent useful information embedded in long-distance nodes to improve the performance of nodes with low homophily without leading to performance degradation in nodes with high homophily.

Neural Architecture Search Node Classification +1


Skimming and Scanning for Untrimmed Video Action Recognition

no code implementations • 21 Apr 2021 • Yunyan Hong, Ailing Zeng, Min Li, Cewu Lu, Li Jiang, Qiang Xu

Video action recognition (VAR) is a primary task of video understanding, and untrimmed videos are more common in real-life scenes.

Action Recognition Temporal Action Localization +1


T-WaveNet: Tree-Structured Wavelet Neural Network for Sensor-Based Time Series Analysis

no code implementations • 10 Dec 2020 • Minhao Liu, Ailing Zeng, Qiuxia Lai, Qiang Xu

Motivated by the fact that usually a small subset of the frequency components carries the primary information for sensor data, we propose a novel tree-structured wavelet neural network for sensor data analysis, namely T-WaveNet.

Activity Recognition Brain Computer Interface +5


SRNet: Improving Generalization in 3D Human Pose Estimation with a Split-and-Recombine Approach

1 code implementation • ECCV 2020 • Ailing Zeng, Xiao Sun, Fuyang Huang, Minhao Liu, Qiang Xu, Stephen Lin

With the reduced dimensionality of less relevant body areas, the training set distribution within network branches more closely reflects the statistics of local poses instead of global body poses, without sacrificing information important for joint inference.

Ranked #20 on Monocular 3D Human Pose Estimation on Human3.6M

Monocular 3D Human Pose Estimation


DeepFuse: An IMU-Aware Network for Real-Time 3D Human Pose Estimation from Multi-View Image

no code implementations • 9 Dec 2019 • Fuyang Huang, Ailing Zeng, Minhao Liu, Qiuxia Lai, Qiang Xu

In this paper, we propose a two-stage fully 3D network, namely DeepFuse, to estimate human pose in 3D space by fusing body-worn Inertial Measurement Unit (IMU) data and multi-view images deeply.

Ranked #5 on 3D Human Pose Estimation on Total Capture

3D Human Pose Estimation 3D Pose Estimation


Structure-Aware 3D Hourglass Network for Hand Pose Estimation from Single Depth Image

no code implementations • 26 Dec 2018 • Fuyang Huang, Ailing Zeng, Minhao Liu, Jing Qin, Qiang Xu

Experimental results show that the proposed structure-aware 3D hourglass network achieves a mean joint error of 7.4 mm on the MSRA dataset and 8.9 mm on the NYU dataset.

Hand Pose Estimation

