no code implementations • 2 Mar 2023 • Bo Wan, Yongfei Liu, Desen Zhou, Tinne Tuytelaars, Xuming He
Human object interaction (HOI) detection plays a crucial role in human-centric scene understanding and serves as a fundamental building-block for many vision tasks.
Human-Object Interaction Detection Knowledge Distillation +3
no code implementations • 25 Feb 2023 • Zhichao Liu, Leshan Wang, Desen Zhou, Jian Wang, Songyang Zhang, Yang Bai, Errui Ding, Rui Fan
To deal with these issues, we propose an attention based approach which we call \textit{temporal segment transformer}, for joint segment relation modeling and denoising.
no code implementations • 19 Aug 2022 • Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Qian He, Chuanyang Hu, Errui Ding, Yu Guan, Xuming He
In this paper, we study the problem of one-shot skeleton-based action recognition, which poses unique challenges in learning transferable representation from base classes to novel classes, particularly for fine-grained actions.
1 code implementation • 19 Jul 2022 • Yang Bai, Desen Zhou, Songyang Zhang, Jian Wang, Errui Ding, Yu Guan, Yang Long, Jingdong Wang
Action Quality Assessment(AQA) is important for action understanding and resolving the task poses unique challenges due to subtle visual differences.
no code implementations • CVPR 2022 • Desen Zhou, Zhichao Liu, Jian Wang, Leshan Wang, Tao Hu, Errui Ding, Jingdong Wang
To associate the predictions of disentangled decoders, we first generate a unified representation for HOI triplets with a base decoder, and then utilize it as input feature of each disentangled decoder.
no code implementations • 17 Apr 2022 • Zhichao Liu, Liyue Qian, Wenke Jing, Desen Zhou, Xuming He, Edmond Lou, Rui Zheng
The framework consisted of two closely linked modules: 1) the lamina detector for identifying and locating each lamina pairs on ultrasound coronal images, and 2) the spinal curvature estimator for calculating the scoliotic angles based on the chain of detected lamina.
no code implementations • 1 Nov 2021 • Tailin Chen, Shidong Wang, Desen Zhou, Yu Guan
We devise our model into a pure factorised architecture which can alternately perform spatial feature aggregation and temporal feature aggregation.
1 code implementation • 9 Sep 2021 • Qian He, Desen Zhou, Bo Wan, Xuming He
To address those challenges, we adopt a primitive-based representation for 3D object, and propose a two-stage graph network for primitive-based 3D object estimation, which consists of a sequential proposal module and a graph reasoning module.
1 code implementation • 10 Aug 2021 • Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Yu Guan, Xuming He, Errui Ding
The task of skeleton-based action recognition remains a core challenge in human-centred scene understanding due to the multiple granularities and large variation in human motion.
1 code implementation • ICCV 2019 • Bo Wan, Desen Zhou, Yongfei Liu, Rongjie Li, Xuming He
Reasoning human object interactions is a core problem in human-centric scene understanding and detecting such relations poses a unique challenge to vision systems due to large variations in human-object configurations, multiple co-occurring relation instances and subtle visual difference between relation categories.
5 code implementations • Conference 2016 • Yingying Zhang, Desen Zhou, Siqin Chen, Shenghua Gao, Yi Ma
To this end, we have proposed a simple but effective Multi-column Convolutional Neural Network (MCNN) architecture to map the image to its crowd density map.
Ranked #5 on Crowd Counting on Venice