no code implementations • 25 May 2022 • Yuxing Chen, Renshu Gu, Ouhan Huang, Gangyong Jia
The proposed VTP framework combines the high performance of the transformer with volumetric representations and can serve as a good alternative to convolutional backbones.
Ranked #4 on 3D Human Pose Estimation on Panoptic (using extra training data)
no code implementations • 27 Jan 2022 • Yingchao Pan, Ouhan Huang, Qinghao Ye, Zhongjin Li, Wenjiang Wang, Guodun Li, Yuxing Chen
By combining these two attention mechanisms, a video SUMmarization model with a Diversified Contextual Attention scheme, named SUM-DCA, is developed.