no code implementations • 21 Apr 2024 • Liheng Ma, Soumyasundar Pal, Yitian Zhang, Jiaming Zhou, Yingxue Zhang, Mark Coates
The existing definitions of graph convolution, either from spatial or spectral perspectives, are inflexible and not unified.
1 code implementation • 3 Mar 2024 • Kun-Yu Lin, Henghui Ding, Jiaming Zhou, Yi-Xing Peng, Zhilin Zhao, Chen Change Loy, Wei-Shi Zheng
To answer this, we establish a CROSS-domain Open-Vocabulary Action recognition benchmark named XOV-Action, and conduct a comprehensive evaluation of five state-of-the-art CLIP-based video learners under various types of domain gaps.
no code implementations • 22 Jan 2024 • Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng
With the proposed ActionHub dataset, we further propose a novel Cross-modality and Cross-action Modeling (CoCo) framework for ZSAR, which consists of a Dual Cross-modality Alignment module and a Cross-action Invariance Mining module.
no code implementations • 29 Nov 2023 • Jinhui Ye, Jiaming Zhou, Hui Xiong, Junwei Liang
Specifically, at the core of GeoDeformer is the Geometric Deformation Predictor, a module designed to identify and quantify potential spatial and temporal geometric deformations within the given video.
no code implementations • 28 Nov 2023 • Jiaming Zhou, Hanjun Li, Kun-Yu Lin, Junwei Liang
Under the weak supervision setting, action labels are provided for the whole video without precise start and end times of the action clip.
Ranked #1 on Long-video Activity Recognition on Breakfast
1 code implementation • 4 Oct 2023 • Yujin Tang, Jiaming Zhou, Xiang Pan, Zeying Gong, Junwei Liang
To address these limitations, we introduce PostRainBench, a comprehensive multi-variable NWP post-processing benchmark consisting of three datasets for NWP post-processing-based precipitation forecasting.
no code implementations • 26 Jul 2023 • Tian-Hao Zhang, Dinghao Zhou, Guiping Zhong, Jiaming Zhou, Baoxiang Li
RNN-T models, which are widely used in ASR, rely on the RNN-T loss to achieve length alignment between the input audio and the target sequence.
no code implementations • 22 Feb 2023 • Jiaming Zhou, Shiwan Zhao, Ning Jiang, Guoqing Zhao, Yong Qin
Unsupervised domain adaptation (UDA) aims to improve the performance on the unlabeled target domain by transferring knowledge from the source to the target domain.
Automatic Speech Recognition (ASR) +3
no code implementations • CVPR 2021 • Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng
In this paper, we propose a Graph-based High-order Relation Modeling (GHRM) module to exploit the high-order relations in the long-term actions for long-term action recognition.
Ranked #5 on Video Classification on Breakfast