1 code implementation • 3 Mar 2024 • Kun-Yu Lin, Henghui Ding, Jiaming Zhou, Yi-Xing Peng, Zhilin Zhao, Chen Change Loy, Wei-Shi Zheng
To answer this, we establish a CROSS-domain Open-Vocabulary Action recognition benchmark named XOV-Action, and conduct a comprehensive evaluation of five state-of-the-art CLIP-based video learners under various types of domain gaps.
no code implementations • 22 Jan 2024 • Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng
With the proposed ActionHub dataset, we further propose a novel Cross-modality and Cross-action Modeling (CoCo) framework for ZSAR, which consists of a Dual Cross-modality Alignment module and a Cross-action Invariance Mining module.
no code implementations • 28 Nov 2023 • Jiaming Zhou, Hanjun Li, Kun-Yu Lin, Junwei Liang
Under the weak supervision setting, action labels are provided for the whole video without precise start and end times of the action clip.
Ranked #1 on Long-video Activity Recognition on Breakfast
no code implementations • ICCV 2023 • An-Lan Wang, Kun-Yu Lin, Jia-Run Du, Jingke Meng, Wei-Shi Zheng
In this work, we focus on the task of procedure planning from instructional videos with text supervision, where a model aims to predict an action sequence to transform the initial visual state into the goal visual state.
1 code implementation • 3 Feb 2023 • Jiayu Jiao, Yu-Ming Tang, Kun-Yu Lin, Yipeng Gao, Jinhua Ma, YaoWei Wang, Wei-Shi Zheng
In this work, we explore effective Vision Transformers to pursue a preferable trade-off between the computational complexity and size of the attended receptive field.
1 code implementation • CVPR 2023 • Yipeng Gao, Kun-Yu Lin, Junkai Yan, YaoWei Wang, Wei-Shi Zheng
Critically, in FSDAOD, the data scarcity in the target domain leads to an extreme data imbalance between the source and target domains, which potentially causes over-adaptation in traditional feature alignment.
no code implementations • CVPR 2023 • Zuhao Liu, Xiao-Ming Wu, Dian Zheng, Kun-Yu Lin, Wei-Shi Zheng
There also exists a scene gap between virtual and real scenarios, including scene-specific anomalies (events that are abnormal in one scene but normal in another) and scene-specific attributes, such as the viewpoint of the surveillance camera.
Anomaly Detection in Surveillance Videos • Video Anomaly Detection
1 code implementation • 22 Jun 2022 • Jia-Run Du, Jia-Chang Feng, Kun-Yu Lin, Fa-Ting Hong, Xiao-Ming Wu, Zhongang Qi, Ying Shan, Wei-Shi Zheng
Accordingly, we first exclude these surely non-existent categories by a complementary learning loss.
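The entry above does not give the exact form of the complementary learning loss, but a common formulation penalizes any probability mass assigned to categories known to be absent. A minimal sketch (the function name and the toy numbers are illustrative, not from the paper):

```python
import math

def complementary_loss(probs, absent_classes):
    """Hypothetical complementary learning loss: for categories that surely
    do not occur in the video, penalize the predicted probability by
    averaging -log(1 - p_c) over the absent classes."""
    return -sum(math.log(1.0 - probs[c]) for c in absent_classes) / len(absent_classes)

# Example: 4 action classes; classes 2 and 3 are surely non-existent.
probs = [0.6, 0.3, 0.05, 0.05]
loss = complementary_loss(probs, [2, 3])  # small loss, since p_2 and p_3 are already low
```

Driving `p_2` and `p_3` toward zero drives this loss toward zero, which is the intended exclusion effect.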
1 code implementation • 19 Jun 2022 • Zhilin Zhao, Longbing Cao, Kun-Yu Lin
We thus improve the discriminability of a pretrained network by finetuning it with out-of-distribution samples drawn from the cross-class vicinity distribution, where each out-of-distribution input corresponds to a complementary label.
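The cross-class vicinity distribution is not specified in this excerpt; one plausible reading is a mixup-style interpolation between inputs of two *different* classes, with the mixed sample treated as out-of-distribution and tagged with complementary labels ("not class y1", "not class y2"). A hedged sketch under that assumption (all names are illustrative):

```python
import random

def cross_class_vicinity_sample(x1, y1, x2, y2, lam=None):
    """Hypothetical sketch: draw an OOD input from the vicinity of two
    training samples with different labels by linear interpolation.
    The mixed input carries complementary labels: the classes it should
    NOT be predicted as."""
    assert y1 != y2, "vicinity samples must mix different classes"
    if lam is None:
        lam = random.uniform(0.3, 0.7)  # stay near the midpoint, away from either class
    x_ood = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    complementary = {y1, y2}  # finetune the network to reject these classes
    return x_ood, complementary

x_ood, comp = cross_class_vicinity_sample([0.0, 0.0], 0, [2.0, 2.0], 1, lam=0.5)
```

Finetuning on such samples with a complementary-label loss is one way the pretrained network's discriminability between in- and out-of-distribution inputs could be improved, as the entry describes.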
no code implementations • 19 Jun 2022 • Zhilin Zhao, Longbing Cao, Kun-Yu Lin
To tackle this issue, several state-of-the-art methods add extra OOD samples to the training set and assign them manually defined labels.
Out-of-Distribution Detection • Out of Distribution (OOD) Detection
1 code implementation • 23 Aug 2021 • Zhilin Zhao, Longbing Cao, Kun-Yu Lin
Based on Shannon entropy, an energy-based implicit generator is inferred from a discriminator without extra training cost.
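The excerpt does not spell out how the energy is defined; a common energy-based reading of a discriminator (not necessarily this paper's exact construction) takes the free energy of the classifier's logits, E(x) = -T·log Σ exp(logit/T), with lower energy typically indicating in-distribution inputs. A minimal sketch:

```python
import math

def energy_score(logits, temperature=1.0):
    """Free energy of classifier logits: E(x) = -T * log(sum(exp(l / T))).
    Shown for illustration of the energy-based view of a discriminator;
    computed with a numerically stable log-sum-exp."""
    t = temperature
    m = max(l / t for l in logits)  # subtract the max to avoid overflow
    return -t * (m + math.log(sum(math.exp(l / t - m) for l in logits)))

e = energy_score([2.0, 0.5, -1.0])  # confident logits -> relatively low energy
```

Because the energy is derived directly from the existing logits, it requires no extra trained component, matching the "without extra training cost" claim.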
no code implementations • CVPR 2021 • Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng
In this paper, we propose a Graph-based High-order Relation Modeling (GHRM) module to exploit the high-order relations in the long-term actions for long-term action recognition.
Ranked #5 on Video Classification on Breakfast