no code implementations • 24 Mar 2024 • Xin Gu, Libo Zhang, Fan Chen, Longyin Wen, YuFei Wang, Tiejian Luo, Sijie Zhu
Each video in our dataset is rendered by various image/video materials with a single editing component, which supports atomic visual understanding of different editing components.
1 code implementation • 3 Jan 2024 • Xin Gu, Heng Fan, Yan Huang, Tiejian Luo, Libo Zhang
The key to CG-STVG lies in two specially designed modules: instance context generation (ICG), which discovers visual context information (in both appearance and motion) of the instance, and instance context refinement (ICR), which improves the instance context from ICG by eliminating irrelevant or even harmful information.
Ranked #1 on Spatio-Temporal Video Grounding on HC-STVG1
1 code implementation • 27 Sep 2023 • Libo Zhang, Xin Gu, CongCong Li, Tiejian Luo, Heng Fan
Specifically, we use lightweight ConvNets to extract features of the P-frames in the GOPs, and a spatial-channel attention module (SCAM) is designed to refine the P-frame feature representations based on the compressed information, with bidirectional information flow.
1 code implementation • ICCV 2023 • Yaojie Shen, Xin Gu, Kai Xu, Heng Fan, Longyin Wen, Libo Zhang
Addressing this, we study video captioning from a different perspective in compressed domain, which brings multi-fold advantages over the existing pipeline: 1) Compared to raw images from the decoded video, the compressed video, consisting of I-frames, motion vectors and residuals, is highly distinguishable, which allows us to leverage the entire video for learning without manual sampling through a specialized model design; 2) The captioning model is more efficient in inference as smaller and less redundant information is processed.
Ranked #8 on Video Captioning on VATEX
no code implementations • 25 Apr 2023 • Renteng Yuan, Mohamed Abdel-Aty, Xin Gu, Ou Zheng, Qiaojun Xiang
The results indicate that the classification accuracy of LC intention was improved from 96.14% to 98.20% when incorporating the attention mechanism into the TCN model.
no code implementations • CVPR 2023 • Xin Gu, Guang Chen, YuFei Wang, Libo Zhang, Tiejian Luo, Longyin Wen
Meanwhile, the internal stream is designed to exploit the multi-modality information in videos (e.g., the appearance of video frames, speech transcripts, and video captions) to ensure the quality of caption results.
Ranked #7 on Video Captioning on YouCook2
no code implementations • 2 Mar 2023 • Xin Gu, Gautam Kamath, Zhiwei Steven Wu
We give an algorithm for selecting a public dataset by measuring a low-dimensional subspace distance between gradients of the public and private examples.
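A minimal sketch of the kind of subspace-distance measure described above, assuming per-example gradients are available as row vectors; the function name, the use of the top-k right singular vectors as each subspace's basis, and the projection-distance formula are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def subspace_distance(public_grads, private_grads, k=5):
    """Projection distance between the top-k gradient subspaces.

    public_grads, private_grads: (n_examples, dim) arrays of per-example
    gradients (hypothetical interface; the paper's setup may differ).
    """
    # Top-k right singular vectors span each gradient subspace.
    _, _, v_pub = np.linalg.svd(public_grads, full_matrices=False)
    _, _, v_priv = np.linalg.svd(private_grads, full_matrices=False)
    u, v = v_pub[:k].T, v_priv[:k].T  # (dim, k) orthonormal bases
    # Cosines of the principal angles are the singular values of U^T V.
    s = np.clip(np.linalg.svd(u.T @ v, compute_uv=False), -1.0, 1.0)
    # Projection (chordal) distance: sqrt(k - sum_i cos^2(theta_i));
    # 0 for identical subspaces, sqrt(k) for orthogonal ones.
    return float(np.sqrt(max(k - np.sum(s ** 2), 0.0)))
```

Under this measure, a public dataset whose gradients span a subspace close to the private gradients' subspace would score a small distance and be preferred.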
1 code implementation • 7 Jul 2022 • Xin Gu, Hanhua Ye, Guang Chen, YuFei Wang, Libo Zhang, Longyin Wen
This paper describes our champion solution for the CVPR2022 Generic Event Boundary Captioning (GEBC) competition.
no code implementations • 17 May 2022 • Shaleeza Sohail, Zongwen Fan, Xin Gu, Fariza Sabrina
Also, the selection of the right hyperparameters for the ANN architecture plays a crucial role in the accurate detection of security attacks, especially when it comes to identifying the subcategories of attacks.
2 code implementations • 18 Nov 2019 • Joris Mulder, Xin Gu, Anton Olsson-Collentine, Andrew Tomarken, Florian Böing-Messing, Herbert Hoijtink, Marlyne Meijerink, Donald R. Williams, Janosch Menke, Jean-Paul Fox, Yves Rosseel, Eric-Jan Wagenmakers, Caspar van Lissa
There has been a tremendous methodological development of Bayes factors for hypothesis testing in the social and behavioral sciences, and related fields.