no code implementations • 21 May 2024 • Junfeng Hu, Xu Liu, Zhencheng Fan, Yifang Yin, Shili Xiang, Savitha Ramasamy, Roger Zimmermann
To bridge this gap, we propose Spatio-Temporal Graph Prompting (STGP), a prompt-enhanced transfer learning framework capable of adapting to diverse tasks in data-scarce domains.
1 code implementation • 28 Apr 2024 • Jinming Cao, Sicheng Shen, Qiu Zhou, Yifang Yin, Yangyan Li, Roger Zimmermann
Interestingly, we find that the Shape information effectively captures the moir\'e patterns in artifact images.
no code implementations • 2 Jan 2024 • Liping Huang, Sheng Zhang, YiCheng Zhang, Yi Zhang, Yifang Yin
Aircraft landing time (ALT) prediction is crucial for air traffic management, especially for arrival aircraft sequencing on the runway.
1 code implementation • 26 Aug 2023 • Qiu Zhou, Jinming Cao, Hanchao Leng, Yifang Yin, Yu Kun, Roger Zimmermann
This indicates that the combination of 3D object detection and 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby aiding build more robust autonomous driving systems.
no code implementations • 19 Aug 2023 • Yichen Zhang, Yifang Yin, Ying Zhang, Zhenguang Liu, Zheng Wang, Roger Zimmermann
Early detection of dysplasia of the cervix is critical for cervical cancer treatment.
no code implementations • 7 Jul 2023 • Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Yifang Yin, Andrei Georgescu, An Tran, Hannes Kruppa, See-Kiong Ng, Roger Zimmermann
Street-view imagery provides us with novel experiences to explore different places remotely.
Ranked #3 on Image-Based Localization on cvact
1 code implementation • 21 Mar 2023 • Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Yi Yu, Yifang Yin, Rajiv Ratn Shah
To mitigate this, we build a talking face generation framework conditioned on a categorical emotion to generate videos with appropriate expressions, making them more realistic and convincing.
Ranked #1 on Talking Face Generation on CREMA-D
no code implementations • ICCV 2023 • Yifang Yin, Wenmiao Hu, Zhenguang Liu, Guanfeng Wang, Shili Xiang, Roger Zimmermann
Source-free domain adaptive semantic segmentation has gained increasing attention recently.
no code implementations • ACM Multimedia Asia 2022 • Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Dhroov Goel, Sakshat Mali, Yi Yu, Yifang Yin, Rajiv Ratn Shah
Lip synchronization and talking face generation have gained a specific interest from the research community with the advent and need of digital communication in different fields.
no code implementations • 2 Apr 2022 • Yichen Zhang, Yifang Yin, Ying Zhang, Roger Zimmermann
Contrastive self-supervised learning has attracted significant research attention recently.
no code implementations • 7 Jan 2022 • Pengxiang Su, Zhenguang Liu, Shuang Wu, Lei Zhu, Yifang Yin, Xuanjing Shen
In this paper, we introduce a novel convolutional neural model to effectively leverage explicit prior knowledge of motion anatomy, and simultaneously capture both spatial and temporal information of joint trajectory dynamics.
no code implementations • 16 Sep 2021 • Junfeng Hu, Yuxuan Liang, Zhencheng Fan, Li Liu, Yifang Yin, Roger Zimmermann
Specifically, we introduce a joint spatiotemporal graph attention network to learn the relations across space and time for short-term patterns.
1 code implementation • 6 Aug 2020 • Meng-Jiun Chiou, Zhenguang Liu, Yifang Yin, An-An Liu, Roger Zimmermann
In this paper, we propose a novel neural network based architecture Graph Location Networks (GLN) to perform infrastructure-free, multi-view image based indoor localization.
no code implementations • 12 Jun 2020 • Dhruva Sahrawat, Yaman Kumar, Shashwat Aggarwal, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann
To close the gap between speech understanding and multimedia video applications, in this paper, we show the initial experiments by modelling the perception on visual speech and showing its use case on video compression.
1 code implementation • 7 May 2020 • Vishaal Udandarao, Abhishek Maiti, Deepak Srivatsav, Suryatej Reddy Vyalla, Yifang Yin, Rajiv Ratn Shah
In this paper, we present a novel framework COBRA that aims to train two modalities (image and text) in a joint fashion inspired by the Contrastive Predictive Coding (CPC) and Noise Contrastive Estimation (NCE) paradigms which preserve both inter and intra-class relationships.
no code implementations • 28 Jun 2019 • Yaman Kumar, Rohit Jain, Khwaja Mohd. Salik, Rajiv Ratn Shah, Yifang Yin, Roger Zimmermann
The model takes silent videos as input and produces speech as the output.
1 code implementation • 29 Jan 2019 • Yaman Kumar, Dhruva Sahrawat, Shubham Maheshwari, Debanjan Mahata, Amanda Stent, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann
To solve this problem, we present a novel approach to zero-shot learning by generating new classes using Generative Adversarial Networks (GANs), and show how the addition of unseen class samples increases the accuracy of a VSR system by a significant margin of 27% and allows it to handle speaker-independent out-of-vocabulary phrases.
no code implementations • 16 Jul 2018 • Mayank Meghawat, Satyendra Yadav, Debanjan Mahata, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann
In this work, we propose a multimodal dataset consisiting of content, context, and social information for popularity prediction.