Search Results for author: Yifang Yin

Found 17 papers, 5 papers with code

ShapeMoiré: Channel-Wise Shape-Guided Network for Image Demoiréing

no code implementations • 28 Apr 2024 • Jinming Cao, Sicheng Shen, Qiu Zhou, Yifang Yin, Yangyan Li, Roger Zimmermann

Interestingly, we find that the Shape information effectively captures the moiré patterns in artifact images.

Aircraft Landing Time Prediction with Deep Learning on Trajectory Images

no code implementations • 2 Jan 2024 • Liping Huang, Sheng Zhang, YiCheng Zhang, Yi Zhang, Yifang Yin

Aircraft landing time (ALT) prediction is crucial for air traffic management, especially for arrival aircraft sequencing on the runway.

SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection

1 code implementation • 26 Aug 2023 • Qiu Zhou, Jinming Cao, Hanchao Leng, Yifang Yin, Yu Kun, Roger Zimmermann

This indicates that the combination of 3D object detection and 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.

3D Object Detection Autonomous Driving +2

Emotionally Enhanced Talking Face Generation

1 code implementation • 21 Mar 2023 • Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Yi Yu, Yifang Yin, Rajiv Ratn Shah

To mitigate this, we build a talking face generation framework conditioned on a categorical emotion to generate videos with appropriate expressions, making them more realistic and convincing.

Talking Face Generation Talking Head Generation

Emotional Talking Faces: Making Videos More Expressive and Realistic

no code implementations • ACM Multimedia Asia 2022 • Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Dhroov Goel, Sakshat Mali, Yi Yu, Yifang Yin, Rajiv Ratn Shah

Lip synchronization and talking face generation have gained a specific interest from the research community with the advent and need of digital communication in different fields.

Talking Face Generation

Motion Prediction via Joint Dependency Modeling in Phase Space

no code implementations • 7 Jan 2022 • Pengxiang Su, Zhenguang Liu, Shuang Wu, Lei Zhu, Yifang Yin, Xuanjing Shen

In this paper, we introduce a novel convolutional neural model to effectively leverage explicit prior knowledge of motion anatomy, and simultaneously capture both spatial and temporal information of joint trajectory dynamics.

Anatomy motion prediction

Decoupling Long- and Short-Term Patterns in Spatiotemporal Inference

no code implementations • 16 Sep 2021 • Junfeng Hu, Yuxuan Liang, Zhencheng Fan, Li Liu, Yifang Yin, Roger Zimmermann

Specifically, we introduce a joint spatiotemporal graph attention network to learn the relations across space and time for short-term patterns.

Graph Attention

Zero-Shot Multi-View Indoor Localization via Graph Location Networks

1 code implementation • 6 Aug 2020 • Meng-Jiun Chiou, Zhenguang Liu, Yifang Yin, An-An Liu, Roger Zimmermann

In this paper, we propose a novel neural network based architecture Graph Location Networks (GLN) to perform infrastructure-free, multi-view image based indoor localization.

Indoor Localization

"Notic My Speech" -- Blending Speech Patterns With Multimedia

no code implementations • 12 Jun 2020 • Dhruva Sahrawat, Yaman Kumar, Shashwat Aggarwal, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann

To close the gap between speech understanding and multimedia video applications, in this paper, we show the initial experiments by modelling the perception on visual speech and showing its use case on video compression.

speech-recognition Video Compression +1

COBRA: Contrastive Bi-Modal Representation Algorithm

1 code implementation • 7 May 2020 • Vishaal Udandarao, Abhishek Maiti, Deepak Srivatsav, Suryatej Reddy Vyalla, Yifang Yin, Rajiv Ratn Shah

In this paper, we present a novel framework COBRA that aims to train two modalities (image and text) in a joint fashion inspired by the Contrastive Predictive Coding (CPC) and Noise Contrastive Estimation (NCE) paradigms which preserve both inter and intra-class relationships.

Cross-Modal Retrieval Image Captioning +3

Harnessing GANs for Zero-shot Learning of New Classes in Visual Speech Recognition

1 code implementation • 29 Jan 2019 • Yaman Kumar, Dhruva Sahrawat, Shubham Maheshwari, Debanjan Mahata, Amanda Stent, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann

To solve this problem, we present a novel approach to zero-shot learning by generating new classes using Generative Adversarial Networks (GANs), and show how the addition of unseen class samples increases the accuracy of a VSR system by a significant margin of 27% and allows it to handle speaker-independent out-of-vocabulary phrases.

speech-recognition Visual Speech Recognition +1

A Multimodal Approach to Predict Social Media Popularity

no code implementations • 16 Jul 2018 • Mayank Meghawat, Satyendra Yadav, Debanjan Mahata, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann

In this work, we propose a multimodal dataset consisting of content, context, and social information for popularity prediction.
