Search Results for author: Yifang Yin

Found 17 papers, 5 papers with code

ShapeMoiré: Channel-Wise Shape-Guided Network for Image Demoiréing

no code implementations • 28 Apr 2024 • Jinming Cao, Sicheng Shen, Qiu Zhou, Yifang Yin, Yangyan Li, Roger Zimmermann

Interestingly, we find that the Shape information effectively captures the moiré patterns in artifact images.

Aircraft Landing Time Prediction with Deep Learning on Trajectory Images

no code implementations • 2 Jan 2024 • Liping Huang, Sheng Zhang, YiCheng Zhang, Yi Zhang, Yifang Yin

Aircraft landing time (ALT) prediction is crucial for air traffic management, especially for arrival aircraft sequencing on the runway.

SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection

1 code implementation • 26 Aug 2023 • Qiu Zhou, Jinming Cao, Hanchao Leng, Yifang Yin, Yu Kun, Roger Zimmermann

This indicates that the combination of 3D object detection and 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.

3D Object Detection, Autonomous Driving, +2

Prototypical Cross-domain Knowledge Transfer for Cervical Dysplasia Visual Inspection

no code implementations • 19 Aug 2023 • Yichen Zhang, Yifang Yin, Ying Zhang, Zhenguang Liu, Zheng Wang, Roger Zimmermann

Early detection of dysplasia of the cervix is critical for cervical cancer treatment.

Contrastive Learning, Transfer Learning

Beyond Geo-localization: Fine-grained Orientation of Street-view Images by Cross-view Matching with Satellite Imagery with Supplementary Materials

no code implementations • 7 Jul 2023 • Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Yifang Yin, Andrei Georgescu, An Tran, Hannes Kruppa, See-Kiong Ng, Roger Zimmermann

Street-view imagery provides us with novel experiences to explore different places remotely.

Ranked #3 on Image-Based Localization on cvact

Image-Based Localization

Emotionally Enhanced Talking Face Generation

1 code implementation • 21 Mar 2023 • Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Yi Yu, Yifang Yin, Rajiv Ratn Shah

To mitigate this, we build a talking face generation framework conditioned on a categorical emotion to generate videos with appropriate expressions, making them more realistic and convincing.

Ranked #1 on Talking Face Generation on CREMA-D

Talking Face Generation, Talking Head Generation

CrossMatch: Source-Free Domain Adaptive Semantic Segmentation via Cross-Modal Consistency Training

no code implementations • ICCV 2023 • Yifang Yin, Wenmiao Hu, Zhenguang Liu, Guanfeng Wang, Shili Xiang, Roger Zimmermann

Source-free domain adaptive semantic segmentation has gained increasing attention recently.

Denoising, Pseudo Label, +2

Emotional Talking Faces: Making Videos More Expressive and Realistic

no code implementations • ACM Multimedia Asia 2022 • Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Dhroov Goel, Sakshat Mali, Yi Yu, Yifang Yin, Rajiv Ratn Shah

Lip synchronization and talking face generation have gained a specific interest from the research community with the advent and need of digital communication in different fields.

Talking Face Generation

Mix-up Self-Supervised Learning for Contrast-agnostic Applications

no code implementations • 2 Apr 2022 • Yichen Zhang, Yifang Yin, Ying Zhang, Roger Zimmermann

Contrastive self-supervised learning has attracted significant research attention recently.

Image Classification, Image Reconstruction, +4

Motion Prediction via Joint Dependency Modeling in Phase Space

no code implementations • 7 Jan 2022 • Pengxiang Su, Zhenguang Liu, Shuang Wu, Lei Zhu, Yifang Yin, Xuanjing Shen

In this paper, we introduce a novel convolutional neural model to effectively leverage explicit prior knowledge of motion anatomy, and simultaneously capture both spatial and temporal information of joint trajectory dynamics.

Anatomy, motion prediction

Decoupling Long- and Short-Term Patterns in Spatiotemporal Inference

no code implementations • 16 Sep 2021 • Junfeng Hu, Yuxuan Liang, Zhencheng Fan, Li Liu, Yifang Yin, Roger Zimmermann

Specifically, we introduce a joint spatiotemporal graph attention network to learn the relations across space and time for short-term patterns.

Graph Attention

Zero-Shot Multi-View Indoor Localization via Graph Location Networks

1 code implementation • 6 Aug 2020 • Meng-Jiun Chiou, Zhenguang Liu, Yifang Yin, An-An Liu, Roger Zimmermann

In this paper, we propose a novel neural network based architecture Graph Location Networks (GLN) to perform infrastructure-free, multi-view image based indoor localization.

Indoor Localization

"Notic My Speech" -- Blending Speech Patterns With Multimedia

no code implementations • 12 Jun 2020 • Dhruva Sahrawat, Yaman Kumar, Shashwat Aggarwal, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann

To close the gap between speech understanding and multimedia video applications, in this paper, we show the initial experiments by modelling the perception on visual speech and showing its use case on video compression.

speech-recognition, Video Compression, +1

COBRA: Contrastive Bi-Modal Representation Algorithm

1 code implementation • 7 May 2020 • Vishaal Udandarao, Abhishek Maiti, Deepak Srivatsav, Suryatej Reddy Vyalla, Yifang Yin, Rajiv Ratn Shah

In this paper, we present a novel framework COBRA that aims to train two modalities (image and text) in a joint fashion inspired by the Contrastive Predictive Coding (CPC) and Noise Contrastive Estimation (NCE) paradigms which preserve both inter and intra-class relationships.

Cross-Modal Retrieval, Image Captioning, +3

Lipper: Synthesizing Thy Speech using Multi-View Lipreading

no code implementations • 28 Jun 2019 • Yaman Kumar, Rohit Jain, Khwaja Mohd. Salik, Rajiv Ratn Shah, Yifang Yin, Roger Zimmermann

The model takes silent videos as input and produces speech as the output.

Lipreading

Harnessing GANs for Zero-shot Learning of New Classes in Visual Speech Recognition

1 code implementation • 29 Jan 2019 • Yaman Kumar, Dhruva Sahrawat, Shubham Maheshwari, Debanjan Mahata, Amanda Stent, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann

To solve this problem, we present a novel approach to zero-shot learning by generating new classes using Generative Adversarial Networks (GANs), and show how the addition of unseen class samples increases the accuracy of a VSR system by a significant margin of 27% and allows it to handle speaker-independent out-of-vocabulary phrases.

speech-recognition, Visual Speech Recognition, +1

A Multimodal Approach to Predict Social Media Popularity

no code implementations • 16 Jul 2018 • Mayank Meghawat, Satyendra Yadav, Debanjan Mahata, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann

In this work, we propose a multimodal dataset consisting of content, context, and social information for popularity prediction.
