no code implementations • 18 Sep 2023 • Hsuan Su, Ting-yao Hu, Hema Swetha Koppula, Raviteja Vemulapalli, Jen-Hao Rick Chang, Karren Yang, Gautam Varma Mantena, Oncel Tuzel
In this paper, we propose a new strategy for adapting ASR models to new target domains without any text or speech from those domains.
Automatic Speech Recognition (ASR)
no code implementations • 27 Mar 2023 • Karren Yang, Ting-yao Hu, Jen-Hao Rick Chang, Hema Swetha Koppula, Oncel Tuzel
Here, we ask two fundamental questions about this strategy: when is synthetic data effective for personalization, and why is it effective in those cases?
Automatic Speech Recognition (ASR)
no code implementations • 6 Oct 2021 • Jen-Hao Rick Chang, Ashish Shrivastava, Hema Swetha Koppula, Xiaoshuai Zhang, Oncel Tuzel
However, under an unsupervised-style setting, typical training algorithms for controllable sequence generative models suffer from a training-inference mismatch: the same sample is used as both content and style input during training, but unpaired samples are given during inference.
no code implementations • 4 Oct 2012 • Hema Swetha Koppula, Rudhir Gupta, Ashutosh Saxena
Given an RGB-D video, we jointly model the human activities and object affordances as a Markov random field where the nodes represent objects and sub-activities, and the edges represent the relationships between object affordances, their relations with sub-activities, and their evolution over time.
Ranked #3 on Skeleton Based Action Recognition on CAD-120