no code implementations • 23 Aug 2023 • Purnima Kamath, Chitralekha Gupta, Lonce Wyse, Suranga Nanayakkara
By using a few synthetic examples to indicate the presence or absence of a semantic attribute, we infer the guidance vectors in the latent space of the StyleGAN to control that attribute during generation.
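A minimal sketch of the idea, assuming the synthetic positive and negative examples have already been encoded into StyleGAN latent codes (the file names and the `edit_latent` helper are hypothetical placeholders, not the authors' exact procedure):

```python
import numpy as np

# Hypothetical latent codes (n_examples x latent_dim) for synthetic examples
# with and without the semantic attribute.
w_pos = np.load("latents_with_attribute.npy")
w_neg = np.load("latents_without_attribute.npy")

# A simple guidance vector: difference of the class means in latent space.
guidance = w_pos.mean(axis=0) - w_neg.mean(axis=0)
guidance /= np.linalg.norm(guidance)

def edit_latent(w, alpha):
    """Move a latent code along the guidance direction to strengthen
    (alpha > 0) or weaken (alpha < 0) the attribute during generation."""
    return w + alpha * guidance
```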
no code implementations • 6 Jun 2023 • Elliott Wen, Chitralekha Gupta, Prasanth Sasikumar, Mark Billinghurst, James Wilmott, Emily Skow, Arindam Dey, Suranga Nanayakkara
Researchers have used machine learning approaches to identify motion sickness in VR experiences.
no code implementations • 23 Apr 2023 • Chitralekha Gupta, Purnima Kamath, Yize Wei, Zhuoyao Li, Suranga Nanayakkara, Lonce Wyse
In this paper, we propose a data-driven approach to train a Generative Adversarial Network (GAN) conditioned on "soft-labels" distilled from the penultimate layer of an audio classifier trained on a target set of audio texture classes.
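A rough sketch of how such soft-labels could be distilled and used to condition a generator, assuming a pretrained classifier whose penultimate-layer activations are exposed; the attribute names `features` and `penultimate` are hypothetical:

```python
import torch
import torch.nn as nn

class SoftLabelExtractor(nn.Module):
    """Wraps a pretrained audio classifier and returns its
    penultimate-layer activations as 'soft-labels'."""
    def __init__(self, classifier):
        super().__init__()
        self.backbone = classifier.features        # layers up to the penultimate one (hypothetical)
        self.penultimate = classifier.penultimate  # penultimate layer (hypothetical)

    def forward(self, audio):
        h = self.backbone(audio)
        return self.penultimate(h)  # soft-label vector, not the final class logits

# During GAN training, the generator would be conditioned on these vectors, e.g.:
# z = torch.randn(batch_size, z_dim)
# soft = SoftLabelExtractor(classifier)(real_audio)
# fake = generator(torch.cat([z, soft], dim=1))
```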
no code implementations • 23 Aug 2022 • Chitralekha Gupta, Yize Wei, Zequn Gong, Purnima Kamath, Zhuoyao Li, Lonce Wyse
These metrics use deep features that summarize the statistics of any given audio texture, thus being inherently sensitive to variations in the statistical parameters that define an audio texture.
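One common way to turn such deep feature statistics into a metric is a Fréchet-style distance between Gaussian fits of the embeddings; a sketch under that assumption (the embedding arrays can come from any pretrained audio network):

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussians fitted to two sets of
    deep feature embeddings (n_samples x feature_dim)."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard numerical imaginary residue
    diff = mu_r - mu_f
    return diff @ diff + np.trace(cov_r + cov_f - 2 * covmean)
```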
no code implementations • 15 Jul 2022 • Xiaoxue Gao, Chitralekha Gupta, Haizhou Li
Lyrics transcription of polyphonic music is challenging as the background music affects lyrics intelligibility.
no code implementations • 27 Jun 2022 • Lonce Wyse, Purnima Kamath, Chitralekha Gupta
We introduce a new system for data-driven audio sound model design built around two different neural network architectures, a Generative Adversarial Network (GAN) and a Recurrent Neural Network (RNN), that takes advantage of the unique characteristics of each to achieve system objectives that neither is capable of addressing alone.
1 code implementation • 7 Apr 2022 • Xiaoxue Gao, Chitralekha Gupta, Haizhou Li
To improve the robustness of lyrics transcription to background music, we propose a strategy of combining features that emphasize the singing vocals, i.e., music-removed features computed from the extracted vocals, with features that capture the singing vocals as well as the background music, i.e., music-present features.
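A hedged sketch of the feature-combination idea: extract features from both the separated vocal track and the original mixture, then concatenate them frame-wise before passing them to the acoustic model. The `separate_vocals` call is a placeholder for any source-separation model, not the authors' exact pipeline:

```python
import numpy as np
import librosa

def combined_features(mixture, sr=16000, n_mels=80):
    """Concatenate 'music-present' features (from the polyphonic mixture)
    with 'music-removed' features (from a separated vocal estimate)."""
    vocals = separate_vocals(mixture, sr)  # placeholder for any source-separation model

    def log_mel(y):
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
        return librosa.power_to_db(mel)

    music_present = log_mel(mixture)   # captures vocals plus accompaniment
    music_removed = log_mel(vocals)    # emphasizes the singing vocals
    return np.concatenate([music_present, music_removed], axis=0)
```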
no code implementations • 7 Apr 2022 • Xiaoxue Gao, Chitralekha Gupta, Haizhou Li
Lyrics transcription of polyphonic music is challenging not only because the singing vocals are corrupted by the background music, but also because the background music and the singing style vary across music genres, such as pop, metal, and hip hop, each of which affects lyrics intelligibility in different ways.
no code implementations • 29 Sep 2021 • Lonce Wyse, Purnima Kamath, Chitralekha Gupta
We introduce a new system for data-driven audio sound model design built around two different neural network architectures, a Generative Adversarial Network (GAN) and a Recurrent Neural Network (RNN), that takes advantage of the unique characteristics of each to achieve system objectives that neither is capable of addressing alone.
no code implementations • 12 Mar 2021 • Chitralekha Gupta, Purnima Kamath, Lonce Wyse
Generative Adversarial Networks (GANs) currently achieve state-of-the-art sound synthesis quality for pitched musical instruments using a 2-channel spectrogram representation consisting of log magnitude and instantaneous frequency (the "IFSpectrogram").
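A minimal sketch of how the two channels of such a representation can be computed from an STFT, following the general log-magnitude / instantaneous-frequency recipe; the parameter values are illustrative, not the paper's settings:

```python
import numpy as np
import librosa

def if_spectrogram(y, n_fft=1024, hop_length=256, eps=1e-6):
    """2-channel representation: log magnitude and instantaneous frequency
    (time-derivative of the unwrapped STFT phase)."""
    stft = librosa.stft(y, n_fft=n_fft, hop_length=hop_length)
    log_mag = np.log(np.abs(stft) + eps)

    phase = np.unwrap(np.angle(stft), axis=1)                  # unwrap along time
    inst_freq = np.diff(phase, axis=1, prepend=phase[:, :1])   # finite difference in time
    return np.stack([log_mag, inst_freq], axis=0)              # shape: (2, freq, time)
```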
no code implementations • 23 Sep 2019 • Chitralekha Gupta, Emre Yilmaz, Haizhou Li
Automatic lyrics alignment and transcription in polyphonic music are challenging tasks because the singing vocals are corrupted by the background music.
Audio and Speech Processing • Sound
no code implementations • 25 Jun 2019 • Chitralekha Gupta, Emre Yilmaz, Haizhou Li
In this work, we propose (1) using additional speech and music-informed features and (2) adapting the acoustic models trained on a large amount of solo singing vocals towards polyphonic music using a small amount of in-domain data.
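A sketch of the adaptation step (2), framed here as fine-tuning a solo-singing acoustic model on a small polyphonic set with a reduced learning rate; the model, data loader, and CTC-style objective are assumptions, not the authors' exact setup:

```python
import torch

def adapt_acoustic_model(model, polyphonic_loader, epochs=3, lr=1e-5):
    """Fine-tune an acoustic model pretrained on solo singing vocals
    using a small amount of in-domain polyphonic data."""
    criterion = torch.nn.CTCLoss()  # assuming a CTC-style training objective
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for feats, targets, feat_lens, target_lens in polyphonic_loader:
            # model output assumed to be (time, batch, vocab), as CTCLoss expects
            log_probs = model(feats).log_softmax(dim=-1)
            loss = criterion(log_probs, targets, feat_lens, target_lens)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```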