no code implementations • 23 Aug 2023 • Purnima Kamath, Chitralekha Gupta, Lonce Wyse, Suranga Nanayakkara
By using a few synthetic examples to indicate the presence or absence of a semantic attribute, we infer the guidance vectors in the latent space of the StyleGAN to control that attribute during generation.
no code implementations • 23 Apr 2023 • Chitralekha Gupta, Purnima Kamath, Yize Wei, Zhuoyao Li, Suranga Nanayakkara, Lonce Wyse
In this paper, we propose a data-driven approach to train a Generative Adversarial Network (GAN) conditioned on "soft-labels" distilled from the penultimate layer of an audio classifier trained on a target set of audio texture classes.
no code implementations • 23 Aug 2022 • Chitralekha Gupta, Yize Wei, Zequn Gong, Purnima Kamath, Zhuoyao Li, Lonce Wyse
These metrics use deep features that summarize the statistics of any given audio texture, thus being inherently sensitive to variations in the statistical parameters that define an audio texture.
no code implementations • 27 Jun 2022 • Lonce Wyse, Purnima Kamath, Chitralekha Gupta
We introduce a new system for data-driven audio sound model design built around two different neural network architectures, a Generative Adversarial Network(GAN) and a Recurrent Neural Network (RNN), that takes advantage of the unique characteristics of each to achieve the system objectives that neither is capable of addressing alone.
no code implementations • 29 Sep 2021 • Lonce Wyse, Purnima Kamath, Chitralekha Gupta
We introduce a new system for data-driven audio sound model design built around two different neural network architectures, a Generative Adversarial Network(GAN) and a Recurrent Neural Network (RNN), that takes advantage of the unique characteristics of each to achieve the system objectives that neither is capable of addressing alone.
no code implementations • 12 Mar 2021 • Chitralekha Gupta, Purnima Kamath, Lonce Wyse
Generative Adversarial Networks (GANs) currently achieve the state-of-the-art sound synthesis quality for pitched musical instruments using a 2-channel spectrogram representation consisting of log magnitude and instantaneous frequency (the "IFSpectrogram").