no code implementations • 21 Mar 2024 • Shrishail Baligar, Mikolaj Kegler, Bryce Irvin, Marko Stamenovic, Shawn Newsam
First, we explore the utility of context by providing the target sound extraction (TSE) model with oracle information about which sound classes make up the input mixture; the model's objective is to extract one or more sources of interest indicated by the user.
1 code implementation • 18 Mar 2024 • Tornike Karchkhadze, Hassan Salami Kavaki, Mohammad Rasool Izadi, Bryce Irvin, Mikolaj Kegler, Ari Hertz, Shuo Zhang, Marko Stamenovic
We introduce a new loss term to enhance Foley sound generation in AudioLDM without post-filtering.
1 code implementation • 4 Nov 2022 • Bryce Irvin, Marko Stamenovic, Mikolaj Kegler, Li-Chia Yang
Modern speech enhancement (SE) networks typically implement noise suppression through time-frequency masking, latent representation masking, or discriminative signal prediction.