2 code implementations • 14 Aug 2023 • Giorgio Fabbro, Stefan Uhlich, Chieh-Hsin Lai, Woosung Choi, Marco Martínez-Ramírez, WeiHsiang Liao, Igor Gadelha, Geraldo Ramos, Eddie Hsu, Hugo Rodrigues, Fabian-Robert Stöter, Alexandre Défossez, Yi Luo, Jianwei Yu, Dipam Chakraborty, Sharada Mohanty, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Nabarun Goswami, Tatsuya Harada, Minseok Kim, Jun Hyung Lee, Yuanliang Dong, Xinran Zhang, Jiafeng Liu, Yuki Mitsufuji
We propose a formalization of the errors that can occur in the design of a training dataset for MSS systems and introduce two new datasets that simulate such errors: SDXDB23_LabelNoise and SDXDB23_Bleeding1.
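The two error types can be illustrated with a toy simulation. This is a hypothetical sketch, not the official recipe behind SDXDB23_LabelNoise or SDXDB23_Bleeding1: `add_bleeding` leaks a fraction of every other stem into each stem, and `add_label_noise` occasionally swaps two stem labels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "stems": rows are instruments (e.g. vocals, bass, drums), columns are samples.
stems = rng.standard_normal((3, 8))

def add_bleeding(stems, bleed=0.1):
    """Simulate microphone bleeding: each corrupted stem also contains a small
    amount of every other stem (illustrative mixing model)."""
    total = stems.sum(axis=0, keepdims=True)
    # stem_i + bleed * (sum of the other stems)
    return stems + bleed * (total - stems)

def add_label_noise(stems, p=0.3, rng=rng):
    """Simulate labelling errors: with probability p, swap the labels
    (row positions) of two randomly chosen stems."""
    stems = stems.copy()
    if rng.random() < p:
        i, j = rng.choice(len(stems), size=2, replace=False)
        stems[[i, j]] = stems[[j, i]]
    return stems
```

A separation model trained on such corrupted stems learns the corruption, which is what makes these dataset-design errors worth formalizing.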
1 code implementation • 14 Aug 2023 • Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman Solovyev, Alexander Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji
A significant source of this improvement was making the simulated data better match real cinematic audio, which we further investigate in detail.
1 code implementation • 13 May 2023 • Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji
We modify the target network, i.e., the network architecture of the original DNN-based MSS, by adding bridging paths between the output instruments so that they can share information.
1 code implementation • 13 Dec 2022 • Yassir Bendou, Vincent Gripon, Bastien Pasdeloup, Lukas Mauch, Stefan Uhlich, Fabien Cardinaux, Ghouthi Boukli Hacene, Javier Alonso Garcia
Such a set is hardly available in few-shot learning scenarios, a shortcoming that is largely overlooked in the field.
1 code implementation • 4 Nov 2022 • Junghyun Koo, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Kyogu Lee, Yuki Mitsufuji
We propose an end-to-end music mixing style transfer system that converts the mixing style of an input multitrack to that of a reference song.
1 code implementation • 24 Aug 2022 • Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Giorgio Fabbro, Stefan Uhlich, Chihiro Nagashima, Yuki Mitsufuji
Music mixing traditionally involves recording instruments in the form of clean, individual tracks and blending them into a final mixture using audio effects and expert knowledge (e.g., that of a mixing engineer).
no code implementations • 21 Mar 2022 • Bac Nguyen, Fabien Cardinaux, Stefan Uhlich
Using this differentiable duration method, we introduce AutoTTS, a direct text-to-waveform speech synthesis model.
no code implementations • 3 Feb 2022 • Johannes Imort, Giorgio Fabbro, Marco A. Martínez Ramírez, Stefan Uhlich, Yuichiro Koyama, Yuki Mitsufuji
Given the recent advances in music source separation and automatic mixing, removing audio effects in music tracks is a meaningful step toward developing an automated remixing system.
no code implementations • 8 Oct 2021 • Ali Aroudi, Stefan Uhlich, Marc Ferras Font
We evaluate the network on a realistic and challenging reverberant dataset, generated from measured room impulse responses of an actual microphone array.
1 code implementation • 31 Aug 2021 • Yuki Mitsufuji, Giorgio Fabbro, Stefan Uhlich, Fabian-Robert Stöter, Alexandre Défossez, Minseok Kim, Woosung Choi, Chin-Yun Yu, Kin-Wai Cheuk
The main differences from past challenges are that 1) the competition is designed to make it easier for machine learning practitioners from other disciplines to participate, 2) evaluation is performed on a hidden test set, created by music professionals exclusively for the challenge, to ensure transparency, i.e., the test set is accessible to no one except the challenge organizers, and 3) the dataset covers a wider range of music genres and involves a greater number of mixing engineers.
no code implementations • 26 May 2021 • Koichi Saito, Stefan Uhlich, Giorgio Fabbro, Yuki Mitsufuji
Furthermore, we propose a noise augmentation scheme for mixture-invariant training (MixIT) that makes it applicable in such scenarios as well.
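The core MixIT objective can be sketched compactly. In this minimal version (which omits the paper's noise augmentation and uses a brute-force search over assignments), a model separates a mixture-of-mixtures into M sources, and the loss is the best MSE over all binary assignments of sources back to the two reference mixtures:

```python
import itertools
import numpy as np

def mixit_loss(est_sources, mix1, mix2):
    """Mixture-invariant training (MixIT) loss: search over all binary
    assignments of the M estimated sources to the two reference mixtures
    and return the minimum reconstruction MSE.
    est_sources: (M, T) array of signals separated from mix1 + mix2."""
    M = est_sources.shape[0]
    best = np.inf
    # Each source belongs to exactly one mixture: 2**M candidate assignments.
    for assign in itertools.product([0, 1], repeat=M):
        a = np.array(assign)
        m1 = est_sources[a == 0].sum(axis=0)  # sources assigned to mixture 1
        m2 = est_sources[a == 1].sum(axis=0)  # sources assigned to mixture 2
        loss = np.mean((m1 - mix1) ** 2) + np.mean((m2 - mix2) ** 2)
        best = min(best, loss)
    return best
```

Because the minimum is taken over assignments, the loss is invariant to the order in which the network emits its sources, which is what lets MixIT train on unlabelled mixtures.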
no code implementations • 24 Mar 2021 • Ghouthi Boukli Hacene, Lukas Mauch, Stefan Uhlich, Fabien Cardinaux
We call this procedure DNN Quantization with Attention (DQA).
1 code implementation • 12 Feb 2021 • Takuya Narihira, Javier Alonso Garcia, Fabien Cardinaux, Akio Hayakawa, Masato Ishii, Kazunori Iwaki, Thomas Kemp, Yoshiyuki Kobayashi, Lukas Mauch, Akira Nakamura, Yukio Obuchi, Andrew Shin, Kenji Suzuki, Stephen Tiedmann, Stefan Uhlich, Takuya Yashima, Kazuki Yoshiyama
While there exists a plethora of deep learning tools and frameworks, the fast-growing complexity of the field brings new demands and challenges, such as more flexible network design, fast computation in distributed settings, and compatibility between different tools.
5 code implementations • 8 Oct 2020 • Ryosuke Sawata, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji
This paper proposes several improvements for music separation with deep neural networks (DNNs), namely a multi-domain loss (MDL) and two combination schemes.
Ranked #20 on Music Source Separation on MUSDB18
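The idea of a multi-domain loss can be sketched as a weighted sum of a time-domain error and a magnitude-spectrogram error. This is an illustrative version only: the blockwise FFT stands in for a proper STFT, and the weighting `lam` is a hypothetical parameter, not the paper's exact formulation.

```python
import numpy as np

def multi_domain_loss(est, ref, n_fft=64, lam=1.0):
    """Combine a time-domain MSE with a magnitude-spectrum MSE
    (a sketch of a multi-domain loss for source separation)."""
    time_loss = np.mean((est - ref) ** 2)

    def mag(x):
        # Non-overlapping frames + rFFT magnitude as a simple STFT stand-in.
        frames = x[: len(x) // n_fft * n_fft].reshape(-1, n_fft)
        return np.abs(np.fft.rfft(frames, axis=1))

    freq_loss = np.mean((mag(est) - mag(ref)) ** 2)
    return time_loss + lam * freq_loss
```

Optimizing in both domains penalizes errors that a purely time-domain loss under-weights, such as spectral artifacts with small sample-wise amplitude.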
no code implementations • 15 May 2020 • Mohammad Asif Khan, Fabien Cardinaux, Stefan Uhlich, Marc Ferras, Asja Fischer
This procedure has the drawback that the generated magnitude spectrogram may not be consistent; consistency is required to find a phase such that the full spectrogram corresponds to a natural-sounding speech waveform.
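Consistency here means that the spectrogram is the STFT of some time-domain signal. A simple way to measure inconsistency (a sketch using SciPy's `stft`/`istft`, not the paper's metric) is to invert the spectrogram, re-analyze it, and compare:

```python
import numpy as np
from scipy.signal import stft, istft

def inconsistency(spec, nperseg=64):
    """Distance of a complex spectrogram from the set of consistent ones:
    || spec - STFT(iSTFT(spec)) ||. Near zero iff spec is the STFT of
    some time-domain signal."""
    _, x = istft(spec, nperseg=nperseg)
    _, _, spec2 = stft(x, nperseg=nperseg)
    n = min(spec.shape[1], spec2.shape[1])  # guard against off-by-one frames
    return np.linalg.norm(spec[:, :n] - spec2[:, :n])

rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)
_, _, true_spec = stft(signal, nperseg=64)       # consistent by construction
fake_spec = (rng.standard_normal(true_spec.shape)
             + 1j * rng.standard_normal(true_spec.shape))  # generic array: inconsistent
```

A generated magnitude spectrogram that is far from consistent has no phase that makes it an exact STFT, which is why phase-reconstruction methods such as Griffin-Lim can only approximate a waveform for it.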
no code implementations • NIPS Workshop CDNNRIA 2018 • Fabien Cardinaux, Stefan Uhlich, Kazuki Yoshiyama, Javier Alonso Garcia, Lukas Mauch, Stephen Tiedemann, Thomas Kemp, Akira Nakamura
For each layer, we learn a value dictionary and an assignment matrix to represent the network weights.
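The dictionary-plus-assignment representation can be sketched with a 1-D k-means: a small value dictionary and a per-weight assignment to its nearest entry. The paper learns both during training by backpropagation; this illustrative version replaces that with alternating nearest-neighbour assignment and mean updates.

```python
import numpy as np

def lut_quantize(weights, k=4, iters=10):
    """Represent a weight tensor with a k-entry value dictionary `d` plus an
    assignment of every weight to one entry (1-D k-means sketch)."""
    w = weights.ravel()
    # Initialise the dictionary from evenly spaced quantiles of the weights.
    d = np.quantile(w, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        # Assignment step: each weight picks its nearest dictionary value.
        assign = np.argmin(np.abs(w[:, None] - d[None, :]), axis=1)
        # Update step: each dictionary value becomes the mean of its weights.
        for j in range(k):
            if np.any(assign == j):
                d[j] = w[assign == j].mean()
    return d[assign].reshape(weights.shape), d, assign
```

After quantization, each layer only needs the k dictionary values plus a small integer index per weight, which is the source of the memory savings.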
1 code implementation • The Journal of Open Source Software 2019 • Fabian-Robert Stöter, Stefan Uhlich, Antoine Liutkus, Yuki Mitsufuji
Music source separation is the task of decomposing music into its constituent components, e.g., yielding separated stems for the vocals, bass, and drums.
Ranked #14 on Music Source Separation on MUSDB18 (using extra training data)
2 code implementations • ICLR 2020 • Stefan Uhlich, Lukas Mauch, Fabien Cardinaux, Kazuki Yoshiyama, Javier Alonso Garcia, Stephen Tiedemann, Thomas Kemp, Akira Nakamura
Since choosing the optimal bitwidths is not straightforward, training methods that can learn them are desirable.
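The role of the bitwidth can be seen in a forward-only sketch of a symmetric uniform quantizer. In the paper both the step size and the bitwidth are treated as trainable parameters (with straight-through gradients); this illustrative version shows only how they determine the quantization grid:

```python
import numpy as np

def uniform_quantize(x, step, bits):
    """Symmetric signed uniform quantizer: `bits` fixes the number of levels,
    `step` fixes their spacing. Forward pass only (no gradient handling)."""
    n_levels = 2 ** (bits - 1) - 1  # e.g. bits=8 -> levels in [-127, 127]
    return step * np.clip(np.round(x / step), -n_levels, n_levels)
```

Lowering `bits` shrinks the representable range (values beyond it saturate), while `step` trades range against resolution; learning both per layer is what removes the manual bitwidth search.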
no code implementations • 13 Nov 2018 • Fabien Cardinaux, Stefan Uhlich, Kazuki Yoshiyama, Javier Alonso García, Stephen Tiedemann, Thomas Kemp, Akira Nakamura
In this paper, we introduce a training method called look-up table quantization (LUT-Q), which learns a dictionary and assigns each weight to one of the dictionary's values.
1 code implementation • 7 Jul 2018 • Joachim Muth, Stefan Uhlich, Nathanael Perraudin, Thomas Kemp, Fabien Cardinaux, Yuki Mitsufuji
Music source separation with deep neural networks typically relies only on amplitude features.
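Going beyond amplitude-only inputs means deriving features from the complex spectrum. A minimal sketch (illustrative; the paper's exact feature set may differ) extracts the magnitude together with the raw phase and its wrapped frame-to-frame difference, a simple proxy for instantaneous frequency:

```python
import numpy as np

def phase_features(frames):
    """From blockwise real-FFT frames of shape (T, N), return the amplitude
    features plus two simple phase-based features."""
    spec = np.fft.rfft(frames, axis=-1)   # complex spectrum, shape (T, N//2 + 1)
    mag = np.abs(spec)                    # amplitude features (the usual input)
    phase = np.angle(spec)                # raw phase in (-pi, pi]
    # Frame-to-frame phase difference, re-wrapped to (-pi, pi] via the unit circle.
    dphase = np.angle(np.exp(1j * np.diff(phase, axis=0)))
    return mag, phase, dphase
```

Feeding such phase-derived features alongside the magnitude gives the separation network access to timing information that the amplitude spectrogram discards.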