1 code implementation • 24 Oct 2023 • Florian Schmid, Khaled Koutini, Gerhard Widmer
Audio Spectrogram Transformers are excellent at exploiting large datasets, creating powerful pre-trained models that surpass CNNs when fine-tuned on downstream tasks.
Ranked #1 on Instrument Recognition on OpenMIC-2018 (using extra training data)
1 code implementation • 12 May 2023 • Tobias Morocutti, Florian Schmid, Khaled Koutini, Gerhard Widmer
However, we also show that DIR augmentation and Freq-MixStyle are complementary, achieving a new state-of-the-art performance on signals recorded by devices unseen during training.
1 code implementation • 25 Nov 2022 • Khaled Koutini, Shahed Masoudian, Florian Schmid, Hamid Eghbal-zadeh, Jan Schlüter, Gerhard Widmer
Furthermore, we will show that transformers trained on AudioSet can be extremely effective representation extractors for a wide range of downstream tasks.
2 code implementations • 9 Nov 2022 • Florian Schmid, Khaled Koutini, Gerhard Widmer
We provide models of different complexity levels, scaling from low-complexity models up to a new state-of-the-art performance of .483 mAP on AudioSet.
Ranked #2 on Audio Tagging on AudioSet (using extra training data)