26 papers with code • 3 benchmarks • 13 datasets
Sound Event Detection (SED) is the task of recognizing the sound events and their respective temporal start and end time in a recording. Sound events in real life do not always occur in isolation, but tend to considerably overlap with each other. Recognizing such overlapping sound events is referred as polyphonic SED.
Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.
Ranked #2 on Robust classification on CIFAR-10
To foster the investigation of label noise in sound event classification we present FSDnoisy18k, a dataset containing 42. 5 hours of audio across 20 sound classes, including a small amount of manually-labeled data and a larger quantity of real-world noisy data.
In this work, we treat SED as a multiple instance learning (MIL) problem, where training labels are static over a short excerpt, indicating the presence or absence of sound sources but not their temporal locality.
As a case study, we consider the problem of detecting avian flight calls from a ten-hour recording of nocturnal bird migration, recorded by a network of six ARUs in the presence of heterogeneous background noise.
This paper presents Task 4 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 challenge and provides a first analysis of the challenge results.
Ranked #3 on Sound Event Detection on DESED
In this work, we first describe a CNN based approach for weakly supervised training of audio events.
The number of the channels of the CNNs and size of the weight matrices of the RNNs have a direct effect on the total amount of parameters of the SED method, which is to a couple of millions.
Single task deep neural networks that perform a target task among diverse cross-related tasks in the acoustic scene and event literature are being developed.