Sound Event Detection
74 papers with code • 4 benchmarks • 18 datasets
Sound Event Detection (SED) is the task of recognizing the sound events in a recording together with their respective start and end times. Sound events in real life do not always occur in isolation, but tend to overlap considerably with each other. Recognizing such overlapping sound events is referred to as polyphonic SED.
Source: A report on sound event detection with different binaural features
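As a rough illustration of the task definition above, the sketch below turns per-class frame probabilities into (label, onset, offset) events by simple thresholding. The function name frames_to_events, the 0.5 threshold, the hop size, and the toy probabilities are all assumptions made for this example; they are not taken from any of the listed papers, and real systems usually add smoothing such as median filtering.

```python
from typing import Dict, List, Tuple


def frames_to_events(
    probs: Dict[str, List[float]],
    hop_seconds: float = 0.02,   # assumed frame hop, illustrative only
    threshold: float = 0.5,      # assumed decision threshold
) -> List[Tuple[str, float, float]]:
    """Convert per-class frame probabilities into (label, onset, offset) events."""
    events = []
    for label, frame_probs in probs.items():
        start = None  # index of the first active frame in the current run
        for i, p in enumerate(frame_probs):
            active = p >= threshold
            if active and start is None:
                start = i
            elif not active and start is not None:
                events.append((label, start * hop_seconds, i * hop_seconds))
                start = None
        if start is not None:  # event still active at the end of the clip
            events.append((label, start * hop_seconds, len(frame_probs) * hop_seconds))
    # Sort by onset time, then label, so overlapping events are easy to read.
    return sorted(events, key=lambda e: (e[1], e[0]))


# Toy polyphonic example: the two classes overlap between 1.0 s and 2.0 s.
probs = {
    "speech": [0.1, 0.8, 0.9, 0.7, 0.2, 0.1],
    "dog_bark": [0.0, 0.0, 0.6, 0.9, 0.8, 0.1],
}
events = frames_to_events(probs, hop_seconds=0.5)
print(events)
```

Because each class is thresholded independently, two events can be active in the same frames, which is what makes the output polyphonic.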
Libraries
Use these libraries to find Sound Event Detection models and implementations
Datasets
Latest papers with no code
Semi-supervised Sound Event Detection with Local and Global Consistency Regularization
Local consistency is adopted to encourage the model to leverage local features for frame-level predictions, and global consistency is applied to force features to align with global prototypes through a specially designed contrastive loss.
Furnishing Sound Event Detection with Language Model Abilities
Recently, the ability of language models (LMs) has attracted increasing attention in visual cross-modality.
DiffSED: Sound Event Detection with Denoising Diffusion
In this work, we reformulate the SED problem by taking a generative learning perspective.
Auditory Neural Response Inspired Sound Event Detection Based on Spectro-temporal Receptive Field
In this work, we use the STRF as the kernel of the first convolutional layer of the SED model to extract a neural response from the input sound, making the SED model more similar to the human auditory system.
Channel-Spatial-Based Few-Shot Bird Sound Event Detection
In this paper, we propose a model for bird sound event detection that focuses on a small number of training samples within the everyday long-tail distribution.
Semi-supervised Learning-based Sound Event Detection using Frequency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4
The proposed FDY with LKA-CRNN with a BEATs embedding network is initially trained on the entire DCASE 2023 Task 4 dataset using the mean-teacher approach, generating pseudo-labels for the weakly labeled data, the unlabeled data, and AudioSet.
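For readers unfamiliar with the mean-teacher approach mentioned in this snippet, the following is a minimal sketch of its two generic ingredients: the teacher's weights track an exponential moving average (EMA) of the student's weights, and the teacher's predictions are thresholded into pseudo-labels. It is not the paper's implementation; the names ema_update and pseudo_labels, the decay value, the threshold, and the toy weights are assumptions chosen for illustration.

```python
from typing import Dict, List


def ema_update(
    teacher: Dict[str, float],
    student: Dict[str, float],
    decay: float = 0.999,  # assumed EMA decay, illustrative only
) -> None:
    """Move each teacher weight toward the matching student weight (in place)."""
    for name, student_value in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * student_value


def pseudo_labels(teacher_probs: List[float], threshold: float = 0.5) -> List[int]:
    """Binarize teacher probabilities into hard pseudo-labels (assumed threshold)."""
    return [1 if p >= threshold else 0 for p in teacher_probs]


# Toy example with a single scalar "weight" per parameter name.
teacher_weights = {"w": 1.0}
student_weights = {"w": 0.0}
ema_update(teacher_weights, student_weights, decay=0.9)

labels = pseudo_labels([0.1, 0.7, 0.5, 0.3])
print(teacher_weights, labels)
```

In practice the weights are tensors and the update runs after every student optimization step, but the arithmetic per parameter is the same as shown here.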
Divided spectro-temporal attention for sound event localization and detection in real scenes for DCASE2023 challenge
Localizing sounds and detecting events in different room environments is a difficult task, mainly due to the wide range of reflections and reverberations.
A Multi-Task Learning Framework for Sound Event Detection using High-level Acoustic Characteristics of Sounds
Sound event detection (SED) entails identifying the type of sound and estimating its temporal boundaries from acoustic signals.
Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations
This work investigates pretrained audio representations for few-shot Sound Event Detection.
Leveraging Audio-Tagging Assisted Sound Event Detection using Weakified Strong Labels and Frequency Dynamic Convolutions
Stage-1 of our proposed framework focuses on audio-tagging (AT), which assists the sound event detection (SED) system in Stage-2.