Acoustic Scene Classification
37 papers with code • 5 benchmarks • 10 datasets
The goal of acoustic scene classification is to classify a test recording into one of a set of predefined classes that characterize the environment in which it was recorded.
Source: DCASE 2019, DCASE 2018
Most implemented papers
Low-Complexity Models for Acoustic Scene Classification Based on Receptive Field Regularization and Frequency Damping
Deep Neural Networks are known to be very demanding in terms of computing and memory requirements.
Spectrum Correction: Acoustic Scene Classification with Mismatched Recording Devices
This method works for both time and frequency domain representations of audio recordings.
Receptive Field Regularization Techniques for Audio Classification and Tagging with Deep Convolutional Neural Networks
As state-of-the-art CNN architectures, in computer vision and other domains, tend to go deeper in terms of number of layers, their receptive field (RF) size increases, and they therefore degrade in performance on several audio classification and tagging tasks.
Low-complexity acoustic scene classification for multi-device audio: analysis of DCASE 2021 Challenge systems
The most used techniques among the submissions were residual networks and weight quantization, with the top systems reaching over 70% accuracy and log loss under 0.8.
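For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how accuracy and multiclass log loss are commonly computed from predicted class probabilities. It assumes log loss means the mean negative log-probability assigned to the true class; the function name `accuracy_and_log_loss` and the clipping constant `eps` are illustrative choices, not taken from the challenge's evaluation code.

```python
import numpy as np

def accuracy_and_log_loss(probs, labels, eps=1e-15):
    """Return (accuracy, log_loss) for predicted class probabilities.

    probs:  array of shape (n_samples, n_classes), rows sum to 1.
    labels: integer class indices of shape (n_samples,).
    eps:    hypothetical clipping constant that avoids log(0).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Probability the model assigned to the correct class of each sample.
    clipped = np.clip(probs, eps, 1.0)
    true_class_probs = clipped[np.arange(len(labels)), labels]

    # Log loss: mean negative log-likelihood of the true class.
    log_loss = float(-np.mean(np.log(true_class_probs)))

    # Accuracy: fraction of samples whose top-scoring class is correct.
    accuracy = float(np.mean(np.argmax(probs, axis=1) == labels))
    return accuracy, log_loss

# Toy example with three recordings and two scene classes.
example_probs = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
example_labels = [0, 1, 1]
example_accuracy, example_log_loss = accuracy_and_log_loss(example_probs, example_labels)
```

Unlike accuracy, log loss penalizes confident wrong predictions heavily, so a system can rank differently under the two metrics.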
A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer
We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models for cross-domain knowledge transfer, to address acoustic mismatches between training and testing conditions.
Towards Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning
The deployment of machine listening algorithms in real-life applications is often impeded by domain shift caused, for instance, by different microphone characteristics.
Acoustic scene classification using auditory datasets
The approach used not only challenges some of the fundamental mathematical techniques used so far in early experiments of the same trend but also introduces new scopes and new horizons for interesting results.
A Passive Similarity based CNN Filter Pruning for Efficient Acoustic Scene Classification
We propose a passive filter pruning framework, where a few convolutional filters from the CNNs are eliminated to yield compressed CNNs.
Multi-dimensional Edge-based Audio Event Relational Graph Representation Learning for Acoustic Scene Classification
Experiments on a polyphonic acoustic scene dataset show that the proposed ERGL achieves competitive performance on ASC by using only a limited number of embeddings of audio events without any data augmentations.
Efficient Similarity-based Passive Filter Pruning for Compressing CNNs
However, the computational complexity of computing the pairwise similarity matrix is high, particularly when a convolutional layer has many filters.
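To make the cost mentioned above concrete, here is a minimal sketch of a naive pairwise-similarity computation over the filters of one convolutional layer, followed by a simple greedy selection of filters to remove. It assumes cosine similarity between flattened filters and a "drop one filter of the most similar pair" rule; both are illustrative choices, and the helper names `pairwise_cosine_similarity` and `select_filters_to_prune` are hypothetical rather than taken from the papers listed here.

```python
import numpy as np

def pairwise_cosine_similarity(filters):
    """Cosine similarity between all pairs of filters in one layer.

    filters: array of shape (n_filters, in_channels, kernel_h, kernel_w).
    Returns an (n_filters, n_filters) matrix. For n filters of d weights
    each, this costs on the order of n * n * d multiply-adds.
    """
    flat = filters.reshape(filters.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)  # guard against all-zero filters
    return unit @ unit.T

def select_filters_to_prune(filters, n_prune):
    """Greedily pick n_prune filter indices that are redundant.

    Assumed rule: repeatedly find the most similar remaining pair and
    drop the filter with the larger index.
    """
    n_filters = filters.shape[0]
    if not 0 <= n_prune < n_filters:
        raise ValueError("n_prune must be in [0, n_filters)")

    sim = pairwise_cosine_similarity(filters)
    np.fill_diagonal(sim, -np.inf)  # ignore self-similarity

    pruned = []
    for _ in range(n_prune):
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        drop = int(max(i, j))
        pruned.append(drop)
        # Exclude the dropped filter from all later comparisons.
        sim[drop, :] = -np.inf
        sim[:, drop] = -np.inf
    return sorted(pruned)

# Toy layer: filters 0 and 2 point in the same direction, so one is redundant.
toy_filters = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [-1.0, 0.0]]).reshape(4, 1, 1, 2)
toy_pruned = select_filters_to_prune(toy_filters, n_prune=1)
```

The selection is "passive" in the sense that it looks only at the filter weights and needs no training data; the quadratic growth of the similarity matrix with the number of filters is the cost the entry above refers to.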