Audio Classification

128 papers with code • 23 benchmarks • 33 datasets

Audio classification is a machine learning task that assigns audio signals to classes or categories. The goal is to enable machines to automatically recognize and distinguish between different types of audio, such as music, speech, and environmental sounds.

Latest papers with no code

Mixer is more than just a model

no code yet • 28 Feb 2024

In computer vision, MLP-Mixer is noted for its ability to extract information from both the channel and token perspectives, effectively fusing channel and token information.

Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data

no code yet • 7 Feb 2024

This study assesses deep learning models for audio classification in a clinical setting with the constraint of small datasets reflecting real-world prospective data collection.

On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification

no code yet • 2 Feb 2024

In recent years, self-supervised learning has excelled at learning robust feature representations from unlabelled data.

From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers

no code yet • 16 Jan 2024

We introduce multi-phase training of audio spectrogram transformers by connecting the seminal idea of coarse-to-fine with transformer models.

Class-Incremental Learning for Multi-Label Audio Classification

no code yet • 9 Jan 2024

Experiments are performed on a dataset with 50 sound classes, with an initial classification task containing 30 base classes and 4 incremental phases of 5 classes each.
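
The split described above (30 base classes plus 4 phases of 5 classes, 50 in total) can be sketched as a simple task schedule. This is only an illustration of the arithmetic, not the paper's procedure: the function name `class_incremental_schedule`, the integer class ids, the random shuffle, and the seed are all assumptions, since the excerpt does not say how classes are ordered or assigned to phases.

```python
import random


def class_incremental_schedule(num_classes=50, num_base=30,
                               num_phases=4, per_phase=5, seed=0):
    """Partition class ids into one base task and several incremental phases.

    Hypothetical helper: the shuffle and seed are assumptions made for
    illustration, not details taken from the paper.
    """
    if num_base + num_phases * per_phase != num_classes:
        raise ValueError("base classes plus incremental classes must equal num_classes")

    classes = list(range(num_classes))
    random.Random(seed).shuffle(classes)  # assumed: random class order

    # Task 0 holds the base classes; each later task adds `per_phase` new ones.
    schedule = [classes[:num_base]]
    for phase in range(num_phases):
        start = num_base + phase * per_phase
        schedule.append(classes[start:start + per_phase])
    return schedule


schedule = class_incremental_schedule()
seen = 0
for task_id, task_classes in enumerate(schedule):
    seen += len(task_classes)
    print(f"task {task_id}: {len(task_classes)} new classes, {seen} seen so far")
```

After the final phase the model has seen all 50 classes, and no class appears in more than one task.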

CLAPP: Contrastive Language-Audio Pre-training in Passive Underwater Vessel Classification

no code yet • 4 Jan 2024

Existing research on audio classification faces challenges in recognizing attributes of passive underwater vessel scenarios and lacks well-annotated datasets due to data privacy concerns.

On the choice of the optimal temporal support for audio classification with Pre-trained embeddings

no code yet • 21 Dec 2023

Choosing the best pre-trained embedding model for a set of tasks is the subject of many recent publications.

Formal Verification of Long Short-Term Memory based Audio Classifiers: A Star based Approach

no code yet • 16 Nov 2023

Formally verifying audio classification systems is essential to ensure accurate signal classification across real-world applications like surveillance, automotive voice commands, and multimedia content management, preventing potential errors with serious consequences.

Pruning random resistive memory for optimizing analogue AI

no code yet • 13 Nov 2023

Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning to optimize the topology of a randomly weighted analogue resistive memory neural network.

Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities

no code yet • 9 Nov 2023

We propose a multimodal model, called Mirasol3B, consisting of an autoregressive component for the time-synchronized modalities (audio and video), and an autoregressive component for the context modalities which are not necessarily aligned in time but are still sequential.