1 code implementation • 14 Mar 2024 • Ilyass Moummad, Nicolas Farrugia, Romain Serizel, Jeremy Froidevaux, Vincent Lostanlen
Multi-label imbalanced classification poses a significant challenge in machine learning, particularly evident in bioacoustics where animal sounds often co-occur, and certain sounds are much less frequent than others.
1 code implementation • 11 Sep 2023 • Daniel Haider, Vincent Lostanlen, Martin Ehler, Peter Balazs
Numerical simulations align with our theory and suggest that the condition number of a convolutional layer follows a logarithmic scaling law between the number and length of the filters, which is reminiscent of discrete wavelet bases.
2 code implementations • 25 Jul 2023 • Vincent Lostanlen, Daniel Haider, Han Han, Mathieu Lagrange, Peter Balazs, Martin Ehler
Waveform-based deep learning faces a dilemma between nonparametric and parametric approaches.
1 code implementation • 15 Jun 2023 • Ines Nolasco, Burooj Ghani, Shubhr Singh, Ester Vidaña-Vila, Helen Whitehead, Emily Grout, Michael Emmerson, Frants Jensen, Ivan Kiskin, Joe Morford, Ariana Strandburg-Peshkin, Lisa Gill, Hanna Pamuła, Vincent Lostanlen, Dan Stowell
Few-shot bioacoustic event detection consists in detecting sound events of specified types, in varying soundscapes, while having access to only a few examples of the class of interest.
1 code implementation • 24 Jan 2023 • Cyrus Vahidi, Han Han, Changhong Wang, Mathieu Lagrange, György Fazekas, Vincent Lostanlen
Computer musicians refer to mesostructures as the intermediate levels of articulation between the microstructure of waveshapes and the macrostructure of musical forms.
1 code implementation • 7 Jan 2023 • Han Han, Vincent Lostanlen, Mathieu Lagrange
On the other hand, mean square error in the spectrotemporal domain, known as "spectral loss", is perceptually motivated and serves in differentiable digital signal processing (DDSP).
3 code implementations • 18 Apr 2022 • John Muradeli, Cyrus Vahidi, Changhong Wang, Han Han, Vincent Lostanlen, Mathieu Lagrange, George Fazekas
Joint time-frequency scattering (JTFS) is a convolutional operator in the time-frequency domain which extracts spectrotemporal modulations at various rates and scales.
no code implementations • 19 Sep 2020 • Christopher Ick, Vincent Lostanlen
Deep convolutional networks (convnets) show a remarkable ability to learn disentangled representations.
no code implementations • 11 Sep 2020 • Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, Juan Pablo Bello
In this article, we describe our data collection procedure and propose evaluation metrics for multilabel classification of urban sound tags.
1 code implementation • 21 Jul 2020 • Vincent Lostanlen, Christian El-Hajj, Mathias Rossignol, Grégoire Lafay, Joakim andén, Mathieu Lagrange
Furthermore, it minimizes triplet loss in the cluster graph by means of the large-margin nearest neighbor (LMNN) metric learning algorithm.
1 code implementation • 20 Jul 2020 • Han Han, Vincent Lostanlen
Disentangling and recovering physical attributes, such as shape and material, from a few waveform examples is a challenging inverse problem in audio signal processing, with numerous applications in musical acoustics as well as structural engineering.
no code implementations • 2 Mar 2020 • Vincent Lostanlen, Alice Cohen-Hadria, Juan Pablo Bello
With the aim of constructing a biologically plausible model of machine listening, we study the representation of a multicomponent stationary signal by a wavelet scattering network.
no code implementations • 1 Nov 2019 • Vincent Lostanlen, Kaitlin Palmer, Elly Knight, Christopher Clark, Holger Klinck, Andrew Farnsworth, Tina Wong, Jason Cramer, Juan Pablo Bello
This paper proposes to perform unsupervised detection of bioacoustic events by pooling the magnitudes of spectrogram frames after per-channel energy normalization (PCEN).
1 code implementation • 22 Oct 2019 • Vincent Lostanlen, Sripathi Sridhar, Brian McFee, Andrew Farnsworth, Juan Pablo Bello
To explain the consonance of octaves, music psychologists represent pitch as a helix where azimuth and axial coordinate correspond to pitch class and pitch height respectively.
1 code implementation • 20 May 2019 • Vincent Lostanlen, Justin Salamon, Andrew Farnsworth, Steve Kelling, Juan Pablo Bello
As a case study, we consider the problem of detecting avian flight calls from a ten-hour recording of nocturnal bird migration, recorded by a network of six ARUs in the presence of heterogeneous background noise.
2 code implementations • 28 Dec 2018 • Mathieu Andreux, Tomás Angles, Georgios Exarchakis, Roberto Leonarduzzi, Gaspar Rochette, Louis Thiry, John Zarka, Stéphane Mallat, Joakim andén, Eugene Belilovsky, Joan Bruna, Vincent Lostanlen, Muawiz Chaudhary, Matthew J. Hirn, Edouard Oyallon, Sixin Zhang, Carmine Cella, Michael Eickenberg
The wavelet scattering transform is an invariant signal representation suitable for many signal processing and machine learning applications.
1 code implementation • 1 Oct 2018 • Vincent Lostanlen
We introduce a new multidimensional representation, named eigenprogression transform, that characterizes some essential patterns of Western tonal harmony while being equivariant to time shifts and pitch transpositions.
Sound Audio and Speech Processing
1 code implementation • 3 Jan 2016 • Vincent Lostanlen, Stéphane Mallat
We present a new representation of harmonic sounds that linearizes the dynamics of pitch and spectral envelope, while remaining stable to deformations in the time-frequency plane.