Search Results for author: Daniel Stoller

Found 10 papers, 8 papers with code

LLark: A Multimodal Instruction-Following Language Model for Music

1 code implementation • 11 Oct 2023 • Josh Gardner, Simon Durand, Daniel Stoller, Rachel M. Bittner

Music has a unique and complex structure which is challenging for both expert humans and existing AI systems to understand, and presents unique challenges relative to other forms of audio.

Instruction Following • Language Modelling

Contrastive Learning-Based Audio to Lyrics Alignment for Multiple Languages

1 code implementation • 13 Jun 2023 • Simon Durand, Daniel Stoller, Sebastian Ewert

This way, we obtain a novel system that is simple to train end-to-end, can make use of weakly annotated training data, jointly learns a powerful text model, and is tailored to alignment.

Contrastive Learning • speech-recognition +1
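The entry above does not spell out the training objective, but a contrastive audio-to-text setup in this spirit can be sketched roughly as follows. The symmetric InfoNCE loss, temperature, and embedding shapes are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of a contrastive audio-to-text objective (assumed setup;
# the loss form, temperature, and pooling are illustrative, not the paper's).
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(audio_emb, text_emb, temperature=0.07):
    """audio_emb: (B, D) audio-segment embeddings,
    text_emb:  (B, D) embeddings of the matching lyric lines.
    Matched pairs share a batch index; all other pairs act as negatives."""
    audio_emb = F.normalize(audio_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = audio_emb @ text_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(audio_emb.size(0), device=audio_emb.device)
    # Symmetric InfoNCE: audio-to-text and text-to-audio
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

At inference time, a frame-by-token similarity matrix from such encoders could be decoded into a monotonic alignment path, for example with dynamic programming; the decoding step here is likewise an assumption, not a detail from the abstract.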

Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling

1 code implementation • 14 Nov 2019 • Daniel Stoller, Mi Tian, Sebastian Ewert, Simon Dixon

In comparison to TCN and Wavenet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving a comparable performance in all tasks.

Audio Generation • Causal Language Modeling +2
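As background for the causal modelling aspect mentioned above, here is a minimal causal 1D convolution in PyTorch. The channel count and kernel size are placeholders; this is not the Seq-U-Net architecture itself.

```python
# Minimal causal 1D convolution: the output at time t only depends on
# inputs at times <= t (channel count and kernel size are illustrative).
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    def __init__(self, channels=16, kernel_size=3):
        super().__init__()
        self.pad = kernel_size - 1                   # pad on the left only
        self.conv = nn.Conv1d(channels, channels, kernel_size)

    def forward(self, x):                            # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))      # no access to future samples
        return self.conv(x)
```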

Training Generative Adversarial Networks from Incomplete Observations using Factorised Discriminators

1 code implementation • ICLR 2020 • Daniel Stoller, Sebastian Ewert, Simon Dixon

We apply our method to image generation, image segmentation and audio source separation, and obtain improved performance over a standard GAN when additional incomplete training examples are available.

Audio Source Separation • Image Generation +3
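The title suggests splitting the discriminator into factors so that incomplete observations can still provide a training signal; a rough sketch of that idea follows, with the factorisation, network sizes, and update rule assumed purely for illustration and not taken from the paper.

```python
# Sketch: factorised discriminators so that samples with a missing part
# can still train the corresponding marginal discriminator (all shapes
# and the combination rule are illustrative assumptions).
import torch
import torch.nn as nn

d_x = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))    # marginal of x
d_y = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))    # marginal of y
d_xy = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 1))  # dependency term

def discriminator_logit(x, y=None):
    """Complete pairs use all factors; incomplete observations (y is None)
    only contribute to the marginal discriminator for x."""
    if y is None:
        return d_x(x)
    return d_x(x) + d_y(y) + d_xy(torch.cat([x, y], dim=-1))
```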

GAN-based Generation and Automatic Selection of Explanations for Neural Networks

no code implementations • 21 Apr 2019 • Saumitra Mishra, Daniel Stoller, Emmanouil Benetos, Bob L. Sturm, Simon Dixon

However, this requires a careful selection of hyper-parameters to generate interpretable examples for each neuron of interest, and current methods rely on a manual, qualitative evaluation of each setting, which is prohibitively slow.

Ensemble Models for Spoofing Detection in Automatic Speaker Verification

1 code implementation • 9 Apr 2019 • Bhusan Chettri, Daniel Stoller, Veronica Morfi, Marco A. Martínez Ramírez, Emmanouil Benetos, Bob L. Sturm

Our ensemble model outperforms all our single models and the baselines from the challenge for both attack types.

Audio and Speech Processing • Sound

End-to-end Lyrics Alignment for Polyphonic Music Using an Audio-to-Character Recognition Model

2 code implementations • 18 Feb 2019 • Daniel Stoller, Simon Durand, Sebastian Ewert

Time-aligned lyrics can enrich the music listening experience by enabling karaoke, text-based song retrieval, intra-song navigation, and other applications.

Retrieval

Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation

9 code implementations • 8 Jun 2018 • Daniel Stoller, Sebastian Ewert, Simon Dixon

Models for audio source separation usually operate on the magnitude spectrum, which ignores phase information and makes separation performance dependent on hyper-parameters for the spectral front-end.

Audio Source Separation • Music Source Separation
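The excerpt above motivates separating directly in the time domain rather than on a spectrogram. A toy 1D U-Net-style separator along those lines is sketched below; the depth, channel counts, and kernel sizes are simplified assumptions, far smaller than the actual Wave-U-Net.

```python
# Toy time-domain separator in the spirit of a 1D U-Net (illustrative only;
# not the Wave-U-Net architecture from the paper).
import torch
import torch.nn as nn

class TinyWaveUNet(nn.Module):
    def __init__(self, sources=2):
        super().__init__()
        self.down = nn.Conv1d(1, 16, kernel_size=15, stride=2, padding=7)
        self.bottleneck = nn.Conv1d(16, 16, kernel_size=15, padding=7)
        self.up = nn.ConvTranspose1d(16, 16, kernel_size=16, stride=2, padding=7)
        self.out = nn.Conv1d(16 + 1, sources, kernel_size=1)  # skip from the input

    def forward(self, mix):                      # mix: (batch, 1, time)
        h = torch.relu(self.down(mix))           # downsample the waveform
        h = torch.relu(self.bottleneck(h))
        h = torch.relu(self.up(h))               # upsample back towards input rate
        h = h[..., : mix.shape[-1]]              # crop to the input length
        h = torch.cat([h, mix], dim=1)           # reuse the raw waveform as a skip
        return self.out(h)                       # (batch, sources, time)
```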

Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction

3 code implementations • 31 Oct 2017 • Daniel Stoller, Sebastian Ewert, Simon Dixon

Based on this idea, we drive the separator towards outputs deemed as realistic by discriminator networks that are trained to tell apart real from separator samples.

Audio Source Separation • Data Augmentation +1
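The excerpt above describes pushing separator outputs towards the distribution of real isolated sources via discriminators. A minimal sketch of such a combined loss follows; the non-saturating GAN term, the MSE supervised term, and the loss weighting are chosen here for illustration and are not the paper's exact formulation.

```python
# Sketch of a semi-supervised separator loss: a supervised term on labelled
# pairs plus an adversarial term that rewards outputs the discriminator
# judges as real sources (weighting and loss forms are assumptions).
import torch
import torch.nn.functional as F

def separator_loss(sep_out, target, disc, adv_weight=0.01):
    """sep_out: (batch, time) estimated source; target: matching true source
    or None for unlabelled mixtures; disc: maps a signal to a realness logit."""
    logits = disc(sep_out)
    adversarial = F.binary_cross_entropy_with_logits(
        logits, torch.ones_like(logits))          # try to fool the discriminator
    supervised = F.mse_loss(sep_out, target) if target is not None else 0.0
    return supervised + adv_weight * adversarial
```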
