no code implementations • 3 Jun 2022 • Juliette Millet, Charlotte Caucheteux, Pierre Orhan, Yves Boubenec, Alexandre Gramfort, Ewan Dunbar, Christophe Pallier, Jean-Remi King
These elements, resulting from the largest neuroimaging benchmark to date, show how self-supervised learning can account for a rich organization of speech processing in the brain, and thus delineate a path to identify the laws of language acquisition which shape the human brain.
no code implementations • CoNLL (EMNLP) 2021 • Juliette Millet, Ioana Chitoran, Ewan Dunbar
Our native language influences the way we perceive speech sounds, affecting our ability to discriminate non-native sounds.
no code implementations • ACL 2022 • Juliette Millet, Ewan Dunbar
We show that the CPC model shows a small native language effect, but that wav2vec 2.0 and HuBERT seem to develop a universal speech perception space which is not language specific.
no code implementations • 25 Feb 2021 • Juliette Millet, Jean-Remi King
Third, learning to process phonetically-related speech inputs (i.e., Dutch vs English) leads deep nets to reach higher levels of brain-similarity than learning to process phonetically-distant speech inputs (i.e., Dutch vs Bengali).
1 code implementation • 12 Oct 2020 • Juliette Millet, Ewan Dunbar
In this paper, we present a data set and methods to compare speech processing models and human behaviour on a phone discrimination task.
no code implementations • 7 May 2020 • Juliette Millet, Ewan Dunbar
We show that DeepSpeech, a standard English speech recognizer, is more specialized on English phoneme discrimination than English listeners, and is poorly correlated with their behaviour, even though it yields a low error on the decision task given to humans.
3 code implementations • 27 Nov 2018 • Juliette Millet, Neil Zeghidour
We extend this approach to paralinguistic classification and propose a neural network that can learn a filterbank, a normalization factor and a compression power from the raw speech, jointly with the rest of the architecture.