no code implementations • 16 Apr 2024 • Matthieu Futeral, Andrea Agostinelli, Marco Tagliasacchi, Neil Zeghidour, Eugene Kharitonov
Using these datasets, we demonstrate that our proposed metrics achieve a stronger agreement with the ground-truth diversity than baselines.
no code implementations • 22 Jun 2023 • Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara Sainath, Johan Schalkwyk, Matt Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirović, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, Neil Zeghidour, Yu Zhang, Zhishuai Zhang, Lukas Zilka, Christian Frank
AudioPaLM inherits the capability to preserve paralinguistic information such as speaker identity and intonation from AudioLM and the linguistic knowledge present only in text large language models such as PaLM-2.
no code implementations • 17 Jun 2023 • Eugene Kharitonov, Oksana Zakharchuk, Lin Mei
This study investigates the long-term effects of temperature variations on economic growth using a data-driven approach.
1 code implementation • 16 May 2023 • Zalán Borsos, Matt Sharifi, Damien Vincent, Eugene Kharitonov, Neil Zeghidour, Marco Tagliasacchi
We present SoundStorm, a model for efficient, non-autoregressive audio generation.
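Non-autoregressive generation of this kind typically fills in all masked token positions in parallel and commits only the most confident predictions each round. The following toy sketch illustrates such a MaskGIT-style decoding schedule; `toy_predict` is a random stand-in for the model's output distribution, not SoundStorm's actual network:

```python
import numpy as np

def parallel_decode(length, steps, predict, seed=0):
    """Iteratively unmask tokens: each round, predict every masked position
    in parallel and commit a growing fraction of the most confident ones."""
    rng = np.random.default_rng(seed)
    tokens = np.full(length, -1)  # -1 marks a masked position
    for step in range(steps):
        masked = np.flatnonzero(tokens == -1)
        if masked.size == 0:
            break
        probs = predict(tokens, masked, rng)  # (num_masked, vocab)
        best = probs.argmax(axis=1)
        conf = probs.max(axis=1)
        # Commit the most confident predictions; the fraction grows per step.
        keep = max(1, int(masked.size * (step + 1) / steps))
        order = np.argsort(-conf)[:keep]
        tokens[masked[order]] = best[order]
    return tokens

def toy_predict(tokens, masked, rng):
    # Stand-in for a real model: random softmax-normalized scores.
    logits = rng.normal(size=(masked.size, 32))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

out = parallel_decode(length=20, steps=4, predict=toy_predict)
print(out)  # all 20 positions decoded in only 4 parallel rounds
```

The point of the schedule is that a sequence of hundreds of audio tokens needs only a handful of forward passes, instead of one pass per token as in autoregressive decoding.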
5 code implementations • 7 Sep 2022 • Zalán Borsos, Raphaël Marinier, Damien Vincent, Eugene Kharitonov, Olivier Pietquin, Matt Sharifi, Dominik Roblek, Olivier Teboul, David Grangier, Marco Tagliasacchi, Neil Zeghidour
We introduce AudioLM, a framework for high-quality audio generation with long-term consistency.
no code implementations • 30 Mar 2022 • Tu Anh Nguyen, Eugene Kharitonov, Jade Copet, Yossi Adi, Wei-Ning Hsu, Ali Elkahky, Paden Tomasello, Robin Algayres, Benoit Sagot, Abdelrahman Mohamed, Emmanuel Dupoux
We introduce dGSLM, the first "textless" model able to generate audio samples of naturalistic spoken dialogues.
1 code implementation • NAACL (ACL) 2022 • Eugene Kharitonov, Jade Copet, Kushal Lakhotia, Tu Anh Nguyen, Paden Tomasello, Ann Lee, Ali Elkahky, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi
Textless spoken language processing research aims to extend the applicability of the standard NLP toolset to spoken language and to languages with few or no textual resources.
no code implementations • 14 Nov 2021 • Felix Kreuk, Adam Polyak, Jade Copet, Eugene Kharitonov, Tu-Anh Nguyen, Morgane Rivière, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi
We use a decomposition of the speech signal into discrete learned representations, consisting of phonetic-content units, prosodic features, speaker, and emotion.
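A common way to obtain such discrete units (the paper quantizes self-supervised features; this sketch substitutes a plain k-means over synthetic frame features, purely for illustration) is to cluster frame-level representations and replace each frame with its cluster index:

```python
import numpy as np

def kmeans_quantize(frames, k, iters=20):
    """Cluster frame-level features with k-means; return one unit ID per frame."""
    # Deterministic init: spread starting centroids evenly across the frames.
    idx = np.linspace(0, len(frames) - 1, k).astype(int)
    centroids = frames[idx].copy()
    units = np.zeros(len(frames), dtype=int)
    for _ in range(iters):
        # Assign each frame to its nearest centroid.
        dists = np.linalg.norm(frames[:, None, :] - centroids[None, :, :], axis=-1)
        units = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned frames.
        for j in range(k):
            if np.any(units == j):
                centroids[j] = frames[units == j].mean(axis=0)
    return units

# Synthetic "features": three well-separated groups of 50 frames each.
rng = np.random.default_rng(1)
frames = np.concatenate(
    [rng.normal(loc=c, scale=0.1, size=(50, 4)) for c in (0.0, 5.0, 10.0)]
)
units = kmeans_quantize(frames, k=3)
print(units[:10])  # discrete unit IDs, one per frame
```

The resulting ID sequence is what downstream models consume in place of text, with prosody, speaker, and emotion carried by separate streams.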
no code implementations • 6 Oct 2021 • Eugene Kharitonov, Marco Baroni, Dieuwke Hupkes
In this work, we demonstrate that the size of the subword vocabulary learned by Byte-Pair Encoding (BPE) greatly affects both the ability and the tendency of standard Transformer models to memorize training data, even when we control for the number of learned parameters.
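The variable being manipulated here is the number of BPE merge operations: more merges yield a larger subword vocabulary and shorter tokenized sequences. A minimal sketch over a hypothetical toy corpus (not the paper's setup) makes the mechanics concrete:

```python
from collections import Counter

def bpe_merges(corpus, num_merges):
    """Learn up to `num_merges` BPE merge rules over a whitespace-split corpus."""
    # Represent each word as a tuple of symbols (initially single characters).
    words = Counter(tuple(w) for w in corpus.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the merge: fuse every occurrence of the best pair.
        merged = {}
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] = merged.get(tuple(out), 0) + freq
        words = merged
    return merges, words

corpus = "low low low lower lower newest newest newest newest wider"
for n in (0, 5, 15):
    merges, words = bpe_merges(corpus, n)
    total_tokens = sum(len(w) * f for w, f in words.items())
    print(f"{n:>2} merges -> {total_tokens} tokens in corpus")
```

Shorter sequences per training example is one route by which a larger vocabulary can make verbatim memorization easier for a fixed-capacity model.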
1 code implementation • ACL 2022 • Eugene Kharitonov, Ann Lee, Adam Polyak, Yossi Adi, Jade Copet, Kushal Lakhotia, Tu-Anh Nguyen, Morgane Rivière, Abdelrahman Mohamed, Emmanuel Dupoux, Wei-Ning Hsu
Generative Spoken Language Modeling (GSLM) (Lakhotia et al., 2021) is the only prior work addressing the generative aspect of speech pre-training; it replaces text with discovered phone-like units for language modeling and shows the ability to generate meaningful novel sentences.
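The core move, treating discovered unit IDs as if they were text tokens and training a language model over them, can be illustrated with a toy bigram model (a deliberately simple stand-in for GSLM's Transformer LM; the unit IDs below are made up):

```python
import random
from collections import defaultdict, Counter

def train_bigram(unit_sequences):
    """Count unit-to-unit transitions, standing in for a unit language model."""
    counts = defaultdict(Counter)
    for seq in unit_sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts

def sample(counts, start, length, seed=0):
    """Generate a novel unit sequence by sampling transitions."""
    rng = random.Random(seed)
    seq = [start]
    for _ in range(length - 1):
        nxt_counts = counts.get(seq[-1])
        if not nxt_counts:
            break
        units, weights = zip(*nxt_counts.items())
        seq.append(rng.choices(units, weights=weights)[0])
    return seq

# Toy "discovered units" (e.g. cluster IDs assigned to speech frames).
corpus = [[3, 7, 7, 1, 4], [3, 7, 1, 4, 4], [2, 3, 7, 1, 4]]
lm = train_bigram(corpus)
generated = sample(lm, start=3, length=6)
print(generated)  # a novel unit sequence; a vocoder would turn it into speech
```

In the full pipeline the generated unit sequence is resynthesized to audio, so the LM never touches text at any point.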
no code implementations • EMNLP (BlackboxNLP) 2021 • Rahma Chaabouni, Roberto Dessì, Eugene Kharitonov
We present several focused modifications of Transformer that greatly improve generalization capabilities on SCAN and select one that remains on par with a vanilla Transformer on a standard machine translation (MT) task.
1 code implementation • NeurIPS 2021 • Roberto Dessì, Eugene Kharitonov, Marco Baroni
As deep networks begin to be deployed as autonomous agents, the issue of how they can communicate with each other becomes important.
no code implementations • 29 Apr 2021 • Ewan Dunbar, Mathieu Bernard, Nicolas Hamilakis, Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Eugene Kharitonov, Emmanuel Dupoux
We present the Zero Resource Speech Challenge 2021, which asks participants to learn a language model directly from audio, without any text or labels.
2 code implementations • 1 Apr 2021 • Adam Polyak, Yossi Adi, Jade Copet, Eugene Kharitonov, Kushal Lakhotia, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux
We propose using self-supervised discrete representations for the task of speech resynthesis.
1 code implementation • 2 Jul 2020 • Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux
Contrastive Predictive Coding (CPC), which predicts future segments of speech from past segments, is emerging as a powerful algorithm for representation learning of speech signals.
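CPC trains with the InfoNCE objective: given an encoding of the past, the model must identify the true future segment among a set of negatives. A minimal numpy sketch of the loss (illustrative shapes and synthetic vectors, not the paper's encoder):

```python
import numpy as np

def info_nce(context, future, negatives):
    """InfoNCE: classify the true future among negatives via dot-product scores.

    context:   (d,)    encoding of the past frames
    future:    (d,)    encoding of the true future segment
    negatives: (n, d)  encodings of distractor segments
    """
    candidates = np.vstack([future[None, :], negatives])  # true future at index 0
    scores = candidates @ context                          # (n + 1,) similarities
    scores -= scores.max()                                 # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())      # log-softmax
    return -log_probs[0]  # cross-entropy with the true future as the target

rng = np.random.default_rng(0)
d, n = 16, 8
context = rng.normal(size=d)
future = context + 0.1 * rng.normal(size=d)  # true future resembles the context
negatives = rng.normal(size=(n, d))          # random distractors
loss = info_nce(context, future, negatives)
print(f"loss = {loss:.3f}")
```

Minimizing this loss pushes the context encoding toward representations that are predictive of the future, which is what makes the learned features useful downstream.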
1 code implementation • ICLR 2021 • Eugene Kharitonov, Rahma Chaabouni
Sequence-to-sequence (seq2seq) learners are widely used, but we still have only limited knowledge about what inductive biases shape the way they generalize.
1 code implementation • ACL 2020 • Rahma Chaabouni, Eugene Kharitonov, Diane Bouchacourt, Emmanuel Dupoux, Marco Baroni
Third, while compositionality is not necessary for generalization, it provides an advantage in terms of language transmission: The more compositional a language is, the more easily it will be picked up by new learners, even when the latter differ in architecture from the original agents.
1 code implementation • EMNLP (BlackboxNLP) 2020 • Eugene Kharitonov, Marco Baroni
Studies of discrete languages emerging when neural agents communicate to solve a joint task often look for evidence of compositional structure.
no code implementations • IJCNLP 2019 • Eugene Kharitonov, Rahma Chaabouni, Diane Bouchacourt, Marco Baroni
There is renewed interest in simulating language emergence among deep neural agents that communicate to jointly solve a task, spurred by the practical aim to develop language-enabled interactive AIs, as well as by theoretical questions about the evolution of human language.
1 code implementation • ICML 2020 • Eugene Kharitonov, Rahma Chaabouni, Diane Bouchacourt, Marco Baroni
There is growing interest in studying the languages that emerge when neural agents are jointly trained to solve tasks requiring communication through a discrete channel.
1 code implementation • NeurIPS 2019 • Rahma Chaabouni, Eugene Kharitonov, Emmanuel Dupoux, Marco Baroni
Despite renewed interest in emergent language simulations with neural networks, little is known about the basic properties of the induced code, and how they compare to human language.
1 code implementation • ACL 2019 • Rahma Chaabouni, Eugene Kharitonov, Alessandro Lazaric, Emmanuel Dupoux, Marco Baroni
We train models to communicate about paths in a simple gridworld, using miniature languages that reflect or violate various natural language trends, such as the tendency to avoid redundancy or to minimize long-distance dependencies.