Search Results for author: Zvi Kons

Found 7 papers, 1 paper with code

Speak While You Think: Streaming Speech Synthesis During Text Generation

no code implementations • 20 Sep 2023 • Avihu Dekel, Slava Shechtman, Raul Fernandez, David Haws, Zvi Kons, Ron Hoory

Experimental results show that LLM2Speech maintains the teacher's quality while reducing the latency to enable natural conversations.

Speech Synthesis • Text Generation
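The title's core idea, starting to speak while the text response is still being generated rather than waiting for the full output, can be illustrated with a minimal sketch. The functions below (generate_text_chunks, synthesize_chunk) are hypothetical placeholders, not the paper's LLM2Speech models; the sketch only shows how overlapping generation and synthesis hides latency.

```python
import time

def generate_text_chunks(prompt):
    """Hypothetical stand-in for an LLM emitting text incrementally."""
    for chunk in ["Sure, ", "the weather today ", "is sunny ", "with a light breeze."]:
        time.sleep(0.3)   # pretend per-chunk generation latency
        yield chunk

def synthesize_chunk(text):
    """Hypothetical stand-in for an incremental TTS front end."""
    time.sleep(0.1)       # pretend synthesis latency
    return f"<audio for: {text!r}>"

def speak_while_you_think(prompt):
    # Start speaking as soon as the first chunk arrives instead of
    # waiting for the full text: perceived latency is roughly one
    # chunk, not the whole generation time.
    for chunk in generate_text_chunks(prompt):
        audio = synthesize_chunk(chunk)
        print(audio)      # in a real system, stream this to playback

if __name__ == "__main__":
    speak_while_you_think("How is the weather?")
```

The quality claim in the abstract refers to matching a non-streaming teacher model, which this sketch does not attempt to capture.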

Extending RNN-T-based speech recognition systems with emotion and language classification

no code implementations • 28 Jul 2022 • Zvi Kons, Hagai Aronowitz, Edmilson Morais, Matheus Damasceno, Hong-Kwang Kuo, Samuel Thomas, George Saon

We propose using a recurrent neural network transducer (RNN-T)-based speech-to-text (STT) system as a common component that can be used for emotion recognition and language identification as well as for speech recognition.

Emotion Classification • Emotion Recognition +3
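A minimal sketch of the idea in the abstract above: reuse the speech recognizer's acoustic encoder as a shared representation and attach small classification heads for emotion and language. The PyTorch module below is an illustration under assumed layer sizes and mean-pooling, not the authors' RNN-T system.

```python
import torch
import torch.nn as nn

class SharedSTTClassifier(nn.Module):
    """Illustrative sketch: reuse an RNN-T-style speech encoder for
    emotion and language classification (layer sizes are assumptions)."""

    def __init__(self, feat_dim=80, hidden=512, n_emotions=4, n_languages=8):
        super().__init__()
        # Stand-in for the RNN-T transcription network (acoustic encoder).
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=2, batch_first=True)
        # Lightweight heads on top of a pooled encoder representation.
        self.emotion_head = nn.Linear(hidden, n_emotions)
        self.language_head = nn.Linear(hidden, n_languages)

    def forward(self, features):          # features: (batch, time, feat_dim)
        enc_out, _ = self.encoder(features)
        pooled = enc_out.mean(dim=1)      # mean-pool over time
        return self.emotion_head(pooled), self.language_head(pooled)

# Example: a batch of 2 utterances, 200 frames of 80-dim features each.
model = SharedSTTClassifier()
emotion_logits, language_logits = model(torch.randn(2, 200, 80))
```

The appeal of this layout is that the expensive acoustic encoder is trained and deployed once, while each additional task only adds a small head.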

Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems

no code implementations • 8 Oct 2020 • Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny

Assuming we have additional text-to-intent data (without speech) available, we investigated two techniques to improve the S2I system: (1) transfer learning, in which acoustic embeddings for intent classification are tied to fine-tuned BERT text embeddings; and (2) data augmentation, in which the text-to-intent data is converted into speech-to-intent data using a multi-speaker text-to-speech system.

Data Augmentation • intent-classification +2
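The first technique above, tying acoustic embeddings to fine-tuned BERT text embeddings, can be sketched as an auxiliary loss that pulls a pooled utterance embedding toward the corresponding text embedding during intent-classification training. Everything below (dimensions, the MSE tying loss, the plain linear classifier, and the name text_emb for the BERT sentence embedding) is an assumption for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class S2IModel(nn.Module):
    """Illustrative speech-to-intent model whose utterance embedding is
    tied to a text embedding (e.g. from a fine-tuned BERT)."""

    def __init__(self, feat_dim=80, emb_dim=768, n_intents=20):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, emb_dim, batch_first=True)
        self.classifier = nn.Linear(emb_dim, n_intents)

    def forward(self, features):               # (batch, time, feat_dim)
        enc_out, _ = self.encoder(features)
        acoustic_emb = enc_out.mean(dim=1)      # pooled utterance embedding
        return acoustic_emb, self.classifier(acoustic_emb)

def training_loss(model, features, text_emb, intent_labels, tie_weight=1.0):
    acoustic_emb, logits = model(features)
    ce = F.cross_entropy(logits, intent_labels)   # intent-classification loss
    tie = F.mse_loss(acoustic_emb, text_emb)      # pull toward the text embedding
    return ce + tie_weight * tie

# Toy batch: 4 utterances, 150 frames each; text_emb would come from BERT.
model = S2IModel()
loss = training_loss(model,
                     torch.randn(4, 150, 80),
                     torch.randn(4, 768),
                     torch.randint(0, 20, (4,)))
loss.backward()
```

The second technique would not change this code: text-to-intent pairs synthesized with a multi-speaker TTS system are simply added to the speech-to-intent training data.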

End-to-End Spoken Language Understanding Without Full Transcripts

no code implementations • 30 Sep 2020 • Hong-Kwang J. Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis Lastras

For our speech-to-entities experiments on the ATIS corpus, both the CTC and attention models showed impressive ability to skip non-entity words: there was little degradation when trained on just entities versus full transcripts.

slot-filling • Slot Filling +3
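The "skipping non-entity words" behaviour above corresponds to training the same acoustic model on target sequences that keep only the entity tokens. The snippet below is a generic illustration with torch.nn.CTCLoss and made-up token IDs, not the authors' ATIS setup; it only shows that the training loss is defined the same way whether the targets are full transcripts or entity-only sequences.

```python
import torch
import torch.nn as nn

ctc = nn.CTCLoss(blank=0)

# Assumed toy setup: 1 utterance, 50 encoder frames, a 30-symbol output
# vocabulary (log-probabilities over the vocabulary including blank).
log_probs = torch.randn(50, 1, 30).log_softmax(dim=-1)
input_lengths = torch.tensor([50])

# Full-transcript target vs entity-only target (arbitrary token IDs).
full_transcript = torch.tensor([[5, 9, 12, 3, 7, 21, 14]])
entities_only = torch.tensor([[12, 21]])        # e.g. city names only

loss_full = ctc(log_probs, full_transcript, input_lengths, torch.tensor([7]))
loss_ents = ctc(log_probs, entities_only, input_lengths, torch.tensor([2]))
# Training on entity-only targets encourages the model to emit blanks
# (i.e. "skip") over the non-entity words.
```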

High quality, lightweight and adaptable TTS using LPCNet

no code implementations • 2 May 2019 • Zvi Kons, Slava Shechtman, Alex Sorin, Carmel Rabinovitz, Ron Hoory

We first demonstrate the ability of the system to produce high quality speech when trained on large, high quality datasets.

Audio and Speech Processing • Sound
