Search Results for author: Ron Hoory

Found 11 papers, 1 paper with code

Creating an African American-Sounding TTS: Guidelines, Technical Challenges, and Surprising Evaluations

no code implementations • 17 Mar 2024 • Claudio Pinhanez, Raul Fernandez, Marcelo Grave, Julio Nogima, Ron Hoory

Representations of AI agents in user interfaces and robotics are predominantly White, not only in terms of facial and skin features, but also in the synthetic voices they use.

Attribute

Speak While You Think: Streaming Speech Synthesis During Text Generation

no code implementations • 20 Sep 2023 • Avihu Dekel, Slava Shechtman, Raul Fernandez, David Haws, Zvi Kons, Ron Hoory

Experimental results show that LLM2Speech maintains the teacher's quality while reducing the latency to enable natural conversations.

Speech Synthesis, Text Generation

Towards a Common Speech Analysis Engine

no code implementations • 1 Mar 2022 • Hagai Aronowitz, Itai Gat, Edmilson Morais, Weizhong Zhu, Ron Hoory

Beyond that, a common engine should be capable of supporting distributed training with client in-house private data.

Emotion Recognition, Language Identification, +1

Speech Emotion Recognition using Self-Supervised Features

no code implementations • ICASSP 2022 • Edmilson Morais, Ron Hoory, Weizhong Zhu, Itai Gat, Matheus Damasceno, Hagai Aronowitz

Self-supervised pre-trained features have consistently delivered state-of-the-art results in the field of natural language processing (NLP); however, their merits in the field of speech emotion recognition (SER) still need further investigation.

Speech Emotion Recognition

Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems

no code implementations • 8 Oct 2020 • Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny

Assuming we have additional text-to-intent data (without speech) available, we investigated two techniques to improve the S2I system: (1) transfer learning, in which acoustic embeddings for intent classification are tied to fine-tuned BERT text embeddings; and (2) data augmentation, in which the text-to-intent data is converted into speech-to-intent data using a multi-speaker text-to-speech system.

Data Augmentation, intent-classification, +2

End-to-End Spoken Language Understanding Without Full Transcripts

no code implementations • 30 Sep 2020 • Hong-Kwang J. Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis Lastras

For our speech-to-entities experiments on the ATIS corpus, both the CTC and attention models showed impressive ability to skip non-entity words: there was little degradation when trained on just entities versus full transcripts.

slot-filling, Slot Filling, +3

Siamese x-vector reconstruction for domain adapted speaker recognition

no code implementations • 28 Jul 2020 • Shai Rozenberg, Hagai Aronowitz, Ron Hoory

With the rise of voice-activated applications, the need for speaker recognition is rapidly increasing.

Domain Adaptation, Speaker Recognition

High quality, lightweight and adaptable TTS using LPCNet

no code implementations • 2 May 2019 • Zvi Kons, Slava Shechtman, Alex Sorin, Carmel Rabinovitz, Ron Hoory

We first demonstrate the ability of the system to produce high quality speech when trained on large, high quality datasets.

Audio and Speech Processing, Sound
