Search Results for author: Ahmed Hussen Abdelaziz

Found 8 papers, 1 paper with code

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

2 code implementations • 30 Jan 2024 • Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, Barry-John Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe

First, we provide an open-source platform for researchers in the speaker recognition community to effortlessly build models.

Ranked #1 on Speaker Verification on VoxCeleb (using extra training data)

Self-Supervised Learning • Speaker Recognition +1

Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features

no code implementations • 23 Oct 2023 • Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed H Tewfik

Device-directed speech detection (DDSD) is the binary classification task of distinguishing between queries directed at a voice assistant versus side conversation or background speech.

Automatic Speech Recognition • Binary Classification +2

Audiovisual Speech Synthesis using Tacotron2

no code implementations • 3 Aug 2020 • Ahmed Hussen Abdelaziz, Anushree Prasanna Kumar, Chloe Seivwright, Gabriele Fanelli, Justin Binder, Yannis Stylianou, Sachin Kajarekar

The output acoustic features are used to condition a WaveRNN to reconstruct the speech waveform, and the output facial controllers are used to generate the corresponding video of the talking face.

Face Model • Sentence +1

Modality Dropout for Improved Performance-driven Talking Faces

no code implementations • 27 May 2020 • Ahmed Hussen Abdelaziz, Barry-John Theobald, Paul Dixon, Reinhard Knothe, Nicholas Apostoloff, Sachin Kajarekar

We use subjective testing to demonstrate: 1) the improvement of audiovisual-driven animation over the equivalent video-only approach, and 2) the improvement in the animation of speech-related facial movements after introducing modality dropout.

On the Role of Visual Cues in Audiovisual Speech Enhancement

no code implementations • 25 Apr 2020 • Zakaria Aldeneh, Anushree Prasanna Kumar, Barry-John Theobald, Erik Marchi, Sachin Kajarekar, Devang Naik, Ahmed Hussen Abdelaziz

One byproduct of this finding is that the learned visual embeddings can be used as features for other visual speech applications.

Self-Supervised Learning • Speech Enhancement

On Neural Phone Recognition of Mixed-Source ECoG Signals

no code implementations • 12 Dec 2019 • Ahmed Hussen Abdelaziz, Shuo-Yiin Chang, Nelson Morgan, Erik Edwards, Dorothea Kolossa, Dan Ellis, David A. Moses, Edward F. Chang

The emerging field of neural speech recognition (NSR) using electrocorticography has recently attracted remarkable research interest for studying how human brains recognize speech in quiet and noisy surroundings.

Automatic Speech Recognition (ASR) +1

Speaker-Independent Speech-Driven Visual Speech Synthesis using Domain-Adapted Acoustic Models

no code implementations • 15 May 2019 • Ahmed Hussen Abdelaziz, Barry-John Theobald, Justin Binder, Gabriele Fanelli, Paul Dixon, Nicholas Apostoloff, Thibaut Weise, Sachin Kajarekar

We conclude that visual speech synthesis can significantly benefit from the powerful representation of speech in the ASR acoustic models.

Automatic Speech Recognition (ASR) +3

The Tutorbot Corpus --- A Corpus for Studying Tutoring Behaviour in Multiparty Face-to-Face Spoken Dialogue

no code implementations • LREC 2014 • Maria Koutsombogera, Samer Al Moubayed, Bajibabu Bollepalli, Ahmed Hussen Abdelaziz, Martin Johansson, José David Aguas Lopes, Jekaterina Novikova, Catharine Oertel, Kalin Stefanov, Gül Varol

The corpus is targeted and designed towards the development of a dialogue system platform to explore verbal and nonverbal tutoring strategies in multiparty spoken interactions.

Spoken Dialogue Systems
