no code implementations • 7 Aug 2023 • Christian Huber, Tu Anh Dinh, Carlos Mullov, Ngoc Quan Pham, Thai Binh Nguyen, Fabian Retkowski, Stefan Constantin, Enes Yavuz Ugan, Danni Liu, Zhaolin Li, Sai Koneru, Jan Niehues, Alexander Waibel
Secondly, we compare different approaches to low-latency speech translation using this framework.
no code implementations • 9 Jun 2022 • Alexander Waibel, Moritz Behr, Fevziye Irem Eyiokur, Dogucan Yaman, Tuan-Nam Nguyen, Carlos Mullov, Mehmet Arif Demirtas, Alperen Kantarcı, Stefan Constantin, Hazim Kemal Ekenel
The system is designed to combine multiple component models and produces a video of the original speaker speaking in the target language that is lip-synchronous with the target speech, yet preserves the original speaker's speech emphases, voice characteristics, and face video.
Automatic Speech Recognition (ASR)
no code implementations • 8 Apr 2020 • Stefan Constantin, Alex Waibel
If yes, it corrects the second-to-last utterance according to the error correction in the last utterance and outputs the extracted pairs of reparandum and repair entity.
no code implementations • 29 Nov 2019 • Verena Heusser, Niklas Freymuth, Stefan Constantin, Alex Waibel
Speech emotion recognition is a challenging task and an important step towards more natural human-machine interaction.
no code implementations • WS 2019 • Stefan Constantin, Jan Niehues, Alex Waibel
State-of-the-art neural network architectures make it possible to create high-quality spoken language understanding systems with fast processing times.
Natural Language Understanding, Spoken Language Understanding
no code implementations • 17 Dec 2018 • Stefan Constantin, Jan Niehues, Alex Waibel
When building a neural network-based Natural Language Understanding component, one main challenge is to collect enough training data.
no code implementations • 6 Mar 2018 • Stefan Constantin, Jan Niehues, Alex Waibel
Furthermore, by using a feedforward neural network, we are able to generate the output word by word and are no longer restricted to a fixed number of possible response candidates.