Search Results for author: David Gimeno-Gómez

Found 6 papers, 3 papers with code

AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies

1 code implementation · 20 Feb 2024 · José-M. Acosta-Triana, David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

In order to promote research on low-resource languages for audio-visual speech technologies, we present AnnoTheia, a semi-automatic annotation toolkit that detects when a person is speaking in a scene and provides the corresponding transcription.
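
The toolkit pairs active-speaker detection with automatic transcription so that a human annotator only has to review candidate segments. Below is a minimal sketch of such a detect-then-transcribe loop, assuming hypothetical detector and recognizer callables; the names `detect_active_speaker_segments` and `transcribe` are placeholders, not the actual AnnoTheia API.

```python
# Hypothetical detect-then-transcribe annotation loop (not AnnoTheia's API).
from dataclasses import dataclass

@dataclass
class Candidate:
    start: float        # segment start time (seconds)
    end: float          # segment end time (seconds)
    speaker_track: int  # face track judged to be speaking
    text: str           # automatic transcription, to be corrected by a human

def annotate(video_path: str, detect_active_speaker_segments, transcribe) -> list[Candidate]:
    """Produce candidate annotations for later manual review."""
    candidates = []
    # Assumed interface: detector yields (start, end, face_track) tuples.
    for start, end, track in detect_active_speaker_segments(video_path):
        text = transcribe(video_path, start, end)  # assumed ASR callable
        candidates.append(Candidate(start, end, track, text))
    return candidates
```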

Comparison of Conventional Hybrid and CTC/Attention Decoders for Continuous Visual Speech Recognition

no code implementations · 20 Feb 2024 · David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

Thanks to the rise of deep learning and the availability of large-scale audio-visual databases, recent advances have been achieved in Visual Speech Recognition (VSR).

Tasks: Decoder, Speech Recognition (+1 more)
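
For context, CTC/attention decoders of this kind typically score hypotheses with a weighted log-linear combination of the two model probabilities. The snippet below is a generic sketch of that fusion rule, not code from the paper; the weight `ctc_weight` is an assumed hyperparameter.

```python
def joint_score(log_p_ctc: float, log_p_att: float, ctc_weight: float = 0.3) -> float:
    """Hybrid CTC/attention decoding score:
    score = lambda * log P_ctc(y|x) + (1 - lambda) * log P_att(y|x)
    """
    return ctc_weight * log_p_ctc + (1.0 - ctc_weight) * log_p_att

# Toy usage: pick the hypothesis with the highest joint score.
hyps = {"hello world": (-4.2, -3.1), "hello word": (-5.0, -3.0)}
best = max(hyps, key=lambda h: joint_score(*hyps[h]))
print(best)  # "hello world"
```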

Analysis of Visual Features for Continuous Lipreading in Spanish

no code implementations · 21 Nov 2023 · David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

In this paper, we present an analysis of different visual speech features, aiming to identify which of them best captures the nature of lip movements for natural Spanish and thus best supports the automatic visual speech recognition task.

Tasks: Lipreading, Speech Recognition (+1 more)
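
As an illustration of what a visual speech feature can be, the sketch below crops a mouth region of interest around lip landmarks and flattens it into a per-frame feature vector. The landmark indexing and crop size are assumptions for the example, not the feature configurations evaluated in the paper.

```python
import numpy as np

def mouth_roi_features(frame: np.ndarray, landmarks: np.ndarray,
                       size: int = 32) -> np.ndarray:
    """Crop a square mouth ROI centred on the lip landmarks and flatten it.

    frame:     (H, W) grayscale image
    landmarks: (N, 2) facial landmark coordinates; lip points assumed to be
               the last 20 entries (an assumption for this sketch)
    """
    lips = landmarks[-20:]
    cx, cy = lips.mean(axis=0).astype(int)   # lip centroid
    half = size // 2
    roi = frame[cy - half:cy + half, cx - half:cx + half]
    return roi.astype(np.float32).ravel() / 255.0  # normalised pixel features
```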
