1 code implementation • 24 Oct 2023 • Augusto Seben da Rosa, Frederico Santos de Oliveira, Anderson da Silva Soares, Arnaldo Candido Junior
Computer vision in general has seen several advances, such as training optimizations and new architectures (pure attention, efficient blocks, vision-language models, and generative models, among others).
no code implementations • 23 May 2023 • Lucas Rafael Stefanel Gris, Ricardo Marcacini, Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Sandra Maria Aluísio
Automatic speech recognition (ASR) systems play a key role in applications involving human-machine interactions.
Automatic Speech Recognition (ASR)
no code implementations • 25 Nov 2022 • Daniel Peixoto Pinto da Silva, Edresson Casanova, Lucas Rafael Stefanel Gris, Arnaldo Candido Junior, Marcelo Finger, Flaviane Svartman, Beatriz Raposo, Marcus Vinícius Moreira Martins, Sandra Maria Aluísio, Larissa Cristina Berti, João Paulo Teixeira
During the outbreak of the COVID-19 pandemic, several research areas joined efforts to mitigate the damage caused by SARS-CoV-2.
1 code implementation • 14 Oct 2022 • Lucas Rafael Stefanel Gris, Arnaldo Candido Junior, Vinícius G. dos Santos, Bruno A. Papa Dias, Marli Quadros Leite, Flaviane Romani Fernandes Svartman, Sandra Aluísio
The NURC Project, which started in 1969 to study the cultured urban linguistic norm spoken in five Brazilian capitals, was responsible for compiling a large corpus for each capital.
Automatic Speech Recognition (ASR)
1 code implementation • 29 Mar 2022 • Edresson Casanova, Christopher Shulby, Alexander Korolev, Arnaldo Candido Junior, Anderson da Silva Soares, Sandra Aluísio, Moacir Antonelli Ponti
We explore cross-lingual multi-speaker speech synthesis and cross-lingual voice conversion applied to data augmentation for automatic speech recognition (ASR) systems in low/medium-resource scenarios.
Automatic Speech Recognition (ASR)
3 code implementations • 4 Dec 2021 • Edresson Casanova, Julian Weber, Christopher Shulby, Arnaldo Candido Junior, Eren Gölge, Moacir Antonelli Ponti
YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker TTS.
2 code implementations • 14 Oct 2021 • Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Frederico Santos de Oliveira, Lucas Oliveira, Ricardo Corso Fernandes Junior, Daniel Peixoto Pinto da Silva, Fernando Gorgulho Fayet, Bruno Baldissera Carlotto, Lucas Rafael Stefanel Gris, Sandra Maria Aluísio
with 290.77 hours, a publicly available dataset for ASR in BP containing validated pairs (audio-transcription).
Automatic Speech Recognition (ASR)
1 code implementation • 23 Jul 2021 • Lucas Rafael Stefanel Gris, Edresson Casanova, Frederico Santos de Oliveira, Anderson da Silva Soares, Arnaldo Candido Junior
In this sense, this work presents the development of a public Automatic Speech Recognition (ASR) system using only openly available audio data, obtained by fine-tuning the Wav2vec 2.0 XLSR-53 model, pre-trained on many languages, on BP data.
Automatic Speech Recognition (ASR)
2 code implementations • 2 Apr 2021 • Edresson Casanova, Christopher Shulby, Eren Gölge, Nicolas Michael Müller, Frederico Santos de Oliveira, Arnaldo Candido Junior, Anderson da Silva Soares, Sandra Maria Aluisio, Moacir Antonelli Ponti
In this paper, we propose SC-GlowTTS: an efficient zero-shot multi-speaker text-to-speech model that improves similarity for speakers unseen during training.
1 code implementation • 11 May 2020 • Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Frederico Santos de Oliveira, João Paulo Teixeira, Moacir Antonelli Ponti, Sandra Maria Aluisio
Speech provides a natural way for human-computer interaction.
2 code implementations • 25 Feb 2020 • Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Frederico Santos de Oliveira, Lucas Rafael Stefanel Gris, Hamilton Pereira da Silva, Sandra Maria Aluisio, Moacir Antonelli Ponti
We compare the three best architectures trained using our method to select the best one, which is the one with a shallow architecture.