Search Results for author: Aleksandr Laptev

Found 12 papers, 2 papers with code

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

no code implementations • 18 Oct 2023 • Tae Jin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg

We present the NVIDIA NeMo team's multi-channel speech recognition system for the 7th CHiME Challenge Distant Automatic Speech Recognition (DASR) Task, focusing on the development of a multi-channel, multi-speaker speech recognition system tailored to transcribe speech from distributed microphones and microphone arrays.

Automatic Speech Recognition, speaker-diarization, +3 more

Confidence-based Ensembles of End-to-End Speech Recognition Models

no code implementations • 27 Jun 2023 • Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg

Second, we demonstrate that it is possible to combine base and adapted models to achieve strong results on both original and target data.

Language Identification, Model Selection, +2 more

Powerful and Extensible WFST Framework for RNN-Transducer Losses

no code implementations • 18 Mar 2023 • Aleksandr Laptev, Vladimir Bataev, Igor Gitman, Boris Ginsburg

This paper presents a framework based on Weighted Finite-State Transducers (WFST) to simplify the development of modifications for RNN-Transducer (RNN-T) loss.

CTC Variations Through New WFST Topologies

no code implementations • 6 Oct 2021 • Aleksandr Laptev, Somshubra Majumdar, Boris Ginsburg

This paper presents novel Weighted Finite-State Transducer (WFST) topologies to implement Connectionist Temporal Classification (CTC)-like algorithms for automatic speech recognition.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1 more

Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset

no code implementations • 15 Jun 2020 • Andrei Andrusenko, Aleksandr Laptev, Ivan Medennikov

This paper presents an exploration of end-to-end automatic speech recognition (ASR) systems for OpenSTT, the largest open-source Russian language data set.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1 more

You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation

no code implementations • 14 May 2020 • Aleksandr Laptev, Roman Korostik, Aleksey Svischev, Andrei Andrusenko, Ivan Medennikov, Sergey Rybin

Data augmentation is one of the most effective ways to make end-to-end automatic speech recognition (ASR) perform close to the conventional hybrid approach, especially when dealing with low-resource tasks.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +4 more

Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription

1 code implementation • 22 Apr 2020 • Andrei Andrusenko, Aleksandr Laptev, Ivan Medennikov

To demonstrate this, we use the CHiME-6 Challenge data as an example of challenging environments and noisy conditions of everyday speech.

Data Augmentation, Speech Enhancement, +2 more

Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems

no code implementations • 19 Mar 2020 • Nikolay Malkovsky, Vladimir Bataev, Dmitrii Sviridkin, Natalia Kizhaeva, Aleksandr Laptev, Ildar Valiev, Oleg Petrov

The problem of out-of-vocabulary (OOV) words is typical for any speech recognition system: hybrid systems are usually constructed to recognize a fixed set of words and can rarely include all the words that will be encountered while the system is in use.

graph construction, speech-recognition, +1 more
