Speech Recognition

266 papers with code · Speech

Speech recognition is the task of recognising speech within audio and converting it into text.

(Image credit: SpecAugment)

Latest papers without code

Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability

30 Jul 2020

Because of its streaming nature, recurrent neural network transducer (RNN-T) is a very promising end-to-end (E2E) model that may replace the popular hybrid model for automatic speech recognition.

SPEECH RECOGNITION

Exploiting Cross-Lingual Knowledge in Unsupervised Acoustic Modeling for Low-Resource Languages

29 Jul 2020

The first problem concerns unsupervised discovery of basic (subword level) speech units in a given language.

LANGUAGE ACQUISITION · SPEECH RECOGNITION

Team Deep Mixture of Experts for Distributed Power Control

28 Jul 2020

In the context of wireless networking, it was recently shown that multiple DNNs can be jointly trained to offer a desired collaborative behaviour capable of coping with a broad range of sensing uncertainties.

SPEECH RECOGNITION

Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition

27 Jul 2020

Unlike previous work on this topic, which performs on-the-fly limited-size beam-search decoding and generates alignment scores for expected edit-distance computation, in our proposed method, we re-calculate and sum scores of all the possible alignments for each hypothesis in N-best lists.

END-TO-END SPEECH RECOGNITION · SPEECH RECOGNITION
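
The minimum word error rate objective mentioned in the entry above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each N-best hypothesis already comes with a total log-score (in the paper that score would be obtained by summing over all alignments, which is not reproduced here), and the names `word_errors` and `mwer_loss` are hypothetical.

```python
import math


def word_errors(ref, hyp):
    """Word-level Levenshtein distance between two whitespace-tokenised strings."""
    r, h = ref.split(), hyp.split()
    prev = list(range(len(h) + 1))  # distances from the empty reference prefix
    for i, rw in enumerate(r, start=1):
        cur = [i]
        for j, hw in enumerate(h, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (rw != hw)))  # substitution or match
        prev = cur
    return prev[-1]


def mwer_loss(ref, nbest):
    """Expected number of word errors over an N-best list.

    nbest: list of (hypothesis_text, log_score) pairs. The log-scores are
    assumed (for this sketch) to be total hypothesis log-probabilities.
    """
    top = max(score for _, score in nbest)
    weights = [math.exp(score - top) for _, score in nbest]  # stable softmax
    total = sum(weights)
    posteriors = [w / total for w in weights]  # renormalised over the N-best list
    errors = [word_errors(ref, hyp) for hyp, _ in nbest]
    return sum(p * e for p, e in zip(posteriors, errors))


if __name__ == "__main__":
    nbest = [("the cat sat", 0.0), ("the cat", 0.0)]
    print(mwer_loss("the cat sat", nbest))
```

Renormalising the scores over the N-best list keeps the expectation well defined even though the list covers only part of the hypothesis space.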

Train Like a (Var)Pro: Efficient Training of Neural Networks with Variable Projection

26 Jul 2020

To solve the optimization problem more efficiently, we propose the use of variable projection (VarPro), a method originally designed for separable nonlinear least-squares problems.

IMAGE CLASSIFICATION · SPEECH RECOGNITION
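
The idea of variable projection for a separable nonlinear least-squares problem can be sketched on a toy model. This is an illustration only, not the paper's neural-network method: it assumes a hypothetical one-parameter model y ≈ w · exp(-θt), eliminates the linear weight w in closed form, and minimises the reduced objective over θ with a golden-section search (which assumes the reduced objective is unimodal on the search interval). All function names are hypothetical.

```python
import math


def best_linear_weight(theta, ts, ys):
    """Closed-form least-squares weight w for a fixed nonlinear parameter theta."""
    phi = [math.exp(-theta * t) for t in ts]
    return sum(p * y for p, y in zip(phi, ys)) / sum(p * p for p in phi)


def reduced_loss(theta, ts, ys):
    """Squared error after projecting out the linear weight (depends on theta only)."""
    w = best_linear_weight(theta, ts, ys)
    return sum((w * math.exp(-theta * t) - y) ** 2 for t, y in zip(ts, ys))


def fit_varpro(ts, ys, lo=0.0, hi=5.0, iters=100):
    """Golden-section search on the reduced objective over theta in [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if reduced_loss(c, ts, ys) < reduced_loss(d, ts, ys):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    theta = (a + b) / 2
    return theta, best_linear_weight(theta, ts, ys)


if __name__ == "__main__":
    ts = [0.0, 0.5, 1.0, 1.5, 2.0]
    ys = [2.0 * math.exp(-1.3 * t) for t in ts]  # synthetic, noise-free data
    print(fit_varpro(ts, ys))
```

Because the linear weight is always set to its optimum, the outer search only has to explore the nonlinear parameter, which is the efficiency argument behind variable projection.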

Video Super Resolution Based on Deep Learning: A comprehensive survey

25 Jul 2020

In recent years, deep learning has made great progress in the fields of image recognition, video analysis, natural language processing and speech recognition, including video super-resolution tasks.

SPEECH RECOGNITION · VIDEO SUPER-RESOLUTION

Consistent Transcription and Translation of Speech

24 Jul 2020

To address various shortcomings of this paradigm, recent work explores end-to-end trainable direct models that translate without transcribing.

SPEECH RECOGNITION

Online spatio-temporal learning in deep neural networks

24 Jul 2020

This aspect remains in stark contrast to learning with error backpropagation through time (BPTT) applied to recurrent neural networks (RNNs), or recently even to biologically-inspired spiking neural networks (SNNs), because the unrolling through time of BPTT leads to system-locking problems.

LANGUAGE MODELLING · SPEECH RECOGNITION

Applying GPGPU to Recurrent Neural Network Language Model based Fast Network Search in the Real-Time LVCSR

23 Jul 2020

Recurrent Neural Network Language Models (RNNLMs) have started to be used in various fields of speech recognition due to their outstanding performance.

LANGUAGE MODELLING · LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION · SPEECH RECOGNITION

Sequential Routing Framework: Fully Capsule Network-based Speech Recognition

23 Jul 2020

Each sliced window is classified to a label at the corresponding time through iterative routing mechanisms.

SPEECH RECOGNITION