Connectionist Temporal Classification Loss

Connectionist Temporal Classification (CTC) Loss is designed for tasks that need an alignment between sequences when that alignment is difficult to obtain, e.g. aligning each character to its location in an audio file. It calculates a loss between a continuous (unsegmented) time series and a target sequence by summing over the probabilities of all possible alignments of input to target, producing a loss value that is differentiable with respect to each input node. The alignment of input to target is assumed to be “many-to-one”, which constrains the target sequence length to be $\leq$ the input length.
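The sum over alignments described above can be sketched with the standard forward (dynamic programming) recursion. The following is a minimal illustration, not a production implementation: the function name `ctc_loss`, the argument layout (`probs[t][k]` as per-step class probabilities), and the choice of index 0 for the blank symbol are assumptions made for this example. It works with plain probabilities for readability; practical implementations usually work in log space for numerical stability.

```python
import math


def ctc_loss(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (illustrative sketch).

    probs[t][k] is the probability of symbol k at time step t.
    target is a list of symbol indices that does not contain `blank`.
    """
    # Interleave blanks around every label: [a, b] -> [_, a, _, b, _]
    ext = [blank]
    for symbol in target:
        ext += [symbol, blank]
    T, S = len(probs), len(ext)

    # alpha[s] = total probability of all partial alignments that
    # end at position s of the extended target after the current step.
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new_alpha = [0.0] * S
        for s in range(S):
            total = alpha[s]                  # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]         # advance by one position
            # Skipping the blank between two labels is allowed only
            # when the labels differ (repeats need a blank in between).
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end on the last label or on the trailing blank.
    likelihood = alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf


# Two time steps, two classes (blank=0, "a"=1), uniform probabilities.
# Three alignments collapse to "a": (a, _), (_, a), (a, a), each with
# probability 0.25, so the likelihood is 0.75.
uniform = [[0.5, 0.5], [0.5, 0.5]]
loss = ctc_loss(uniform, [1])
```

In this toy case `loss` equals `-log(0.75)`, roughly 0.288. If the target cannot fit into the input (for example a two-label target with a single time step), no valid alignment exists and the sketch returns infinity, which mirrors the length constraint stated above.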

Latest Papers

PAPER DATE
Recurrence-free unconstrained handwritten text recognition using gated fully convolutional network
Denis Coquenet, Clément Chatelain, Thierry Paquet
2020-12-09
Hybrid Transformer/CTC Networks for Hardware Efficient Voice Triggering
Saurabh Adya, Vineet Garg, Siddharth Sigtia, Pramod Simha, Chandra Dhir
2020-08-05
Multi-view Frequency LSTM: An Efficient Frontend for Automatic Speech Recognition
Maarten Van Segbroeck, Harish Mallidih, Brian King, I-Fan Chen, Gurpreet Chadha, Roland Maas
2020-06-30
End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020
Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi
2020-06-04
Reducing Spelling Inconsistencies in Code-Switching ASR using Contextualized CTC Loss
Burin Naowarat, Thananchai Kongthaworn, Korrawe Karunratanakul, Sheng Hui Wu, Ekapol Chuangsuwanich
2020-05-16
QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions
Samuel Kriman, Stanislav Beliaev, Boris Ginsburg, Jocelyn Huang, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Li, Yang Zhang
2019-10-22
LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models
Dilip Kumar Margam, Rohith Aralikatti, Tanay Sharma, Abhinav Thanda, Pujitha A K, Sharad Roy, Shankar M Venkatesan
2019-06-25
Reinterpreting CTC training as iterative fitting
Hongzhu Li, Weiqiang Wang
2019-04-24
Adversarial Generation of Handwritten Text Images Conditioned on Sequences
Eloi Alonso, Bastien Moysset, Ronaldo Messina
2019-03-01
CTCModel: a Keras Model for Connectionist Temporal Classification
Yann Soullard, Cyprien Ruffino, Thierry Paquet
2019-01-23
Accurate, Data-Efficient, Unconstrained Text Recognition with Convolutional Neural Networks
Mohamed Yousef, Khaled F. Hussain, Usama S. Mohammed
2018-12-31
Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture
Stavros Petridis, Themos Stafylakis, Pingchuan Ma, Georgios Tzimiropoulos, Maja Pantic
2018-09-28
Deep Audio-Visual Speech Recognition
Triantafyllos Afouras, Joon Son Chung, Andrew Senior, Oriol Vinyals, Andrew Zisserman
2018-09-06
Improved training of end-to-end attention models for speech recognition
Albert Zeyer, Kazuki Irie, Ralf Schlüter, Hermann Ney
2018-05-08
Comparison of Decoding Strategies for CTC Acoustic Models
Thomas Zenkel, Ramon Sanabria, Florian Metze, Jan Niehues, Matthias Sperber, Sebastian Stüker, Alex Waibel
2017-08-15
Emotion Recognition From Speech With Recurrent Neural Networks
Vladimir Chernykh, Pavel Prikhodko
2017-01-27
Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition
Hagen Soltau, Hank Liao, Hasim Sak
2016-10-31
