Connectionist Temporal Classification (CTC) loss is designed for tasks that need an alignment between input and output sequences where that alignment is difficult to obtain, e.g. aligning each character to its location in an audio file. It calculates a loss between a continuous (unsegmented) time series and a target sequence by summing the probabilities of all possible alignments of input to target, producing a loss value that is differentiable with respect to each input node. The alignment of input to target is assumed to be “many-to-one”, which limits the length of the target sequence: it must be $\leq$ the input length.
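The sum over alignments can be computed with a forward (dynamic programming) recursion instead of enumerating every alignment. The following is a minimal pure-Python sketch of that idea, not a reference implementation. It assumes the standard CTC formulation with a blank symbol at index 0, and the function name `ctc_neg_log_likelihood` and its arguments are hypothetical, chosen for illustration. It works on plain probabilities, so it is only suitable for short sequences; practical implementations work in log space.

```python
import math

def ctc_neg_log_likelihood(probs, target, blank=0):
    """Return -log p(target | probs), summed over all valid alignments.

    probs:  list of T rows, each a probability distribution over the vocabulary.
    target: list of label indices without blanks.
    blank:  index of the blank symbol (assumed to be 0 here).
    """
    # Interleave blanks around the labels: [a, b] -> [_, a, _, b, _]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S = len(ext)

    # alpha[s]: total probability of alignment prefixes ending at ext[s]
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                  # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]         # advance by one position
            # Skip over a blank only between two different labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid alignments end on the last label or on the trailing blank
    p = alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)
    return -math.log(p) if p > 0 else math.inf

# Two time steps, vocabulary {blank, "a"}, uniform outputs, target "a".
# The alignments "aa", "a_" and "_a" each have probability 0.25.
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(ctc_neg_log_likelihood(uniform, [1]))

# A target longer than the input has no valid alignment, so the loss is infinite.
print(ctc_neg_log_likelihood([[0.2, 0.4, 0.4]], [1, 2]))
```

The second call illustrates the length constraint: with one input step and two target labels, no many-to-one alignment exists and the summed probability is zero.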
TASK | PAPERS | SHARE |
---|---|---|
Speech Recognition | 7 | 25.93% |
Lipreading | 3 | 11.11% |
Language Modelling | 3 | 11.11% |
Multi-Task Learning | 2 | 7.41% |
Audio-Visual Speech Recognition | 2 | 7.41% |
Visual Speech Recognition | 2 | 7.41% |
Handwriting Recognition | 1 | 3.70% |
License Plate Recognition | 1 | 3.70% |
Optical Character Recognition | 1 | 3.70% |