ESPNet is a convolutional neural network for semantic segmentation of high-resolution images under resource constraints. It is built on a convolutional module, the efficient spatial pyramid (ESP), which is efficient in terms of computation, memory, and power.
Source: ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
Task | Papers | Share |
---|---|---|
Speech Recognition | 10 | 24.39% |
Automatic Speech Recognition (ASR) | 8 | 19.51% |
Semantic Segmentation | 5 | 12.20% |
Speech Separation | 2 | 4.88% |
Real-Time Semantic Segmentation | 2 | 4.88% |
Robust Speech Recognition | 1 | 2.44% |
Speech Enhancement | 1 | 2.44% |
Spoken Language Understanding | 1 | 2.44% |
Translation | 1 | 2.44% |
Component | Type |
---|---|
1x1 Convolution | Convolutions |
Convolution | Convolutions |
ESP | Image Model Blocks |
Kaiming Initialization | Initialization |
PReLU | Activation Functions |
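
To give a rough sense of why a block assembled from the components listed above can be cheaper than a plain convolution, the sketch below counts weights for both. It rests on assumptions that this page does not state: the ESP block is taken to be a 1x1 convolution that reduces M input channels to d = N / K channels, followed by K parallel 3x3 dilated convolutions (dilation 1, 2, 4, ...) that each map d to d channels and whose outputs are concatenated into N channels. Biases, normalization, and PReLU parameters are ignored, and the function names are hypothetical, chosen only for illustration.

```python
def standard_conv_params(m, n_out, k=3):
    """Weights in a standard k x k convolution mapping m to n_out channels (no bias)."""
    return k * k * m * n_out


def esp_params(m, n_out, branches, k=3):
    """Weights in an ESP-style block under the assumptions stated in the lead-in."""
    if n_out % branches != 0:
        # Simplification for this sketch: require an even channel split.
        raise ValueError("n_out must be divisible by branches")
    d = n_out // branches
    reduce = m * d                      # 1x1 convolution: m -> d channels
    split = branches * (k * k * d * d)  # parallel dilated k x k convs: d -> d each
    return reduce + split


def effective_kernel_size(k, dilation):
    """Side length of the region covered by a k x k kernel with the given dilation."""
    return (k - 1) * dilation + 1


if __name__ == "__main__":
    m, n_out, branches = 128, 128, 4
    std = standard_conv_params(m, n_out)
    esp = esp_params(m, n_out, branches)
    print("standard 3x3 conv weights:", std)   # 147456
    print("ESP-style block weights:  ", esp)   # 40960
    print("reduction factor:", std / esp)      # 3.6
    # Dilation doubles per branch, so the covered region grows without extra weights.
    print([effective_kernel_size(3, 2 ** i) for i in range(branches)])  # [3, 5, 9, 17]
```

With these illustrative sizes the ESP-style block uses about 3.6 times fewer weights than a single 3x3 convolution, while its widest branch covers a 17x17 region. The numbers depend entirely on the assumed structure and channel counts, so they should be read as an illustration rather than as figures reported for ESPNet.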