HRNet, or High-Resolution Net, is a general purpose convolutional neural network for tasks like semantic segmentation, object detection and image classification. It is able to maintain high resolution representations through the whole process. We start from a high-resolution convolution stream, gradually add high-to-low resolution convolution streams one by one, and connect the multi-resolution streams in parallel. The resulting network consists of several ($4$ in the paper) stages and the $n$th stage contains $n$ streams corresponding to $n$ resolutions. The authors conduct repeated multi-resolution fusions by exchanging the information across the parallel streams over and over.
Source: Deep High-Resolution Representation Learning for Visual RecognitionPaper | Code | Results | Date | Stars |
---|
Task | Papers | Share |
---|---|---|
Pose Estimation | 22 | 16.18% |
Semantic Segmentation | 21 | 15.44% |
Image Classification | 5 | 3.68% |
2D Human Pose Estimation | 5 | 3.68% |
Image Segmentation | 4 | 2.94% |
Multi-Person Pose Estimation | 4 | 2.94% |
Depth Estimation | 3 | 2.21% |
Retrieval | 2 | 1.47% |
3D Hand Pose Estimation | 2 | 1.47% |
Component | Type |
|
---|---|---|
Batch Normalization
|
Normalization | |
Convolution
|
Convolutions | |
ReLU
|
Activation Functions | |
Residual Connection
|
Skip Connections |