Identity Mappings in Deep Residual Networks

16 Mar 2016 · Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Deep residual networks have emerged as a family of extremely deep architectures with compelling accuracy and good convergence behavior. In this paper, we analyze the propagation formulations behind the residual building blocks, which suggest that the forward and backward signals can be directly propagated from one block to any other block when identity mappings are used as the skip connections and as the after-addition activation. A series of ablation experiments supports the importance of these identity mappings. This motivates us to propose a new residual unit, which makes training easier and improves generalization. We report improved results using a 1001-layer ResNet on CIFAR-10 (4.62% error) and CIFAR-100, and a 200-layer ResNet on ImageNet. Code is available at: https://github.com/KaimingHe/resnet-1k-layers
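With identity skip connections and an identity after-addition mapping, the paper shows that the signal between any two units satisfies x_L = x_l + Σ_{i=l}^{L-1} F(x_i, W_i), so both the forward activation and the gradient contain a directly propagated identity term. The sketch below illustrates the resulting "full pre-activation" residual unit in PyTorch (BN → ReLU → conv ordering inside the residual branch, with no activation after the addition); the class name, fixed channel width, and 3×3 two-layer branch are illustrative assumptions, not taken from the linked code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PreActBlock(nn.Module):
    """Illustrative full pre-activation residual unit.

    Residual branch: BN -> ReLU -> conv, applied twice.
    Shortcut: identity, and the addition output is passed on
    unchanged (no post-addition ReLU), so x_{l+1} = x_l + F(x_l).
    """

    def __init__(self, channels):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3,
                               padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))   # pre-activation, first conv
        out = self.conv2(F.relu(self.bn2(out))) # pre-activation, second conv
        return x + out                           # identity shortcut


# Example: one block applied to a CIFAR-sized feature map (assumed shape).
if __name__ == "__main__":
    y = PreActBlock(64)(torch.randn(1, 64, 32, 32))
    print(y.shape)  # torch.Size([1, 64, 32, 32])
```

Because the block ends with a plain addition, stacking many such units keeps an unimpeded identity path through the whole network, which is what makes the 1001-layer models trainable.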

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Image Classification | CIFAR-10 | ResNet-1001 | Percentage correct | 95.4 | #122 |
| Image Classification | CIFAR-100 | ResNet-1001 | Percentage correct | 77.3 | #137 |
| Image Classification | ImageNet | ResNet-200 | Top 1 Accuracy | 79.9% | #671 |
| Image Classification | Kuzushiji-MNIST | PreActResNet-18 | Accuracy | 97.82 | #17 |