- Title: Identity Mappings in Deep Residual Networks
- Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- Link: http://arxiv.org/abs/1603.05027v2
- Tags: Neural Network, residual
- Year: 2016
-
What
- The authors re-evaluate the design of the original residual units in ResNets.
- They compare several variations of the residual unit's architecture and find one that works noticeably better.
-
How
- The new variation starts the transformation branch of each residual unit with BN and a ReLU, i.e. each convolution is preceded by BN and ReLU ("pre-activation").
- It removes the BN after the last convolution and the ReLU after the addition.
- As a result, the information from previous layers can flow completely unaltered through the shortcut branch of each residual unit (see the code sketch after this list).
- The image below shows some of the tested variations of BN/ReLU placement. The new and better design ("full pre-activation") is on the right:
- They also tried various alternative designs for the shortcut connection (e.g. constant scaling, gating, 1x1 convolutions, dropout). However, all of these performed worse than the original identity shortcut. Only shortcut-only gating (d) came close under certain conditions. Therefore, the recommendation is to stick with the old/original identity design.
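- A minimal sketch of such a full pre-activation residual unit, assuming equal input/output channels and 3x3 convolutions (the `PreActBlock` name and the PyTorch framing are illustrative, not from the paper):

```python
import torch
import torch.nn as nn

class PreActBlock(nn.Module):
    """Full pre-activation residual unit (illustrative sketch):
    each convolution is preceded by BN + ReLU, and nothing follows
    the addition, so the shortcut stays a pure identity map."""
    def __init__(self, channels):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(torch.relu(self.bn1(x)))    # BN -> ReLU -> conv
        out = self.conv2(torch.relu(self.bn2(out)))  # BN -> ReLU -> conv
        return x + out  # identity shortcut; no BN/ReLU after the addition


# Usage: the block preserves the input shape.
y = PreActBlock(64)(torch.randn(2, 64, 32, 32))  # -> shape (2, 64, 32, 32)
```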
-
Results