[Implicit zero padding of AvgPool] Error in Inception v3 conversion from PyTorch to Keras
Model: "Inception V3" for Imagenet
Source: Pytorch
Destination: Keras
Author: Jiahao
When testing the PyTorch parser and Keras emitter, we found a large error in the final results, so we tracked the conversion layer by layer. The conv, bn, and maxpool layers convert correctly: their outputs almost exactly match those of the original model. The output after the avgpool layer, however, differs greatly, especially in the entries near the edges. Dividing the top-left element of the converted model's output by that of the original model gives the same ratio for every input, namely 4/9.
The original avgpool in PyTorch uses average pooling with implicit padding: when the pooling kernel extends past the edge, only the pixels inside the image are counted in the denominator. In our conversion to Keras, however, we first emit the padding as a separate layer and then apply the average pooling, so the padded zeros outside the image edges are counted as well. In short, the denominator of the division near the edges differs, which can be seen more clearly from the illustration below. The layer in question is the first average pooling in the network, taking a 35x35 input with kernel size 3 and padding size 1. At the top-left corner the 3x3 window covers only 4 image pixels, so the original model divides the window sum by 4 while the converted model divides the same sum by 9, giving exactly the observed ratio of 4/9.
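The two behaviors can be reproduced with a small NumPy sketch (a minimal illustration of the denominator difference, not MMdnn code; the helper name and the 5x5 input are made up):

```python
import numpy as np

def avg_pool_3x3_pad1(x, count_include_pad):
    """3x3 average pooling, stride 1, zero padding 1.

    count_include_pad=True  -> divide by 9 everywhere (pad-then-pool,
                               as in the converted Keras model);
    count_include_pad=False -> divide by the number of in-image pixels
                               (PyTorch's implicit padding).
    """
    h, w = x.shape
    padded = np.pad(x, 1)  # explicit zero padding on all sides
    out = np.empty_like(x, dtype=float)
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 3, j:j + 3]
            if count_include_pad:
                denom = 9
            else:
                # count only the window positions inside the image
                denom = ((min(i + 2, h) - max(i - 1, 0))
                         * (min(j + 2, w) - max(j - 1, 0)))
            out[i, j] = window.sum() / denom
    return out

x = np.arange(25, dtype=float).reshape(5, 5)
keras_style = avg_pool_3x3_pad1(x, count_include_pad=True)
pytorch_style = avg_pool_3x3_pad1(x, count_include_pad=False)
print(keras_style[0, 0] / pytorch_style[0, 0])  # 4/9 at the top-left corner
```

Interior elements agree exactly (both divide by 9); only the border entries diverge, matching what we observed.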
Therefore, to solve this problem we have to be more careful when converting the padding. The difficulty, however, is that not all frameworks use the same format for pooling or convolution layers, which is why we originally emit the padding first and then do the pooling.
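One possible workaround (a hypothetical sketch, not something MMdnn currently implements) would be to keep the pad-then-pool emission and rescale the pooled output near the edges, multiplying each element by kernel_size² over the number of in-image pixels its window covers. For the 3x3, stride-1, padding-1 case:

```python
import numpy as np

def include_pad_correction(h, w):
    """Hypothetical per-position multiplier that turns a pad-then-pool
    (divide-by-9) average-pool output into the implicit-padding result
    (divide by the in-image pixel count), for kernel 3, stride 1, pad 1."""
    # number of in-image rows/cols covered by the 3x3 window at each position
    rows = np.array([min(i + 2, h) - max(i - 1, 0) for i in range(h)],
                    dtype=float)
    cols = np.array([min(j + 2, w) - max(j - 1, 0) for j in range(w)],
                    dtype=float)
    return 9.0 / np.outer(rows, cols)

m = include_pad_correction(35, 35)
print(m[0, 0])  # 2.25 = 9/4 at the corners; interior entries are 1.0
```

Multiplying the converted model's avgpool output elementwise by this matrix would cancel the 4/9-style discrepancy, but only for this specific kernel/stride/padding configuration; a general fix has to handle each framework's pooling semantics case by case.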
The issue is also discussed in MXNet issue #10194.