Error in conversion from CNTK to TensorFlow (Keras): different way of padding
Model: "resnet18" for ImageNet
Source: CNTK
Destination: Tensorflow/Keras
Author: namizzz
We tested the CNTK parser and the Keras emitter using the same weights in every layer.
The top 5 results of CNTK model are [(21, 8.249081), (22, 7.760076), (23, 7.4341726), (148, 7.139869), (144, 6.91873)].
The top 5 results of Keras model are [(21, 8.8325405), (22, 7.861686), (904, 7.764458), (23, 7.656951), (84, 7.2577667)].
error: 1.9934075
L1 error: 1221.8015
SNR: 3.47439985079
PSNR: 14.5286562821
This is a large error (normally on the order of 1e-6 or 1e-7 in most conversions).
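For reference, here is a minimal sketch of how differences like these are typically measured between the two output vectors; this is not MMdnn's own test code (its exact formulas may differ), and `cntk_out` / `keras_out` are hypothetical logit arrays from the two converted models:

```python
import numpy as np

def compare_outputs(cntk_out, keras_out):
    # Treat the CNTK output as the reference signal and the difference as noise.
    diff = cntk_out - keras_out
    max_error = np.max(np.abs(diff))                                  # "error" above
    l1_error  = np.sum(np.abs(diff))                                  # "L1 error" above
    snr  = 10 * np.log10(np.sum(cntk_out ** 2) / np.sum(diff ** 2))   # signal-to-noise ratio (dB)
    psnr = 10 * np.log10(np.max(cntk_out ** 2) / np.mean(diff ** 2))  # peak SNR (dB)
    return max_error, l1_error, snr, psnr
```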
First, I changed line 41 in the file mmdnn/conversion/examples/keras/imagenet_test to print the intermediate result of the first layer (name: Convolution96):
$ python -m mmdnn.conversion.examples.keras.imagenet_test -n resnet18_cntk.py -w resnet18_cntk.npy -i mmdnn/conversion/examples/data/seagull.jpg -s cntk -p resnet18
I also printed the result of the first layer in the CNTK model; the results from the two models are completely different.
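For context, a rough sketch of how the first-layer outputs can be pulled out of both models. The models `keras_model` and `cntk_model`, the preprocessed input `img`, and the data-layout transpose are assumptions made for illustration; only the layer name Convolution96 comes from the conversion above:

```python
import numpy as np
import keras
import cntk as C

# Keras: build a sub-model that stops at the first convolution and run it.
probe = keras.models.Model(inputs=keras_model.input,
                           outputs=keras_model.get_layer('Convolution96').output)
keras_first = probe.predict(img)          # img assumed NHWC, e.g. shape (1, 224, 224, 3)

# CNTK: find the node with the same name and evaluate it directly.
node = C.logging.graph.find_by_name(cntk_model, 'Convolution96')
cntk_first = C.combine([node]).eval(
    {cntk_model.arguments[0]: np.transpose(img, (0, 3, 1, 2))})   # CNTK expects NCHW
```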
The result of the Keras model
[[25. 35. 35. ... 35. 30. 20.]
[35. 49. 49. ... 49. 42. 28.]
[35. 49. 49. ... 49. 42. 28.]
...
[35. 49. 49. ... 49. 42. 28.]
[30. 42. 42. ... 42. 36. 24.]
[20. 28. 28. ... 28. 24. 16.]]
The result of the CNTK model
[[ 48. 72. 84. ... 84. 84. 60.]
[ 72. 108. 126. ... 126. 126. 90.]
[ 84. 126. 147. ... 147. 147. 105.]
...
[ 84. 126. 147. ... 147. 147. 105.]
[ 84. 126. 147. ... 147. 147. 105.]
[ 60. 90. 105. ... 105. 105. 75.]]
"16" means the result of "np.ones(4,4)" , "25" means the result of "np.ones(5,5)"
It may be caused by:
1. the matrix being rotated 180 degrees, or
2. a different padding scheme (as in the photo below, the numbers 2 and 3 denote the padding widths; see the numpy sketch after this list).
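Here is a minimal numpy sketch of cause 2. It assumes ResNet-18's first layer (7x7 kernel, stride 2) and uses a small 10x10 all-ones input instead of 224x224; a "same"-sized output then needs 5 rows/columns of padding in total, and the only question is how that total is split between the two sides:

```python
import numpy as np

def same_conv_ones(inp, ksize=7, stride=2, pad_before=2, pad_after=3):
    """All-ones kernel convolved over `inp`, with an explicit padding split."""
    p = np.pad(inp, ((pad_before, pad_after), (pad_before, pad_after)), mode='constant')
    n = (p.shape[0] - ksize) // stride + 1
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = p[i*stride:i*stride+ksize, j*stride:j*stride+ksize].sum()
    return out

x = np.ones((10, 10))
# corner values [top-left, bottom-right]:
print(same_conv_ones(x, pad_before=2, pad_after=3)[[0, -1], [0, -1]])  # [25. 16.] -> matches the Keras corners
print(same_conv_ones(x, pad_before=3, pad_after=2)[[0, -1], [0, -1]])  # [16. 25.] -> matches the CNTK orientation (up to a constant factor)
```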
To determine which cause it is, we changed the last value of the input matrix from 1 to 100.
The result of the Keras model
[[ 75. 105. 105. ... 105. 90. 60.]
[ 105. 147. 147. ... 147. 126. 84.]
[ 105. 147. 147. ... 147. 126. 84.]
...
[ 105. 147. 147. ... 147. 126. 84.]
[ 90. 126. 126. ... 126. 1131. 1095.]
[ 60. 84. 84. ... 84. 1095. 1071.]]
The result of the CNTK model
[[ 48. 72. 84. ... 84. 84. 60.]
[ 72. 108. 126. ... 126. 126. 90.]
[ 84. 126. 147. ... 147. 147. 105.]
...
[ 84. 126. 147. ... 147. 147. 105.]
[ 84. 126. 147. ... 147. 1170. 1128.]
[ 60. 90. 105. ... 105. 1128. 1098.]]
So, it should be the different padding scheme that produces the different results, even though the weights and input are the same.
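If that is confirmed, one way to reproduce the CNTK-style split in Keras would be to pad explicitly and then convolve with padding='valid'; the (3, 2) split below is only an assumption based on the corner values observed above, not MMdnn's actual fix:

```python
from keras.layers import Input, ZeroPadding2D, Conv2D
from keras.models import Model

inp = Input(shape=(224, 224, 3))
x = ZeroPadding2D(padding=((3, 2), (3, 2)))(inp)               # extra padding on the top/left
x = Conv2D(64, (7, 7), strides=(2, 2), padding='valid')(x)     # instead of padding='same'
model = Model(inp, x)                                          # output is still 112 x 112 x 64
```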