
Error in conversion from CNTK to TensorFlow (Keras) / different way of padding

Model: "resnet18" for Imagenet

Source: CNTK

Destination: TensorFlow/Keras

Author: namizzz


How we found this problem

We tested the CNTK parser and the Keras emitter using the same weights in every layer.

The top 5 results of CNTK model are [(21, 8.249081), (22, 7.760076), (23, 7.4341726), (148, 7.139869), (144, 6.91873)].

The top 5 results of Keras model are [(21, 8.8325405), (22, 7.861686), (904, 7.764458), (23, 7.656951), (84, 7.2577667)].

error: 1.9934075

L1 error: 1221.8015

SNR: 3.47439985079

PSNR: 14.5286562821

This is a large error (in most conversions the error is on the order of 1e-6 or 1e-7).
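For reference, metrics like these can be computed with a few lines of NumPy; the exact formulas used by the MMdnn test scripts may differ, so treat this as a sketch in which a and b stand for the flattened output vectors of the two models:

import numpy as np

def compare_outputs(a, b):
    # a, b: flattened output vectors of the CNTK and Keras models
    diff = a - b
    max_error = np.max(np.abs(diff))                                # largest element-wise difference
    l1_error = np.sum(np.abs(diff))                                 # L1 error
    snr = 10.0 * np.log10(np.sum(a ** 2) / np.sum(diff ** 2))       # signal-to-noise ratio (dB)
    psnr = 10.0 * np.log10(np.max(a) ** 2 / np.mean(diff ** 2))     # peak signal-to-noise ratio (dB)
    top5 = [(int(i), float(a[i])) for i in np.argsort(a)[::-1][:5]] # top-5 (index, score) pairs
    return max_error, l1_error, snr, psnr, top5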

Testing the converted CNTK model code and the TensorFlow/Keras model code, and checking the first layer

First, I changed line 41 of the file mmdnn/conversion/examples/keras/imagenet_test to print the intermediate result of the first layer (name: Convolution96):

$ python -m mmdnn.conversion.examples.keras.imagenet_test -n resnet18_cntk.py -w resnet18_cntk.npy -i mmdnn/conversion/examples/data/seagull.jpg -s cntk -p resnet18

I also printed the result of the first layer in the CNTK model; the outputs of these two models are totally different.
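A rough sketch of how the intermediate output can be inspected on both sides. It assumes the converted models are already loaded as keras_model and cntk_model, the preprocessed image is available as img_keras / img_cntk in each framework's data layout, and the layer keeps the name Convolution96 after conversion:

from keras.models import Model
import cntk as C

# Keras: build a sub-model that stops at the first convolution and run it.
first_conv = Model(inputs=keras_model.input,
                   outputs=keras_model.get_layer('Convolution96').output)
print(first_conv.predict(img_keras)[0, :, :, 0])

# CNTK: find the node with the same name and evaluate only that node.
node = cntk_model.find_by_name('Convolution96')
print(C.combine([node]).eval({cntk_model.arguments[0]: img_cntk})[0, 0])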

Finding the reason / testing both code files with np.ones weights and inputs

The result of the Keras model

[[25. 35. 35. ... 35. 30. 20.]
 [35. 49. 49. ... 49. 42. 28.]
 [35. 49. 49. ... 49. 42. 28.]
 ...
 [35. 49. 49. ... 49. 42. 28.]
 [30. 42. 42. ... 42. 36. 24.]
 [20. 28. 28. ... 28. 24. 16.]]

The result of the CNTK model

[[ 48.  72.  84. ...  84.  84.  60.]
 [ 72. 108. 126. ... 126. 126.  90.]
 [ 84. 126. 147. ... 147. 147. 105.]
 ...
 [ 84. 126. 147. ... 147. 147. 105.]
 [ 84. 126. 147. ... 147. 147. 105.]
 [ 60.  90. 105. ... 105. 105.  75.]]

"16" means the result of "np.ones(4,4)" , "25" means the result of "np.ones(5,5)"

It may be caused by one of the following:

1. the weight matrix is rotated 180 degrees, or

2. the way of padding is different (in the figure below, the numbers 2 and 3 indicate the padding widths).

To determine which of the two it is, we changed the last value of the matrix from 1 to 100 and re-ran both models.

The result of the Keras model

[[  75.  105.  105. ...  105.   90.   60.]
 [ 105.  147.  147. ...  147.  126.   84.]
 [ 105.  147.  147. ...  147.  126.   84.]
 ...
 [ 105.  147.  147. ...  147.  126.   84.]
 [  90.  126.  126. ...  126. 1131. 1095.]
 [  60.   84.   84. ...   84. 1095. 1071.]]

The result of the CNTK model

[[  48.   72.   84. ...   84.   84.   60.]
 [  72.  108.  126. ...  126.  126.   90.]
 [  84.  126.  147. ...  147.  147.  105.]
 ...
 [  84.  126.  147. ...  147.  147.  105.]
 [  84.  126.  147. ...  147. 1170. 1128.]
 [  60.   90.  105. ...  105. 1128. 1098.]]

In both outputs the changed value shows up in the same (bottom-right) corner, so the kernel is not rotated; yet the corner values still differ between the two frameworks. So it must be the different way of padding that produces different results from the same weights and the same input.
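The asymmetry can be illustrated with a minimal NumPy sketch of the two padding conventions, again assuming a 7x7, stride-2 convolution on a 224x224 all-ones input (the CNTK corners above are presumably three times these values because the real model has three input channels):

import numpy as np

def corner_overlaps(pad_before, pad_after, size=224, k=7, stride=2):
    # Slide an all-ones k x k kernel over an all-ones input that is padded with
    # pad_before rows/cols on the top/left and pad_after on the bottom/right;
    # return the first and the last output value (the corner overlap counts).
    x = np.pad(np.ones((size, size)), ((pad_before, pad_after),) * 2, 'constant')
    n_out = (x.shape[0] - k) // stride + 1
    last = (n_out - 1) * stride
    return x[:k, :k].sum(), x[last:last + k, last:last + k].sum()

# TensorFlow/Keras 'same' puts the extra padding on the bottom/right:
print(corner_overlaps(pad_before=2, pad_after=3))   # (25.0, 16.0) -- matches the Keras corners
# The CNTK output is consistent with the extra padding on the top/left instead:
print(corner_overlaps(pad_before=3, pad_after=2))   # (16.0, 25.0) -- 48 = 3*16, 75 = 3*25 per corner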