Some of the result variables in embench-iot/st (MeanA, MeanB, VarA, VarB, StddevA, StddevB) are declared as automatic variables, so their computation is at risk of being removed via DCA/loop hoisting. For example, the inner loop includes 3 fp divides per iteration, but when armclang/6.14.1 compiles at O3, only one fp divide is performed in the whole benchmark run.
Re-declaring these as globals and calling the main loop body by function pointer seems to prevent this over-optimisation of the benchmark.
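A minimal sketch of the workaround described above, not the actual patch: the names `g_mean`, `g_var`, `calc_stats`, `kernel_fn` and `run_benchmark` are hypothetical, the kernel is a simplified stand-in for the real st code, and making the function pointer `volatile` is an assumption added here so that the indirect call cannot be resolved at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Results at file scope with external linkage (hypothetical names).
   The compiler has to assume they may be read after the kernel returns,
   so the stores to them cannot be treated as dead. */
double g_mean;
double g_var;

/* Simplified stand-in for the benchmark kernel: mean and sample variance. */
static void calc_stats(const double *data, size_t n)
{
    assert(n > 1);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    g_mean = sum / (double)n;

    double sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = data[i] - g_mean;
        sq += d * d;
    }
    g_var = sq / (double)(n - 1);
}

/* Assumption: a volatile pointer forces a fresh load and an indirect call
   on every iteration, which blocks inlining and hoisting of the kernel. */
static void (*volatile kernel_fn)(const double *, size_t) = calc_stats;

/* Repeats the kernel, as a benchmark harness would. */
void run_benchmark(const double *data, size_t n, int reps)
{
    for (int r = 0; r < reps; r++)
        kernel_fn(data, n);
}
```

Whether a plain (non-volatile) global function pointer is sufficient depends on the compiler and on whether link-time optimisation is enabled, so it is worth checking the generated code for the expected number of fp divides.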
I understand why re-declaring these as globals and calling the main loop body by function pointer might prevent the over-optimisation, but what other effects do the changes have? I'd expect performance to go down and code size to go up. Have you measured this?
It depends on optimisation level and whether hardware FP is used or not. For hardware FP at O3 the performance goes down 20-30% and code size goes up 20-30% because the whole workload is being executed.
The input was unchanged. It's an array of numbers which is initialised pseudo-randomly (using the same seed) on each iteration of the benchmark. In theory, with aggressive enough constant propagation, the compiler could compute the array contents and therefore the results of the stats calculations, but I am not seeing that!