Re-implement SYCL backend parallel_for to improve bandwidth utilization #665
| Job | Run time |
|---|---|
| | 15s |
| | 6m 23s |
| | 20m 6s |
| | 15m 34s |
| | 19m 7s |
| | 24m 44s |
| | 11m 16s |
| | 12m 24s |
| | 9m 59s |
| | 6m 32s |
| | 9m 54s |
| | 9m 55s |
| | 11m 8s |
| | 8m 24s |
| | 11m 28s |
| | 12m 2s |
| | 5m 33s |
| | 27m 46s |
| | 18m 22s |
| | 18m 14s |
| Total | 4h 19m 6s |