
gpu/ocl: Allow building out the ngen kernel support #2025

Closed
wants to merge 1 commit

Conversation

@nwnk (Contributor) commented Aug 5, 2024

When requested, we simply return false from mayiuse_ngen_kernels and define none of the BUILD_* options for specific GPU generations. The binary kernel format is the only part of the OCL backend that is truly Intel-specific; other implementations could support the other necessary extensions. So this is a prelude to showing that the core of the engine is reusable for other vendors.


This doesn't quite work yet, but I think it's close. I expect I'll need some guidance to get it across the line. ctest currently reports:

The following tests FAILED:
	 34 - test_iface_attr_quantization_gpu (Failed)
	 35 - test_iface_attr_quantization_buffer_gpu (Failed)
	 56 - test_inner_product_forward_gpu (Failed)
	 57 - test_inner_product_forward_buffer_gpu (Failed)
	 68 - test_pooling_forward_gpu (Failed)
	 69 - test_pooling_forward_buffer_gpu (Failed)
	 94 - test_api_ocl (Failed)

The API test failure is trivial: it just expects mayiuse_ngen_kernels to be true. The inner product failures look like numeric accuracy issues; I'm not expert enough to know whether the error is tolerable or whether there's a quick fix. The other two really do look like total failures to create a primitive, so some effort will probably be required to fill in what's missing. It's not immediately clear to me that the generic SYCL code wouldn't hit the same issue (I haven't tried it yet).

@nwnk nwnk requested review from a team as code owners August 5, 2024 17:58
@mgouicem (Contributor) commented Aug 6, 2024

Thanks @nwnk. There is the DNNL_GPU_VENDOR variable that could be used to discard vendor-specific code.
We actually have plans to expose DNNL_GPU_VENDOR=GENERIC in order to have a build of oneDNN that would eliminate vendor-specific kernels (so ngen, cuBLAS/MIOpen, ...). Would that work for you, or do you need finer-grained control?

  • @densamoilov, who already did the legwork to split generic kernels from non-generic ones.
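For context, DNNL_GPU_VENDOR is an existing oneDNN CMake cache variable; the GENERIC value was still only planned at the time of this thread, so the configure line below is illustrative rather than a supported invocation:

```shell
# Configure oneDNN so that vendor-specific GPU kernels are discarded.
# DNNL_GPU_RUNTIME=SYCL is a real option; DNNL_GPU_VENDOR=GENERIC is the
# planned value discussed above.
cmake -B build \
    -DDNNL_GPU_RUNTIME=SYCL \
    -DDNNL_GPU_VENDOR=GENERIC
cmake --build build
```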

@nwnk (Contributor, author) commented Aug 6, 2024

> Thanks @nwnk. There is the DNNL_GPU_VENDOR variable that could be used to discard vendor-specific code. We actually have plans to expose DNNL_GPU_VENDOR=GENERIC in order to have a build of oneDNN that would eliminate vendor-specific kernels (so ngen, cuBLAS/MIOpen, ...). Would that work for you, or do you need finer-grained control?

That's sort of where I'm headed, yes, though I still need to write up an RFC for that. My real goal is an OpenCL/SYCL core that can be built as a generic target, but that can also be built to include specialized support for (ideally multiple) vendors. The OpenCL core would initialize and then call any appropriate vendor-specific overlay initialization based on the identified device; non-Intel devices would simply never have ngen kernels in their runtime dispatch list.

Moving all that code around and refactoring the places where it makes vendor assumptions is a lot of work, so this PR is trying to be a first cut that lets us discover whether the non-ngen substrate is complete and working.

@vpirogov vpirogov marked this pull request as draft August 12, 2024 15:35
@nwnk (Contributor, author) commented Aug 19, 2024

Dropping this; I don't think this is the right approach.

@nwnk nwnk closed this Aug 19, 2024