
gpu/ocl: Allow building out the ngen kernel support #2025

Closed
wants to merge 1 commit

Conversation

@nwnk (Contributor) commented Aug 5, 2024

When requested, we simply return false from mayiuse_ngen_kernels and define none of the BUILD_* options for specific GPU generations. The binary kernel format is the only part of the OCL backend that is truly Intel-specific; other implementations could support the other necessary extensions. So this is a prelude to showing that the core of the engine is reusable for other vendors.


This doesn't quite work yet, but I think it's close. I expect I'll need some guidance to get it across the line. ctest currently reports:

The following tests FAILED:
	 34 - test_iface_attr_quantization_gpu (Failed)
	 35 - test_iface_attr_quantization_buffer_gpu (Failed)
	 56 - test_inner_product_forward_gpu (Failed)
	 57 - test_inner_product_forward_buffer_gpu (Failed)
	 68 - test_pooling_forward_gpu (Failed)
	 69 - test_pooling_forward_buffer_gpu (Failed)
	 94 - test_api_ocl (Failed)

The API test failure is trivial: it just expects mayiuse_ngen_kernels to be true. The inner product failures look like numeric accuracy issues; I'm not expert enough to know whether the error is tolerable or whether there's a quick fix. The other two really do look like total failures to create a primitive, so some effort will probably be required to fill in what's missing. It's not immediately clear to me that the generic SYCL code wouldn't hit the same issue (I haven't tried it yet).

@nwnk nwnk requested review from a team as code owners August 5, 2024 17:58
@mgouicem (Contributor) commented Aug 6, 2024

Thanks @nwnk. There is the DNNL_GPU_VENDOR variable that could be used to discard vendor-specific code.
We actually have plans to expose DNNL_GPU_VENDOR=GENERIC in order to have a build of oneDNN that would eliminate vendor-specific kernels (so ngen, cuBLAS/MIOpen, ...). Would that work for you, or do you need finer-grained control?

  • @densamoilov, who already did the legwork to split generic kernels from non-generic ones.
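For context, DNNL_GPU_VENDOR is an existing oneDNN CMake cache variable; the GENERIC value was still only planned at the time of this thread, so the configure line below is illustrative rather than a supported invocation:

```shell
# Configure oneDNN so that vendor-specific GPU kernels are discarded.
# DNNL_GPU_RUNTIME=SYCL is a real option; DNNL_GPU_VENDOR=GENERIC is the
# planned value discussed above.
cmake -B build \
    -DDNNL_GPU_RUNTIME=SYCL \
    -DDNNL_GPU_VENDOR=GENERIC
cmake --build build
```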

@nwnk (Contributor, author) commented Aug 6, 2024

> Thanks @nwnk. There is the DNNL_GPU_VENDOR variable that could be used to discard vendor-specific code. We actually have plans to expose DNNL_GPU_VENDOR=GENERIC in order to have a build of oneDNN that would eliminate vendor-specific kernels (so ngen, cuBLAS/MIOpen, ...). Would that work for you, or do you need finer-grained control?

That's sort of where I'm headed, yes, though I still need to write up an RFC for that. My real goal is an OpenCL/SYCL core that can be built as a generic target, but that can also be built to include specialized support for (ideally multiple) vendors. The OpenCL core would initialize and then call any appropriate vendor-specific overlay initialization based on the identified device; non-Intel devices would simply never have ngen kernels in their runtime dispatch list.

Moving all that code around and refactoring the places where it makes vendor assumptions is a lot of work, so this PR is trying to be a first cut that lets us discover whether the non-ngen substrate is complete and working.

@vpirogov vpirogov marked this pull request as draft August 12, 2024 15:35
@nwnk (Contributor, author) commented Aug 19, 2024

Dropping this; I don't think this is the right approach.

@nwnk nwnk closed this Aug 19, 2024