Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve documentation for virtual functions #551

Merged
merged 20 commits into from
Oct 1, 2024

Conversation

pzehner
Copy link
Contributor

@pzehner pzehner commented Jul 9, 2024

This PR aims to deeply rework the documentation relative to virtual functions:

  • Removed mixed Cuda/Kokkos code;
  • Start from a serial example case;
  • Improve kokkos_malloc code: the documentation used Kokkos::kokkos_malloc in a way that couldn't compile. A static_cast was needed. In the same Slack discussion, it was suggested to use a more precise formulation (separate allocated memory and actual object);
  • Make the code more portable: only resort to Kokkos features;
  • Make examples easier to build: most examples were given as "pseudo-code", this was changed to actual code, with comments to explicit what is done;
  • List backends for which virtual functions in parallel region work;
  • Display a warning about the use of virtual functions in general at the beginning of the page; and
  • Fix typos.

- Fix typos
- Fix variable names
- Improve formulation for Complications and Fixes
- Improve kokkos_malloc code
@pzehner pzehner self-assigned this Jul 9, 2024
Copy link
Contributor

@ajpowelsnl ajpowelsnl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entry is rather confusing, not least of all because there's co-mingling of Kokkos and CUDA code. If the goal is to provide an explanation of virtual functions in Kokkos (and how to access them correctly from host and device), this entry takes the long way around the proverbial barn.

A few more suggestions:

  • Point users to a couple quality C++ resources on virtual functions (you already have one)
  • Remove "hybrid" code
  • Remove code that is intentionally incorrect
  • Focus on the correct Kokkos code (making sure the example will compile and run)
  • Brief explanation of why this works (i.e., Virtual Tables, Virtual pointers)
  • Maybe mention "pitfalls" to alert the reader to the fact that derived class methods on host and device, respectively, have strict access patterns, where one (host) cannot access the other (device)

@cedricchevalier19
Copy link
Member

This entry is rather confusing, not least of all because there's co-mingling of Kokkos and CUDA code. If the goal is to provide an explanation of virtual functions in Kokkos (and how to access them correctly from host and device), this entry takes the long way around the proverbial barn.

I think @pzehner's goal was to update the page to make the examples work. However, I agree with @ajpowelsnl that this page is quite complicated to read, and I do not think the middle section with explicit Cuda code is very appealing to a Kokkos user.

More generally, do we want to advertise using virtual functions on GPU?

@JBludau
Copy link
Contributor

JBludau commented Jul 12, 2024

More generally, do we want to advertise using virtual functions on GPU?

I would prefer not advising the use of virtual functions on GPUs. But I had the feeling, the page did show how it can be done while expressing the difficulties that this brings to the table in quite a neutral tone

@pzehner
Copy link
Contributor Author

pzehner commented Jul 16, 2024

This entry is rather confusing, not least of all because there's co-mingling of Kokkos and CUDA code. If the goal is to provide an explanation of virtual functions in Kokkos (and how to access them correctly from host and device), this entry takes the long way around the proverbial barn.

@ajpowelsnl As stated in the description of this PR, this contribution is an improvement of the existing documentation page. Its pedagogical approach (i.e. presenting the problem with a failing mixed CUDA-Kokkos code, then iterating to a more Kokkos solution) wasn't changed. Many remarks that you made about this PR are more remarks about the original page. My objectives were clearly stated at the beginning of this discussion.

Now we're here, we can discuss about changing the page more deeply.

  • Point users to a couple quality C++ resources on virtual functions (you already have one)

I don't think more resources would be necessary. If a developer even accesses this documentation page, it's reasonable to consider they know what a virtual function is. If you have any interesting reference, feel free to tell me, however.

  • Remove "hybrid" code

The approach of the original article is indeed weird: nobody mixes CUDA and Kokkos code directly like that. I think it would make more sense to start from a plain serial code and get a portable(-ish) Kokkos code. Or from a full CUDA code and get a Kokkos code (which is maybe the original motivation for this page).

  • Remove code that is intentionally incorrect

Naive approaches could be just described, instead of being presented in a code block.

  • Focus on the correct Kokkos code (making sure the example will compile and run)

This PR makes the Kokkos code at least buildable and proposes a full Kokkos approach.

  • Brief explanation of why this works (i.e., Virtual Tables, Virtual pointers)

I believe the page already explains it.

Maybe mention "pitfalls" to alert the reader to the fact that derived class methods on host and device, respectively, have strict access patterns, where one (host) cannot access the other (device)

This could be improved in text, indeed.

The illustrations, if I got them right, already show what is possible and not.

More generally, do we want to advertise using virtual functions on GPU?

I would prefer not advising the use of virtual functions on GPUs. But I had the feeling, the page did show how it can be done while expressing the difficulties that this brings to the table in quite a neutral tone

I think we should warn the reader at the very beginning of the article that using virtual functions in parallelized regions is a bad idea in the first place. It is known to have bad performance, it's inconvenient and ugly to use on GPU... @masterleinad confirmed me on Slack that this practice is discouraged.

I'll add such a warning at the beginning of the article.

@pzehner
Copy link
Contributor Author

pzehner commented Jul 16, 2024

According to @masterleinad, using virtual functions on device is not possible with SYCL.

@masterleinad
Copy link
Contributor

According to @masterleinad, using virtual functions on device is not possible with SYCL.

The most recent approach is explained in https://github.com/AlexeySachkov/llvm/blob/private/asachkov/virtual-functions-extension-spec/sycl/doc/design/VirtualFunctions.md which basically requires annotating virtual functions and queues.

@pzehner
Copy link
Contributor Author

pzehner commented Jul 18, 2024

I profoundly modified the documentation page to remove mixed Cuda/Kokkos code and start with serial code.

It'd be better to read the page in its entirety, not only the diffs that are hard to read.

Co-authored-by: Thomas Padioleau <[email protected]>
Copy link
Contributor

@JBludau JBludau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have minor nitpicking

Copy link
Member

@tpadioleau tpadioleau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the page quite clear!

Copy link
Contributor

@JBludau JBludau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@pzehner pzehner merged commit c1c3a19 into kokkos:main Oct 1, 2024
2 checks passed
@pzehner pzehner deleted the update/virtual_function branch October 1, 2024 08:17
@dalg24
Copy link
Member

dalg24 commented Oct 1, 2024

Please squash merge next time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants