Improve performance of kernel fusion by eliminating unnecessary deep copy #214

Merged on Nov 23, 2023 (8 commits)

Conversation

@NaderAlAwar (Contributor) commented Nov 21, 2023

Remove deepcopy() from the parser when retrieving workunits to avoid unnecessary overhead. Previously, every workunit AST was deep-copied on retrieval because fusion needs to modify the ASTs. We still call deepcopy(), but only for fused workunits, and only after checking whether the fused workunit has already been compiled. This greatly reduces the overhead incurred by PyKokkos.

@gliga (Contributor) left a comment

LG

@NaderAlAwar changed the title from "Improve performance of kernel fusion" to "Improve performance of kernel fusion by eliminating unnecessary deep copy" on Nov 22, 2023
@NaderAlAwar merged commit eb2f282 into kokkos:main on Nov 23, 2023
4 of 5 checks passed