Cache Triton compilation artifacts during CI. #3680
Merged
294f7d1 to a17e207
chicheng reviewed Apr 17, 2024
antiagainst approved these changes Apr 17, 2024
Nice! LGTM; just a few nits. Sorry, I said I'd look into this but got distracted by other tasks.
The vast majority of CI time is spent compiling Triton kernels. This PR caches the compiled kernels, so if the compiler doesn't change, we should be able to reuse them. On the other hand, if the compiler does change, Triton will not (or at least should not) use stale compilation artifacts, because the Triton source code is effectively part of the cache key.

Because the compiled Triton C++ code is part of the cache key, this will only work if the Triton C++ build is deterministic. We should be able to tell whether caching is working by comparing the names of the compilation artifacts between two no-op commits to `main`.

While we're here, we also run pre-commit on changes to the `main` branch. Previously we only ran it on pull requests. The reason is that if we don't run it on `main`, the pip caching won't work on non-main branches (because the cache action only pulls from the current branch or `main`).

Also while we're here, rewrite the AMD workflow definition to reference the NVIDIA workflow definition where they match; it was getting hard to keep the two workflows in sync. Because GitHub Actions doesn't support YAML references, I had to add a preprocessing step to pre-commit. But it's not too bad.
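
To make the cache-key reasoning in the description concrete, here is a minimal sketch. It is not Triton's actual caching code: the functions `cache_key` and `shared_artifacts` and the byte strings standing in for compiler builds are hypothetical. It only illustrates why a non-deterministic C++ build would defeat the cache, and how comparing artifact names between two runs would reveal cache hits.

```python
import hashlib


def _digest(data: bytes) -> bytes:
    """SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).digest()


def cache_key(compiler_binary: bytes, kernel_source: str) -> str:
    """Hypothetical cache key: depends on the compiled compiler and the kernel.

    Each part is hashed separately so that the two fixed-length digests
    cannot be confused with one another when concatenated.
    """
    h = hashlib.sha256()
    h.update(_digest(compiler_binary))
    h.update(_digest(kernel_source.encode("utf-8")))
    return h.hexdigest()


def shared_artifacts(run_a, run_b):
    """Artifact names present in both runs, i.e. candidates for cache hits."""
    return set(run_a) & set(run_b)


# Assumed stand-ins for two builds of the compiler and one kernel source.
kernel = "def add(x, y): return x + y"
key_build_a_first = cache_key(b"compiler-build-A", kernel)
key_build_a_again = cache_key(b"compiler-build-A", kernel)
key_build_b = cache_key(b"compiler-build-B", kernel)

# A deterministic build yields identical bytes, hence an identical key.
print(key_build_a_first == key_build_a_again)
# Any change in the compiled compiler yields a different key.
print(key_build_a_first == key_build_b)
# Names common to both runs indicate reuse across the two commits.
print(len(shared_artifacts([key_build_a_first], [key_build_a_again, key_build_b])))
```

If two no-op commits produced byte-different compiler binaries, every key would differ and `shared_artifacts` would come back empty, which is the symptom the description suggests looking for.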
335075b to 5ea8de6
5ea8de6 to 52b1708
Thank you for the reviews! Let's see how this goes on |