
initial support blackwell #487

Merged · 1 commit · Jan 25, 2025

Conversation

@johnnynunez (Contributor)

- 10.0: Blackwell B100/B200
- 12.0: Blackwell RTX 50 series
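The gist of the PR is adding the two new Blackwell compute capabilities, gated on the CUDA toolkit version that introduces them. A minimal sketch of that idea in a setup.py-style build script (the helper name and baseline list below are illustrative assumptions, not tiny-cuda-nn's actual code):

```python
# Hypothetical sketch: extend the supported compute capabilities when the
# installed CUDA toolkit is new enough to know about Blackwell.
def supported_compute_capabilities(cuda_major: int, cuda_minor: int) -> list[int]:
    # Baseline list of pre-Blackwell architectures (illustrative values).
    caps = [70, 72, 75, 80, 86, 87, 89, 90]
    if (cuda_major, cuda_minor) >= (12, 8):
        # CUDA 12.8 adds sm_100 (B100/B200) and sm_120 (RTX 50 series).
        caps += [100, 120]
    return caps
```

Tuple comparison keeps the version check correct across major bumps, e.g. `(12, 10) >= (12, 8)` and `(13, 0) >= (12, 8)` are both true.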

@Tom94 (Collaborator) commented Jan 22, 2025

Hi, thanks a bunch for the PR!

Any chance you could also update the list of supported compute capabilities at the top of this file as well as the min/max compute capability conditions here?
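The min/max compute capability conditions Tom94 refers to would, in spirit, clamp or filter requested targets against the range the kernels support. A small sketch under that assumption (constants and names are illustrative, not tiny-cuda-nn's actual values):

```python
# Hypothetical min/max compute-capability bounds after adding Blackwell.
MIN_COMPUTE_CAPABILITY = 70   # oldest architecture the kernels support
MAX_COMPUTE_CAPABILITY = 120  # assumed new ceiling once sm_120 lands

def filter_targets(requested: list[int]) -> list[int]:
    """Keep only compute capabilities inside the supported range."""
    return [
        cc for cc in requested
        if MIN_COMPUTE_CAPABILITY <= cc <= MAX_COMPUTE_CAPABILITY
    ]
```

With bounds like these, a request for `[60, 75, 120, 130]` would build only for 75 and 120.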

@johnnynunez (Contributor, Author)

> Hi, thanks a bunch for the PR!
>
> Any chance you could also update the list of supported compute capabilities at the top of this file as well as the min/max compute capability conditions here?

Yeah for sure, I will add it

@johnnynunez (Contributor, Author)

Done @Tom94

@johnnynunez
Copy link
Contributor Author

It's failing as expected because CUDA 12.8 isn't public yet...

[Review comment on bindings/torch/setup.py (outdated, resolved)]
@Tom94 (Collaborator) commented Jan 23, 2025

Wait, I thought this PR was based on already available new CUDA versions. Are you saying the 12.8 version is pure speculation?

I tried searching for it just now and couldn't find anything. Also what's your source for the version 13.0 condition you added to the minimum compute capability?

@johnnynunez (Contributor, Author)

> I tried searching for it just now and couldn't find anything. Also what's your source for the version 13.0 condition you added to the minimum compute capability?

13.0 is speculation, but the rest is true. I attach a reference here:
Dao-AILab/flash-attention#1436

@Tom94 (Collaborator) commented Jan 23, 2025

I'll hold off on updating tiny-cuda-nn until there's something official then.

@johnnynunez (Contributor, Author) commented Jan 23, 2025

> I'll hold off on updating tiny-cuda-nn until there's something official then.

Perfect, the PR will be ready when CUDA 12.8 comes out! Thanks for this library :)

@johnnynunez (Contributor, Author) commented Jan 23, 2025

CUDA 12.8 is out.
PTX ISA 8.7 documentation: https://docs.nvidia.com/cuda/pdf/ptx_isa_8.7.pdf
[image attachment]
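Once CUDA 12.8 is actually installed, a build script can detect it by parsing `nvcc --version` output. A minimal sketch of that parse (the helper name and sample string are assumptions for illustration; the `release X.Y` phrasing does appear in real nvcc output):

```python
# Hypothetical helper: extract the CUDA release from `nvcc --version` output
# so a build script can decide whether Blackwell targets are available.
import re

def cuda_release(nvcc_version_output: str) -> tuple[int, int]:
    m = re.search(r"release (\d+)\.(\d+)", nvcc_version_output)
    if m is None:
        raise ValueError("could not parse nvcc version output")
    return int(m.group(1)), int(m.group(2))

sample = "Cuda compilation tools, release 12.8, V12.8.61"
print(cuda_release(sample))  # -> (12, 8)
```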

@johnnynunez (Contributor, Author)

@Tom94 merge :)

@johnnynunez (Contributor, Author)

@Tom94 nvcc warning : Support for offline compilation for architectures prior to '<compute/sm/lto>_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

@johnnynunez (Contributor, Author)

> @Tom94 nvcc warning : Support for offline compilation for architectures prior to '&lt;compute/sm/lto&gt;_75' will be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).

Oh yes... only GPUs with tensor cores (Turing and newer) will be supported.
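The warning above is about pre-Turing (below sm_75) offline compilation being deprecated. A build script can either drop those targets or pass the real nvcc flag `-Wno-deprecated-gpu-targets` to silence the warning; a sketch of both options (the helper is hypothetical, the flag names are real nvcc options):

```python
# Illustrative sketch: emit nvcc architecture flags, skipping deprecated
# pre-Turing targets and suppressing the deprecation warning as a fallback.
def nvcc_arch_flags(compute_capabilities: list[int]) -> list[str]:
    flags = []
    for cc in sorted(compute_capabilities):
        if cc < 75:
            # Offline compilation below sm_75 is deprecated; skip the target.
            continue
        flags.append(f"-gencode=arch=compute_{cc},code=sm_{cc}")
    # Keep the build log clean if a deprecated target slips through elsewhere.
    flags.append("-Wno-deprecated-gpu-targets")
    return flags
```

For example, requesting `[70, 75, 120]` would compile only for sm_75 and sm_120.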

@Tom94 (Collaborator) commented Jan 24, 2025

Please still remove the 13.0 part. We've got no clear info which precise version will actually drop support for <75, even if it is deprecated. Thanks.

@johnnynunez (Contributor, Author)

> Please still remove the 13.0 part. We've got no clear info which precise version will actually drop support for <75, even if it is deprecated. Thanks.

Done!

@johnnynunez (Contributor, Author)

@Tom94 fixed typo!

Tom94 merged commit 394bfd2 into NVlabs:master on Jan 25, 2025 (21 checks passed).
@johnnynunez (Contributor, Author)

[image attachment]
