Simplest gemm example with 3.x APIs #1742

Answered by thakkarV
dukallis asked this question in Q&A

Generally, I am interested in whether it's possible to construct an sgemm or a convolution using the new 3.x Collective, Kernel, and Device APIs, provided that I have specified the underlying CuTe atoms correctly and then applied make_tiled_mma and make_tiled_copy to them.

Yes. Please see https://github.com/NVIDIA/cutlass/blob/main/test/unit/gemm/device/default_gemm_configuration.hpp for inspiration. A similar template configuration can be used for Volta/Turing, and it should work out of the box. We have some of these kernels internally that @ccecka and I may be able to upstream as single-file examples in the future.
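
To make the layering concrete, here is a rough sketch of what such a template configuration might look like for a SIMT sgemm on pre-Ampere hardware. It is modeled on the SIMT configurations in the linked header but is not copied from it. The `sketch` namespace, the NT layout choice (column-major A, row-major B), the 128x128x8 tile, and the 256-thread layouts are all assumptions made for illustration. The epilogue is left as a template parameter because its template signature has differed between CUTLASS 3.x releases. Treat every name as something to verify against the release you build with.

```cpp
// Sketch only: type configuration for an NT SIMT sgemm (assumed tile sizes and layouts).
#include <cstdint>

#include "cute/tensor.hpp"
#include "cute/atom/mma_atom.hpp"
#include "cute/atom/copy_atom.hpp"
#include "cutlass/gemm/dispatch_policy.hpp"
#include "cutlass/gemm/collective/collective_mma.hpp"
#include "cutlass/gemm/kernel/gemm_universal.hpp"
#include "cutlass/gemm/device/gemm_universal_adapter.h"

namespace sketch {  // hypothetical namespace, not part of CUTLASS
using namespace cute;

// CTA tile: 128 (M) x 128 (N) x 8 (K).
using TileShape = Shape<_128, _128, _8>;

// Scalar FMA atom tiled over a 16x16x1 thread layout (256 threads).
using TiledMma = decltype(make_tiled_mma(
    MMA_Atom<UniversalFMA<float, float, float, float>>{},
    Layout<Shape<_16, _16, _1>>{}));

// 3.x strides are (M, K, L) for A and (N, K, L) for B.
using StrideA = Stride<_1, int64_t, int64_t>;  // A column-major: M contiguous
using StrideB = Stride<_1, int64_t, int64_t>;  // B row-major: N contiguous

// Shared-memory tile atom and the smem -> register copy atom.
using SmemLayoutAtom = Layout<Shape<_128, _8>, Stride<_1, _128>>;
using SmemCopyAtom   = Copy_Atom<DefaultCopy, float>;

// Global -> shared copy: 32x8 threads, one float per thread per copy.
// UniversalCopy is a plain load/store, so no cp.async is required.
using GmemTiledCopy = decltype(make_tiled_copy(
    Copy_Atom<UniversalCopy<float>, float>{},
    Layout<Shape<_32, _8>, Stride<_1, _32>>{},
    Layout<Shape<_1, _1>>{}));

// Collective layer: the two-stage mainloop dispatch policy for SM70-class parts.
using CollectiveMainloop = cutlass::gemm::collective::CollectiveMma<
    cutlass::gemm::MainloopSm70TwoStage, TileShape,
    float, StrideA,
    float, StrideB,
    TiledMma,
    GmemTiledCopy, SmemLayoutAtom, SmemCopyAtom, cute::identity,   // A
    GmemTiledCopy, SmemLayoutAtom, SmemCopyAtom, cute::identity>;  // B

// Kernel layer: mainloop + caller-supplied epilogue over an (M, N, K, L) problem.
template <class CollectiveEpilogue>
using GemmKernel = cutlass::gemm::kernel::GemmUniversal<
    Shape<int, int, int, int>, CollectiveMainloop, CollectiveEpilogue>;

// Device layer: host-side handle that launches the kernel.
template <class CollectiveEpilogue>
using Gemm = cutlass::gemm::device::GemmUniversalAdapter<GemmKernel<CollectiveEpilogue>>;

}  // namespace sketch
```

Under these assumptions, `sketch::Gemm<SomeEpilogue>` would be instantiated with a collective epilogue such as the `cutlass::epilogue::collective::DefaultEpilogue` used in the linked header, and then driven through the adapter like any other 3.x device-level GEMM. Swapping the dispatch policy and the global-memory copy atom is roughly where an Ampere variant of the same configuration would differ.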

Answer selected by dukallis