[BUG] CUTLASS 3.8 does not compile for 90a with CUDA 12.6.85 #2064

Closed
manishucsd opened this issue Jan 26, 2025 · 7 comments
Labels
? - Needs Triage bug Something isn't working

Comments

@manishucsd
Contributor

cmake

/build $ cmake -DCMAKE_BUILD_TYPE:STRING=Release -DCUTLASS_NVCC_ARCHS:STRING=90a -DCUTLASS_NVCC_KEEP:STRING=OFF -DCUTLASS_ENABLE_F16C:STRING=ON -DCUTLASS_LIBRARY_KERNELS:STRING=cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_void_bf16_* -DCUTLASS_LIBRARY_IGNORE_KERNELS:STRING=gemm_grouped*,gemm_planar* -DCUTLASS_ENABLE_CUBLAS:STRING=ON -DCMAKE_EXPORT_COMPILE_COMMANDS:BOOL=TRUE -DCMAKE_C_COMPILER:FILEPATH=/usr/bin/gcc -DCMAKE_CXX_COMPILER:FILEPATH=/usr/bin/g++ --no-warn-unused-cli -S/home/manish_magic_dev/repos/cutlass/cutlass_tree_2/cutlass -B/home/manish_magic_dev/repos/cutlass/cutlass_tree_2/build
Not searching for unused variables given on the command line.
-- CMake Version: 3.31.4
-- CUTLASS 3.8.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The CUDA compiler identification is NVIDIA 12.6.85 with host compiler GNU 11.4.0
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /home/manish_magic_dev/sdk/cuda/12.6.3/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDAToolkit: /home/manish_magic_dev/sdk/cuda/12.6.3/targets/x86_64-linux/include (found version "12.6.85")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- CUDART: /home/manish_magic_dev/sdk/cuda/12.6.3/lib64/libcudart.so
-- CUDA Driver: /home/manish_magic_dev/sdk/cuda/12.6.3/lib64/stubs/libcuda.so
-- NVRTC: /home/manish_magic_dev/sdk/cuda/12.6.3/lib64/libnvrtc.so
-- Default Install Location: install
-- Found Python3: /usr/bin/python3.10 (found suitable version "3.10.12", minimum required is "3.5") found components: Interpreter
-- Make cute::tuple be the new standard-layout tuple type
-- CUDA Compilation Architectures: 90a
-- Enable caching of reference results in conv unit tests
-- Enable rigorous conv problem sizes in conv unit tests
-- Using the following NVCC flags: 
  --expt-relaxed-constexpr
  -DCUTE_USE_PACKED_TUPLE=1
  -DCUTLASS_TEST_LEVEL=0
  -DCUTLASS_TEST_ENABLE_CACHED_RESULTS=1
  -DCUTLASS_CONV_UNIT_TEST_RIGOROUS_SIZE_ENABLED=1
  -DCUTLASS_DEBUG_TRACE_LEVEL=0
  -Xcompiler=-mf16c
  -Xcompiler=-Wconversion
  -Xcompiler=-fno-strict-aliasing
fatal: not a git repository (or any of the parent directories): .git
-- CUTLASS Revision: Unable to detect, Git returned code 128.
CMake Warning (dev) at /home/manish_magic_dev/.local/lib/python3.10/site-packages/cmake/data/share/cmake-3.31/Modules/FetchContent.cmake:1953 (message):
  Calling FetchContent_Populate(googletest) is deprecated, call
  FetchContent_MakeAvailable(googletest) instead.  Policy CMP0169 can be set
  to OLD to allow FetchContent_Populate(googletest) to be called directly for
  now, but the ability to call it with declared details will be removed
  completely in a future version.
Call Stack (most recent call first):
  cmake/googletest.cmake:47 (FetchContent_Populate)
  CMakeLists.txt:756 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- The C compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Found Python3: /usr/bin/python3.10 (found version "3.10.12") found components: Interpreter
-- Configuring cublas ...
-- cuBLAS: /home/manish_magic_dev/sdk/cuda/12.6.3/lib64/libcublas.so
-- cuBLAS: /home/manish_magic_dev/sdk/cuda/12.6.3/include
-- Configuring cuBLAS ... done.
-- Completed generation of library instances. See /home/manish_magic_dev/repos/cutlass/cutlass_tree_2/build/tools/library/library_instance_generation.log for more information.
-- Found Python3: /usr/bin/python3.10 (found suitable version "3.10.12", minimum required is "3.5") found components: Interpreter
-- Enable device reference verification in conv unit tests
-- Configuring done (7.8s)
-- Generating done (0.7s)
-- Build files have been written to: /home/manish_magic_dev/repos/cutlass/cutlass_tree_2/build

compilation failure

[proc] Executing command: /home/manish_magic_dev/.local/bin/cmake --build /home/manish_magic_dev/repos/cutlass/cutlass_tree_2/build --config Release --target cutlass_profiler --
[build] [  0%] Building CUDA object tools/library/CMakeFiles/cutlass_library_gemm_sm90_void_s64x128x16gemm_bf16_objs.dir/generated/gemm/90/void_s64x128x16gemm_bf16/all_sm90_void_s64x128x16gemm_bf16_gemm_operations.cu.o
[build] /home/manish_magic_dev/repos/cutlass/cutlass_tree_2/cutlass/include/cute/arch/copy_sm90_desc.hpp(223): error: identifier "CU_TENSOR_MAP_DATA_TYPE_16U6_ALIGN16B" is undefined
[build]     if constexpr (is_same_v<T, float_e2m3_t>) { return CU_TENSOR_MAP_DATA_TYPE_16U6_ALIGN16B;} else
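
A note on why this fails even though no `float_e2m3_t` kernel was requested: the enumerator is a non-dependent name, so the compiler must find a declaration for it when it parses the template, even inside an `if constexpr` branch that is never taken. One common way to handle this is to guard the branch with a toolkit version check. The following is a minimal C++17 sketch of that idea, not the actual CUTLASS code and not necessarily what the eventual fix does. It assumes the enumerator first appears in the CUDA 12.8 headers (as the thread suggests) and that `CUDA_VERSION` from `<cuda.h>` encodes 12.8 as `12080`. All `Sketch*`/`sketch_*`/`SKETCH_*` names are hypothetical stand-ins so the snippet compiles without a CUDA toolkit.

```cpp
#include <type_traits>

// Assumption: CUDA_VERSION is provided by <cuda.h> (12.8 -> 12080).
// This standalone sketch does not include <cuda.h>, so the macro is
// undefined here and the fallback value 0 is selected.
#if defined(CUDA_VERSION) && (CUDA_VERSION >= 12080)
#  define SKETCH_HAS_TMA_SUBBYTE_TYPES 1
#else
#  define SKETCH_HAS_TMA_SUBBYTE_TYPES 0
#endif

// Hypothetical stand-in for the driver's tensor-map data type enum.
enum class SketchTmaType { Uint8, Float32, U6Align16B, Unsupported };

// Hypothetical stand-in for a 6-bit float element type.
struct sketch_e2m3_t {};

// Maps an element type to a tensor-map data type. The branch that names
// the newer enumerator is removed by the preprocessor on older toolkits,
// so the compiler never has to look that name up.
template <class T>
constexpr SketchTmaType to_sketch_tma_type() {
  if constexpr (std::is_same_v<T, unsigned char>) { return SketchTmaType::Uint8; } else
  if constexpr (std::is_same_v<T, float>)         { return SketchTmaType::Float32; } else
#if SKETCH_HAS_TMA_SUBBYTE_TYPES
  if constexpr (std::is_same_v<T, sketch_e2m3_t>) { return SketchTmaType::U6Align16B; } else
#endif
  { return SketchTmaType::Unsupported; }
}

// The older element types keep working regardless of toolkit version.
static_assert(to_sketch_tma_type<float>() == SketchTmaType::Float32);
```

With a guard of this shape, building for 90a on an older toolkit would still compile the existing element types, and the newer types would only become available once the toolkit headers declare them.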
@manishucsd
Contributor Author

-- CUTLASS Revision: Unable to detect, Git returned code 128.
CMake Warning (dev) at /home/manish_magic_dev/.local/lib/python3.10/site-packages/cmake/data/share/cmake-3.31/Modules/FetchContent.cmake:1953 (message):
  Calling FetchContent_Populate(googletest) is deprecated, call
  FetchContent_MakeAvailable(googletest) instead.  Policy CMP0169 can be set
  to OLD to allow FetchContent_Populate(googletest) to be called directly for
  now, but the ability to call it with declared details will be removed
  completely in a future version.
Call Stack (most recent call first):
  cmake/googletest.cmake:47 (FetchContent_Populate)
  CMakeLists.txt:756 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

Unrelated to the compilation failure above: should we also update googletest.cmake to use FetchContent_MakeAvailable? cc: @d-k-b

@manishucsd
Contributor Author

manishucsd commented Jan 26, 2025

Confirming that I am able to compile using the same cmake command with CUDA 12.8, so the issue is with earlier toolkits. We will stay on earlier toolkits with the latest CUTLASS for some time, so it would be great if this could be fixed.

@thakkarV
Collaborator

@hwu36 @yzhaiustc I think this needs a fix before tagging.

@thakkarV
Collaborator

@manishucsd thanks for the detailed bug report :)

@dtometzki

Hello all,

perhaps this is the same issue as #2065?

@wenlei-bao
Contributor

+1

@hwu36
Collaborator

hwu36 commented Jan 28, 2025

Hopefully it is fixed in #2066

@hwu36 hwu36 closed this as completed Feb 6, 2025