Our project currently uses DLPack to convert Torch tensors to/from TVM runtime arguments. However, this approach introduces noticeable runtime overhead, as discussed here: Strange Overhead of TVM Runtime NDArray from DLPack.
For reference, some projects such as BitBLAS implement custom CUDA-based solutions with ctypes wrappers to avoid this overhead. However, such approaches often lack comprehensive handling of tensor attributes such as shape, strides, and data type.
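A rough sketch of why the ctypes-style route loses attribute handling (the `scale_inplace` function below is hypothetical, standing in for a ctypes-wrapped CUDA kernel launcher): the callee only ever receives a raw pointer, so shape, strides, and dtype must be tracked manually by the caller.

```python
import ctypes

def scale_inplace(ptr, numel):
    # A ctypes-wrapped kernel sees only an address; it cannot recover
    # shape/strides/dtype from it, so the caller must pass them or assume them.
    arr = ctypes.cast(ptr, ctypes.POINTER(ctypes.c_float))
    for i in range(numel):
        arr[i] *= 2.0

buf = (ctypes.c_float * 6)(*[float(i) for i in range(6)])
ptr = ctypes.cast(buf, ctypes.c_void_p).value

# Metadata lives outside the pointer: the caller carries it by hand,
# which is the bookkeeping DLPack normally does for free.
shape, strides_bytes = (2, 3), (12, 4)
scale_inplace(ptr, shape[0] * shape[1])
```

Any mismatch between this hand-carried metadata and the actual buffer is silent memory corruption, which is the fragility being traded for speed.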
A more appropriate and efficient solution would be to use Torch's C++ extension mechanism to bridge Torch tensors and TVM NDArray objects directly, avoiding the overhead of the DLPack conversion path while still preserving tensor attributes.
We should explore the feasibility of implementing this integration with Torch’s C++ extensions to mitigate the current performance bottleneck.
Pull Request #12 introduced a JIT (just-in-time) component for TileLang, streamlining its functionality. A C++ extension can be built on top of this component to further enhance its capabilities.