
Use dpctl.tensor.matmul in the backend of dpnp.matmul when inputs are integer #2296

Merged: 9 commits into master from fix_issue-2270 on Feb 7, 2025

Conversation

vtavana (Collaborator) commented Feb 5, 2025

resolves #2270

The OneMath (oneMKL) routines for matrix multiplication (`gemm`, `gemv`, `gemm_batch`) only support floating-point data types. If the inputs are integer, using OneMath requires upcasting them to a floating-point dtype, performing the calculation, and then casting the result back to an integer dtype, which is unsafe: for large integers, information may be lost.
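As a NumPy-only illustration of that precision loss (not code from this PR): int64 values above 2**53 are not all exactly representable in float64, so a round-trip through floating point can silently change them.

import numpy

x = numpy.array([2**53 + 1], dtype="i8")
y = x.astype("f8").astype("i8")  # upcast to float64, then cast back
print(x[0], y[0])  # 9007199254740993 9007199254740992 -> information lost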
In this PR, the logic of `dpnp.matmul` is updated to use `dpctl.tensor.matmul` when the result has an integer dtype.
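A minimal sketch of that dispatch, with a hypothetical helper name (the actual logic lives in dpnp/dpnp_utils/dpnp_utils_linearalgebra.py):

import numpy
import dpctl.tensor as dpt
import dpnp

def matmul_dispatch_sketch(x1, x2):
    # Illustrative only: pick the backend from the result dtype.
    res_dtype = dpnp.result_type(x1, x2)
    if numpy.issubdtype(res_dtype, numpy.integer):
        # Integer result: dpctl.tensor.matmul computes it exactly,
        # with no round-trip through floating point.
        res = dpt.matmul(dpnp.get_usm_ndarray(x1), dpnp.get_usm_ndarray(x2))
        return dpnp.asarray(res)
    # Floating-point result: keep the OneMath gemm/gemv/gemm_batch path.
    return dpnp.matmul(x1, x2)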

Performance Analysis

$ sycl-ls
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Data Center GPU Max 1100 12.60.7 [1.6.31294+9]
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Xeon(R) Platinum 8480+ OpenCL 3.0 (Build 0) [2025.19.1.0.16_160000]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Data Center GPU Max 1100 OpenCL 3.0 NEO  [24.39.31294]
import numpy, dpnp

# Case 1 - Matrix-matrix multiplication
n=1024
a=numpy.ones((n,n), dtype="i8")
%timeit numpy.matmul(a, a)

ia=dpnp.array(a, device="gpu")
# wait() ensures the asynchronous SYCL kernel finishes inside the timed statement
%timeit dpnp.matmul(ia, ia); ia.sycl_queue.wait()

# Case 2 - Matrix-vector multiplication
n=4096*4
a=numpy.ones((n,n), dtype="i8")
b=numpy.ones((n,), dtype="i8")
%timeit numpy.matmul(a, b)

ia=dpnp.array(a, device="gpu")
ib=dpnp.array(b, device="gpu")
%timeit dpnp.matmul(ia, ib); ia.sycl_queue.wait()

# Case 3 - Vector-matrix multiplication
n=4096*4
a=numpy.ones((n,n), dtype="i8")
b=numpy.ones((n,), dtype="i8")
%timeit numpy.matmul(b, a)

ia=dpnp.array(a, device="gpu")
ib=dpnp.array(b, device="gpu")
%timeit dpnp.matmul(ib, ia); ia.sycl_queue.wait()

# Case 4 - Batch matrix-matrix multiplication
n=256
a=numpy.ones((n,n,n), dtype="i8")
%timeit numpy.matmul(a, a)

ia=dpnp.array(a, device="gpu")
%timeit dpnp.matmul(ia, ia); ia.sycl_queue.wait()
| Case | n | NumPy | dpnp (Xeon) | dpnp (PVC) |
|------|---|-------|-------------|------------|
| Case 1 | 1024 | 4.98 s ± 867 μs | 9.82 ms ± 40.6 μs | 3.34 ms ± 4.16 μs |
| Case 2 | 4096×16 | 4.2 s ± 616 μs | 9.77 s ± 112 ms | 1.27 s ± 2.04 ms |
| Case 3 | 4096×4 | 9.6 s ± 3.3 ms | 94.6 ms ± 2.15 ms | 50.9 ms ± 38.2 μs |
| Case 4 | 256 | 3.19 s ± 291 μs | 48.9 ms ± 244 μs | 10.4 ms ± 32.3 μs |

dpnp outperforms NumPy in all cases except Case 2 on the Xeon CPU.
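Beyond speed, the change is about exactness. A small sanity check one can run (values chosen so the exact integer result is not representable in float64, i.e. the old cast-through-float path could not return it):

import numpy, dpnp

a = numpy.full((4, 4), 2**30 + 1, dtype="i8")
ia = dpnp.array(a)

expected = numpy.matmul(a, a)  # exact int64 result: 4*(2**30 + 1)**2 per entry
assert (dpnp.asnumpy(dpnp.matmul(ia, ia)) == expected).all()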

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you filing the PR as a draft?

@vtavana vtavana self-assigned this Feb 5, 2025
github-actions bot (Contributor) commented Feb 5, 2025

View rendered docs @ https://intelpython.github.io/dpnp/index.html

coveralls (Collaborator) commented Feb 5, 2025

coverage: 71.629% (+0.06%) from 71.572% when pulling 1edddbe on fix_issue-2270 into 0c455a6 on master.

github-actions bot (Contributor) commented Feb 5, 2025

Array API standard conformance tests for dpnp=0.17.0dev5=py312he4f9c94_31 ran successfully.
Passed: 971
Failed: 0
Skipped: 29

@vtavana vtavana marked this pull request as ready for review February 6, 2025 00:47
Review threads (all resolved):
  • .github/workflows/conda-package.yml
  • dpnp/dpnp_utils/dpnp_utils_linearalgebra.py
  • dpnp/tests/test_product.py
antonwolfy (Contributor) left a comment:

Thank you @vtavana, I left two more minor comments, but overall LGTM!

Further review threads on dpnp/tests/test_product.py (resolved)
@vtavana vtavana merged commit db97d59 into master Feb 7, 2025
66 of 69 checks passed
@vtavana vtavana deleted the fix_issue-2270 branch February 7, 2025 22:32
github-actions bot added a commit that referenced this pull request Feb 7, 2025:
Use dpctl.tensor.matmul in the backend of dpnp.matmul when inputs are integer (#2296) db97d59
Successfully merging this pull request may close these issues:
  • dpnp.tensordot returns wrong result (#2270)