[Hackathon 8th No.2] Add baddbmm API to Paddle -part #70757

Qin-sx · 2025-01-09T13:20:37Z

PR Category

User Experience

PR Types

New features

Description

https://github.com/PaddlePaddle/community/blob/master/hackathon/hackathon_8th/%E3%80%90Hackathon_8th%E3%80%91%E4%B8%AA%E4%BA%BA%E6%8C%91%E6%88%98%E8%B5%9B%E2%80%94%E6%A1%86%E6%9E%B6%E5%BC%80%E5%8F%91%E4%BB%BB%E5%8A%A1%E5%90%88%E9%9B%86.md#no2-%E4%B8%BA-paddle-%E6%96%B0%E5%A2%9E-baddbmm-api

Added specializations for float16 and bfloat16.

RFC document: [Hackathon 8th No.2] Add baddbmm API to Paddle community#1051
Chinese documentation: [Hackathon 8th No.2] Add baddbmm API to Paddle docs#7041

modified: paddle/fluid/pybind/pybind.cc modified: paddle/phi/core/memory/stats.cc modified: paddle/phi/core/memory/stats.h modified: python/paddle/device/cuda/__init__.py

modified: paddle/fluid/pybind/pybind.cc modified: python/paddle/device/cuda/__init__.py

modified: paddle/fluid/pybind/pybind.cc modified: paddle/phi/core/memory/stats.cc modified: paddle/phi/core/memory/stats.h modified: test/cpp/fluid/memory/stats_test.cc

new file: test/legacy_test/test_cuda_memory_stats.py new file: test/legacy_test/test_cuda_reset_peak_memory_stats.py

new file: test/legacy_test/test_cuda_reset_max_memory_allocated.py

modified: python/paddle/device/cuda/__init__.py modified: test/legacy_test/test_cuda_memory_stats.py modified: test/legacy_test/test_cuda_reset_max_memory_allocated.py modified: test/legacy_test/test_cuda_reset_peak_memory_stats.py

modified: paddle/fluid/pybind/pybind.cc modified: paddle/phi/core/memory/stats.cc modified: paddle/phi/core/memory/stats.h modified: test/cpp/fluid/memory/stats_test.cc

modified: python/paddle/device/cuda/__init__.py modified: test/legacy_test/test_cuda_reset_max_memory_allocated.py new file: test/legacy_test/test_cuda_reset_max_memory_reserved.py

modified: paddle/fluid/pybind/pybind.cc modified: python/paddle/device/cuda/__init__.py deleted: test/legacy_test/test_cuda_memory_stats.py deleted: test/legacy_test/test_cuda_reset_peak_memory_stats.py

modified: paddle/phi/infermeta/ternary.cc modified: paddle/phi/infermeta/ternary.h new file: paddle/phi/kernels/baddbmm_grad_kernel.h new file: paddle/phi/kernels/baddbmm_kernel.h new file: paddle/phi/kernels/cpu/baddbmm_grad_kernel.cc new file: paddle/phi/kernels/cpu/baddbmm_kernel.cc modified: paddle/phi/kernels/funcs/blas/blas.h modified: paddle/phi/kernels/funcs/blas/blas_impl.cu.h modified: paddle/phi/kernels/funcs/blas/blas_impl.h new file: paddle/phi/kernels/gpu/baddbmm_grad_kernel.cu new file: paddle/phi/kernels/gpu/baddbmm_kernel.cu new file: paddle/phi/kernels/impl/baddbmm_grad_kernel_impl.h new file: paddle/phi/kernels/impl/baddbmm_kernel_impl.h modified: paddle/phi/ops/yaml/backward.yaml modified: paddle/phi/ops/yaml/ops.yaml modified: python/paddle/__init__.py modified: python/paddle/tensor/__init__.py modified: python/paddle/tensor/math.py

modified: paddle/fluid/pir/dialect/op_generator/decomp_interface_gen_op_list.py modified: paddle/fluid/pir/dialect/operator/interface/infer_symbolic_shape/multiary_infer_sym.cc modified: paddle/fluid/pir/dialect/operator/interface/infer_symbolic_shape/multiary_infer_sym.h modified: paddle/fluid/primitive/decomp_rule/decomp_rule/composite.h modified: paddle/phi/api/ext/tensor_compat.h modified: paddle/phi/kernels/impl/baddbmm_grad_kernel_impl.h modified: paddle/phi/ops/yaml/op_compat.yaml new file: test/legacy_test/test_baddbmm_op.py

paddle-bot · 2025-01-09T13:20:43Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

modified: ../paddle/phi/kernels/funcs/blas/blas_impl.cu.h modified: ../paddle/phi/kernels/impl/baddbmm_kernel_impl.h

modified: ../paddle/phi/ops/yaml/ops.yaml

modified: ../test/legacy_test/test_baddbmm_op.py

modified: ../test/legacy_test/test_baddbmm_op.py new file: ../test/legacy_test/test_baddbmm_op1.py new file: ../test/legacy_test/test_baddbmm_op2.py new file: ../test/legacy_test/test_baddbmm_op3.py

modified: ../test/legacy_test/test_baddbmm_op1.py

modified: ../paddle/phi/api/ext/tensor_compat.h

modified: ../paddle/phi/kernels/funcs/blas/blas_impl.hip.h

modified: ../test/legacy_test/test_baddbmm_op3.py new file: ../test/legacy_test/test_baddbmm_op4.py

Qin-sx · 2025-01-17T02:28:06Z

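Please add the precision verification code and the precision error to the PR description.

The verification code has been added to the RFC document.

The test results are as follows:

float32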

max diff between paddle matmul and torch baddbmm is 0.0021514892578125
max diff between paddle baddbmm and torch baddbmm is 0.0
max diff between paddle baddbmm and paddle matmul is 0.0021514892578125

float16

max diff between paddle matmul and torch baddbmm is 0.25
max diff between paddle baddbmm and torch baddbmm is 0.0
max diff between paddle baddbmm and paddle matmul is 0.25

bfloat16

max diff between paddle matmul and torch baddbmm is 2.0
max diff between paddle baddbmm and torch baddbmm is 0.0
max diff between paddle baddbmm and paddle matmul is 2.0
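
To show how a figure such as 0.25 can arise, here is a toy sketch in pure Python. It is not the verification code from the RFC. It uses made-up values, emulates float16 rounding with the standard `struct` module, and compares a result rounded once at the end against one whose intermediate values are rounded first. The helper names `round_fp16` and `max_abs_diff` are hypothetical, and the thread does not say that this effect is the cause of the differences reported above.

```python
import struct


def round_fp16(v: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", v))[0]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat sequences."""
    return max(abs(p - q) for p, q in zip(a, b))


# Toy values (assumptions, not data from the PR): a single output element
# where the matrix product contributes 256.1 and the added input 0.1.
product, addend = 256.1, 0.1

# Path A: add in higher precision, round to float16 once at the end.
fused = [round_fp16(product + addend)]

# Path B: round the product and the addend first, then round their sum.
unfused = [round_fp16(round_fp16(product) + round_fp16(addend))]

print(max_abs_diff(fused, unfused))
```

With these toy values the two paths land on neighbouring float16 values, 256.25 and 256.0, so the maximum difference equals one float16 step at that magnitude.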

Qin-sx · 2025-01-17T12:16:46Z

RFC document: [Hackathon 8th No.2] Add baddbmm API to Paddle community#1051
API documentation: [Hackathon 8th No.2] Add baddbmm API to Paddle docs#7041

modified: paddle/fluid/pir/dialect/operator/interface/infer_symbolic_shape/multiary_infer_sym.cc modified: paddle/phi/infermeta/ternary.cc modified: paddle/phi/kernels/impl/baddbmm_grad_kernel_impl.h modified: paddle/phi/kernels/impl/baddbmm_kernel_impl.h modified: python/paddle/tensor/math.py modified: test/legacy_test/CMakeLists.txt new file: test/legacy_test/test_baddbmm_op6.py

modified: ../paddle/phi/infermeta/ternary.cc modified: ../test/legacy_test/test_baddbmm_op3.py

modified: ../python/paddle/tensor/math.py new file: ../test/legacy_test/test_baddbmm_op7.py

modified: ../test/legacy_test/CMakeLists.txt modified: ../test/legacy_test/test_baddbmm_op7.py

modified: paddle/phi/kernels/funcs/blas/blas_impl.cu.h

Qin-sx · 2025-01-23T01:35:47Z

Related issues for the two parts not covered by PR-CI-Coverage:
#70848
#70888

DrownFish19 · 2025-01-23T03:18:53Z

python/paddle/tensor/math.py

+                raise ValueError(
+                    f"If input's dimension[2] is not equal to y's dimension[2], input's dimension[2] must be 1. But received input's dimension[2] = {input_shape[2]}, y's dimension[2] = {y_shape[2]}"
+                )
+            # The check above already covers this case


If this comment is unnecessary, it can be deleted.

Got it, fixed. Thanks.
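
The error message in the snippet above describes a broadcast rule for `input`. A minimal sketch of such a check follows. It assumes that `input`, `x`, and `y` are all 3-D and that every dimension of `input` must either match the output shape or be 1. Only the rule for dimension[2] appears in this thread; the other checks, the exact messages, and the function name `check_baddbmm_shapes` are illustrative assumptions rather than Paddle's implementation.

```python
def check_baddbmm_shapes(input_shape, x_shape, y_shape):
    """Return the output shape or raise ValueError (hypothetical helper)."""
    for name, shape in (("input", input_shape), ("x", x_shape), ("y", y_shape)):
        if len(shape) != 3:
            raise ValueError(f"{name} must be 3-D, but received shape {tuple(shape)}")

    # Batched matmul: x is [b, n, m] and y is [b, m, p].
    if x_shape[0] != y_shape[0]:
        raise ValueError("x and y must have the same batch size")
    if x_shape[2] != y_shape[1]:
        raise ValueError("x's dimension[2] must equal y's dimension[1]")

    out_shape = (x_shape[0], x_shape[1], y_shape[2])

    # Assumed rule: each input dimension matches the output or is 1.
    for axis, expected in enumerate(out_shape):
        if input_shape[axis] != expected and input_shape[axis] != 1:
            raise ValueError(
                f"input's dimension[{axis}] must be {expected} or 1, "
                f"but received {input_shape[axis]}"
            )
    return out_shape


print(check_baddbmm_shapes((2, 3, 1), (2, 3, 4), (2, 4, 5)))
```

An `input` of shape `(2, 3, 2)` with the same `x` and `y` would raise `ValueError`, because 2 is neither 5 nor 1.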

DrownFish19 · 2025-01-23T03:19:09Z

python/paddle/tensor/math.py

+                raise ValueError(
+                    f"If input's dimension[2] is not equal to y's dimension[2], input's dimension[2] must be 1. But received input's dimension[2] = {input_shape[2]}, y's dimension[2] = {y_shape[2]}"
+                )
+            # The check above already covers this case


Same issue with the comment here.

Got it, fixed. Thanks.

modified: ../paddle/phi/api/ext/tensor_compat.h modified: ../python/paddle/tensor/math.py

DrownFish19 · 2025-02-06T03:53:52Z

QA has tested this: coverage for header files is indeed not displayed, so this can be exempted.

DrownFish19

LGTM

jeff41404 · 2025-02-07T06:56:27Z

python/paddle/tensor/math.py

+    name: str | None = None,
+) -> Tensor:
+    """
+    Inplace version of ``baddbmm`` API, the output Tensor will be inplaced with input ``x``.


Should this read "the output Tensor will be inplaced with input ``input``" rather than ``x``?

I based this on the docstring of the addmm_ function:

""" Inplace version of ``addmm`` API, the output Tensor will be inplaced with input ``x``. """

Should the output Tensor be updated in place into input or into x?
Or should I change the comments for the addmm and baddbmm functions to:

""" , the output Tensor will be inplaced with input ``input``. """

Inplace means that the output is written to one of the inputs, reusing that input's memory to save space; which input is used is determined by the yaml configuration. Looking at the configuration of addmm: when addmm_ is used, its output is written to the parameter named input, so the addmm_ documentation needs to be corrected. The same applies to baddbmm_.

I have modified them. Thank you.
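
The in-place semantics discussed in this thread can be illustrated with a small pure-Python sketch on nested lists. It assumes the usual definition `beta * input + alpha * (x @ y)` and writes the result back into the `input` argument, leaving `x` and `y` untouched. The function name `baddbmm_inplace_sketch` is hypothetical; this is not Paddle's kernel.

```python
def baddbmm_inplace_sketch(inp, x, y, beta=1.0, alpha=1.0):
    """Overwrite `inp` with beta * inp + alpha * (x @ y) and return it.

    inp is [b][n][p], x is [b][n][m], y is [b][m][p] (nested lists).
    """
    for b, (xb, yb) in enumerate(zip(x, y)):
        for i, x_row in enumerate(xb):
            for j in range(len(yb[0])):
                # Dot product of row i of x[b] with column j of y[b].
                acc = sum(x_row[k] * yb[k][j] for k in range(len(yb)))
                # The result reuses the storage of `inp`, not of `x`.
                inp[b][i][j] = beta * inp[b][i][j] + alpha * acc
    return inp


inp = [[[1.0, 2.0], [3.0, 4.0]]]
x = [[[1.0, 0.0], [0.0, 1.0]]]
y = [[[5.0, 6.0], [7.0, 8.0]]]

result = baddbmm_inplace_sketch(inp, x, y, beta=0.5, alpha=2.0)
print(result is inp, inp)
```

The returned object is the same list that was passed as `inp`, which mirrors the point made above: the docstring should name `input`, not `x`, as the argument that receives the output.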

jeff41404 · 2025-02-07T07:09:30Z

test/legacy_test/CMakeLists.txt

+set_tests_properties(test_baddbmm_op PROPERTIES TIMEOUT 120)
+set_tests_properties(test_baddbmm_op1 PROPERTIES TIMEOUT 120)
+set_tests_properties(test_baddbmm_op2 PROPERTIES TIMEOUT 120)
+set_tests_properties(test_baddbmm_op3 PROPERTIES TIMEOUT 120)
+set_tests_properties(test_baddbmm_op4 PROPERTIES TIMEOUT 120)
+set_tests_properties(test_baddbmm_op5 PROPERTIES TIMEOUT 120)
+set_tests_properties(test_baddbmm_op6 PROPERTIES TIMEOUT 120)
+set_tests_properties(test_baddbmm_op7 PROPERTIES TIMEOUT 120)


Why are 8 unit test files needed? Can we merge them all into test_baddbmm_op.py?

I have merged the tests into a single file.
I was not familiar with the test timeout settings at first and could not get the tests to pass in the cloud environment, so I split them into multiple files while debugging.

modified: test/legacy_test/test_baddbmm_op.py

modified: test/legacy_test/CMakeLists.txt deleted: test/legacy_test/test_baddbmm_op1.py deleted: test/legacy_test/test_baddbmm_op2.py deleted: test/legacy_test/test_baddbmm_op3.py deleted: test/legacy_test/test_baddbmm_op4.py deleted: test/legacy_test/test_baddbmm_op5.py deleted: test/legacy_test/test_baddbmm_op6.py deleted: test/legacy_test/test_baddbmm_op7.py

luotao1 · 2025-02-08T08:15:07Z

test/legacy_test/CMakeLists.txt

@@ -888,6 +888,8 @@ set_tests_properties(test_callback_wandb PROPERTIES TIMEOUT 60)
 set_tests_properties(test_jit_save_load PROPERTIES TIMEOUT 100)
 set_tests_properties(test_pool2d_op PROPERTIES TIMEOUT 120)

+set_tests_properties(test_baddbmm_op PROPERTIES TIMEOUT 900)


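Judging from the actual run, the test takes only 21s, so 900s seems far too much here. 50s would be enough.

But this test seems quite unstable. When I split it into multiple tests earlier, each one sometimes exceeded the default 20s. My guess is that multiple tests or jobs running on the same server at the same time affect one another. Could I change it to 500s for now?

Could I change it to 500s for now?

You could rebuild the coverage pipeline, see how long the test takes in a fresh run, and then decide how many seconds to set.

Got it. I will change it to 100 seconds for now. I have also run into timeouts in other tests in PR-CI-Coverage before, which required a manual rebuild. I still think the test environment is occasionally unstable, which makes the tests take longer.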

modified: test/legacy_test/CMakeLists.txt

modified: python/paddle/tensor/math.py

jeff41404

LGTM

sunzhongkai588

LGTM for docs.

The docs rendering CI ran into some problems; let's check how it looks on the official website after the merge.

Qin-sx added 12 commits December 5, 2024 06:36

added reset peak value initialization

e2299f9

modified: paddle/fluid/pybind/pybind.cc modified: paddle/phi/core/memory/stats.cc modified: paddle/phi/core/memory/stats.h modified: python/paddle/device/cuda/__init__.py

added comments

dc30348

modified: paddle/fluid/pybind/pybind.cc modified: python/paddle/device/cuda/__init__.py

added cpp tests

d081dde

modified: paddle/fluid/pybind/pybind.cc modified: paddle/phi/core/memory/stats.cc modified: paddle/phi/core/memory/stats.h modified: test/cpp/fluid/memory/stats_test.cc

added python tests

35a12c8

new file: test/legacy_test/test_cuda_memory_stats.py new file: test/legacy_test/test_cuda_reset_peak_memory_stats.py

added a python test for reset_max_memory_allocated

cb5036c

new file: test/legacy_test/test_cuda_reset_max_memory_allocated.py

formatted by pre-commit

5d28856

modified: python/paddle/device/cuda/__init__.py modified: test/legacy_test/test_cuda_memory_stats.py modified: test/legacy_test/test_cuda_reset_max_memory_allocated.py modified: test/legacy_test/test_cuda_reset_peak_memory_stats.py

formatted by pre-commit (clang-format)

f6b3d84

modified: paddle/fluid/pybind/pybind.cc modified: paddle/phi/core/memory/stats.cc modified: paddle/phi/core/memory/stats.h modified: test/cpp/fluid/memory/stats_test.cc

added reset max memory reserved function

b6ea9be

modified: python/paddle/device/cuda/__init__.py modified: test/legacy_test/test_cuda_reset_max_memory_allocated.py new file: test/legacy_test/test_cuda_reset_max_memory_reserved.py

deleted memory stats and reset peak memory stats

caad368

modified: paddle/fluid/pybind/pybind.cc modified: python/paddle/device/cuda/__init__.py deleted: test/legacy_test/test_cuda_memory_stats.py deleted: test/legacy_test/test_cuda_reset_peak_memory_stats.py

Merge branch 'develop' of https://github.com/Qin-sx/Paddle into develop

299a7c0

paddle-bot bot added the contributor External developers label Jan 9, 2025

luotao1 added the PaddlePaddle Hackathon label Jan 10, 2025

luotao1 self-assigned this Jan 10, 2025

Qin-sx and others added 14 commits January 10, 2025 14:35

added bloat16 case

fdf6322

modified: ../paddle/phi/kernels/funcs/blas/blas_impl.cu.h modified: ../paddle/phi/kernels/impl/baddbmm_kernel_impl.h

added InferSymbolicShapeInterface

ed02d8d

modified: ../paddle/phi/ops/yaml/ops.yaml

added more tests

2b3ed56

modified: ../test/legacy_test/test_baddbmm_op.py

tests overtime (20s), reduced size

47de1e6

modified: ../test/legacy_test/test_baddbmm_op.py

test overtime again, reduced size again

da9467b

modified: ../test/legacy_test/test_baddbmm_op.py

reduce size for overtime again

85bb2f5

modified: ../test/legacy_test/test_baddbmm_op.py

just one test for overtime

39ae437

modified: ../test/legacy_test/test_baddbmm_op.py

divided tests in some files

c812aa0

modified: ../test/legacy_test/test_baddbmm_op.py new file: ../test/legacy_test/test_baddbmm_op1.py new file: ../test/legacy_test/test_baddbmm_op2.py new file: ../test/legacy_test/test_baddbmm_op3.py

enable static

284e601

modified: ../test/legacy_test/test_baddbmm_op1.py

deleted baddbmm in tensor_compat.h

c18014a

modified: ../paddle/phi/api/ext/tensor_compat.h

added float in hip

ebc6f56

modified: ../paddle/phi/kernels/funcs/blas/blas_impl.hip.h

pre-commit

cd54e10

modified: ../paddle/phi/kernels/funcs/blas/blas_impl.hip.h

added more tests

8afaf6a

modified: ../test/legacy_test/test_baddbmm_op3.py new file: ../test/legacy_test/test_baddbmm_op4.py

Merge branch 'PaddlePaddle:develop' into develop

306df76

Qin-sx added 6 commits January 19, 2025 00:10

added more tests

d06509b

modified: ../paddle/phi/infermeta/ternary.cc modified: ../test/legacy_test/test_baddbmm_op3.py

added more tests for baddbmm_

5c966a1

modified: ../python/paddle/tensor/math.py new file: ../test/legacy_test/test_baddbmm_op7.py

added more tests

c5598a5

modified: ../test/legacy_test/CMakeLists.txt modified: ../test/legacy_test/test_baddbmm_op7.py

typo

f5f1e38

modified: paddle/phi/kernels/funcs/blas/blas_impl.cu.h

Merge branch 'develop' into develop_baddbmm_format

d572d4d

DrownFish19 reviewed Jan 23, 2025

deleted comments and 'print'

9c6ac6a

modified: ../paddle/phi/api/ext/tensor_compat.h modified: ../python/paddle/tensor/math.py

luotao1 assigned jeff41404 and sunzhongkai588 Feb 6, 2025

DrownFish19 approved these changes Feb 7, 2025

jeff41404 reviewed Feb 7, 2025

Qin-sx added 2 commits February 7, 2025 21:34

merged unit tests into one file

f1bd7be

modified: test/legacy_test/test_baddbmm_op.py

Qin-sx force-pushed the develop_baddbmm_format branch from 1133b87 to 11b3f45 Compare February 7, 2025 14:22

luotao1 added the API label Feb 8, 2025

luotao1 reviewed Feb 8, 2025

Qin-sx added 2 commits February 8, 2025 19:43

reduced timeout

02690cc

modified: test/legacy_test/CMakeLists.txt

modified comments for inplace functions

180653a

modified: python/paddle/tensor/math.py

jeff41404 approved these changes Feb 11, 2025

sunzhongkai588 approved these changes Feb 11, 2025

luotao1 merged commit 02abd69 into PaddlePaddle:develop Feb 11, 2025
31 checks passed

luotao1 changed the title ~~[Hackathon 8th No.2] Add baddbmm API to Paddle~~ [Hackathon 8th No.2] Add baddbmm API to Paddle -part Feb 11, 2025

luotao1 mentioned this pull request Mar 5, 2025

[Hackathon 8th] Open Source Contribution Individual Challenge #71310

Open
