Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

【Hackathon 8th No.2】为 Paddle 新增 baddbmm API -part #70757

Merged
merged 42 commits into from
Feb 11, 2025

Conversation

Qin-sx added 12 commits December 5, 2024 06:36
	modified:   paddle/fluid/pybind/pybind.cc
	modified:   paddle/phi/core/memory/stats.cc
	modified:   paddle/phi/core/memory/stats.h
	modified:   python/paddle/device/cuda/__init__.py
	modified:   paddle/fluid/pybind/pybind.cc
	modified:   python/paddle/device/cuda/__init__.py
	modified:   paddle/fluid/pybind/pybind.cc
	modified:   paddle/phi/core/memory/stats.cc
	modified:   paddle/phi/core/memory/stats.h
	modified:   test/cpp/fluid/memory/stats_test.cc
	new file:   test/legacy_test/test_cuda_memory_stats.py
	new file:   test/legacy_test/test_cuda_reset_peak_memory_stats.py
	new file:   test/legacy_test/test_cuda_reset_max_memory_allocated.py
	modified:   python/paddle/device/cuda/__init__.py
	modified:   test/legacy_test/test_cuda_memory_stats.py
	modified:   test/legacy_test/test_cuda_reset_max_memory_allocated.py
	modified:   test/legacy_test/test_cuda_reset_peak_memory_stats.py
	modified:   paddle/fluid/pybind/pybind.cc
	modified:   paddle/phi/core/memory/stats.cc
	modified:   paddle/phi/core/memory/stats.h
	modified:   test/cpp/fluid/memory/stats_test.cc
	modified:   python/paddle/device/cuda/__init__.py
	modified:   test/legacy_test/test_cuda_reset_max_memory_allocated.py
	new file:   test/legacy_test/test_cuda_reset_max_memory_reserved.py
	modified:   paddle/fluid/pybind/pybind.cc
	modified:   python/paddle/device/cuda/__init__.py
	deleted:    test/legacy_test/test_cuda_memory_stats.py
	deleted:    test/legacy_test/test_cuda_reset_peak_memory_stats.py
	modified:   paddle/phi/infermeta/ternary.cc
	modified:   paddle/phi/infermeta/ternary.h
	new file:   paddle/phi/kernels/baddbmm_grad_kernel.h
	new file:   paddle/phi/kernels/baddbmm_kernel.h
	new file:   paddle/phi/kernels/cpu/baddbmm_grad_kernel.cc
	new file:   paddle/phi/kernels/cpu/baddbmm_kernel.cc
	modified:   paddle/phi/kernels/funcs/blas/blas.h
	modified:   paddle/phi/kernels/funcs/blas/blas_impl.cu.h
	modified:   paddle/phi/kernels/funcs/blas/blas_impl.h
	new file:   paddle/phi/kernels/gpu/baddbmm_grad_kernel.cu
	new file:   paddle/phi/kernels/gpu/baddbmm_kernel.cu
	new file:   paddle/phi/kernels/impl/baddbmm_grad_kernel_impl.h
	new file:   paddle/phi/kernels/impl/baddbmm_kernel_impl.h
	modified:   paddle/phi/ops/yaml/backward.yaml
	modified:   paddle/phi/ops/yaml/ops.yaml
	modified:   python/paddle/__init__.py
	modified:   python/paddle/tensor/__init__.py
	modified:   python/paddle/tensor/math.py
	modified:   paddle/fluid/pir/dialect/op_generator/decomp_interface_gen_op_list.py
	modified:   paddle/fluid/pir/dialect/operator/interface/infer_symbolic_shape/multiary_infer_sym.cc
	modified:   paddle/fluid/pir/dialect/operator/interface/infer_symbolic_shape/multiary_infer_sym.h
	modified:   paddle/fluid/primitive/decomp_rule/decomp_rule/composite.h
	modified:   paddle/phi/api/ext/tensor_compat.h
	modified:   paddle/phi/kernels/impl/baddbmm_grad_kernel_impl.h
	modified:   paddle/phi/ops/yaml/op_compat.yaml
	new file:   test/legacy_test/test_baddbmm_op.py
Copy link

paddle-bot bot commented Jan 9, 2025

你的PR提交成功,感谢你对开源项目的贡献!
请关注后续CI自动化测试结果,详情请参考Paddle-CI手册
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Jan 9, 2025
@luotao1 luotao1 self-assigned this Jan 10, 2025
Qin-sx and others added 14 commits January 10, 2025 14:35
	modified:   ../paddle/phi/kernels/funcs/blas/blas_impl.cu.h
	modified:   ../paddle/phi/kernels/impl/baddbmm_kernel_impl.h
	modified:   ../paddle/phi/ops/yaml/ops.yaml
	modified:   ../test/legacy_test/test_baddbmm_op.py
	modified:   ../test/legacy_test/test_baddbmm_op.py
	modified:   ../test/legacy_test/test_baddbmm_op.py
	modified:   ../test/legacy_test/test_baddbmm_op.py
	modified:   ../test/legacy_test/test_baddbmm_op.py
	modified:   ../test/legacy_test/test_baddbmm_op.py
	new file:   ../test/legacy_test/test_baddbmm_op1.py
	new file:   ../test/legacy_test/test_baddbmm_op2.py
	new file:   ../test/legacy_test/test_baddbmm_op3.py
	modified:   ../test/legacy_test/test_baddbmm_op1.py
	modified:   ../paddle/phi/api/ext/tensor_compat.h
	modified:   ../paddle/phi/kernels/funcs/blas/blas_impl.hip.h
	modified:   ../paddle/phi/kernels/funcs/blas/blas_impl.hip.h
	modified:   ../test/legacy_test/test_baddbmm_op3.py
	new file:   ../test/legacy_test/test_baddbmm_op4.py
@Qin-sx
Copy link
Contributor Author

Qin-sx commented Jan 17, 2025

辛苦在PR描述中补充精度验证代码和精度误差

验证代码添加在了RFC文档中。

以下为测试结果:

  1. float32类型
max diff between paddle matmul and torch baddbmm is 0.0021514892578125
max diff between paddle baddbmm and torch baddbmm is 0.0
max diff between paddle baddbmm and paddle matmul is 0.0021514892578125
  1. float16类型
max diff between paddle matmul and torch baddbmm is 0.25
max diff between paddle baddbmm and torch baddbmm is 0.0
max diff between paddle baddbmm and paddle matmul is 0.25
  1. bfloat16类型
max diff between paddle matmul and torch baddbmm is 2.0
max diff between paddle baddbmm and torch baddbmm is 0.0
max diff between paddle baddbmm and paddle matmul is 2.0

@Qin-sx
Copy link
Contributor Author

Qin-sx commented Jan 17, 2025

	modified:   paddle/fluid/pir/dialect/operator/interface/infer_symbolic_shape/multiary_infer_sym.cc
	modified:   paddle/phi/infermeta/ternary.cc
	modified:   paddle/phi/kernels/impl/baddbmm_grad_kernel_impl.h
	modified:   paddle/phi/kernels/impl/baddbmm_kernel_impl.h
	modified:   python/paddle/tensor/math.py
	modified:   test/legacy_test/CMakeLists.txt
	new file:   test/legacy_test/test_baddbmm_op6.py
	modified:   ../paddle/phi/infermeta/ternary.cc
	modified:   ../test/legacy_test/test_baddbmm_op3.py
	modified:   ../python/paddle/tensor/math.py
	new file:   ../test/legacy_test/test_baddbmm_op7.py
	modified:   ../test/legacy_test/CMakeLists.txt
	modified:   ../test/legacy_test/test_baddbmm_op7.py
	modified:   paddle/phi/kernels/funcs/blas/blas_impl.cu.h
@Qin-sx
Copy link
Contributor Author

Qin-sx commented Jan 23, 2025

关于PR-CI-Coverage没有覆盖到的两部分的相关issue
#70848
#70888

raise ValueError(
f"If input's dimension[2] is not equal to y's dimension[2], input's dimension[2] must be 1. But received input's dimension[2] = {input_shape[2]}, y's dimension[2] = {y_shape[2]}"
)
# 上面的判断已经包含了这种情况
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

注释如不必要,可以删除

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

收到,已修改,谢谢

raise ValueError(
f"If input's dimension[2] is not equal to y's dimension[2], input's dimension[2] must be 1. But received input's dimension[2] = {input_shape[2]}, y's dimension[2] = {y_shape[2]}"
)
# 上面的判断已经包含了这种情况
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

注释问题

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

收到,已修改,谢谢

	modified:   ../paddle/phi/api/ext/tensor_compat.h
	modified:   ../python/paddle/tensor/math.py
@DrownFish19
Copy link
Contributor

DrownFish19 commented Feb 6, 2025

QA已经测试,头文件覆盖率确实不会显示,可豁免

Copy link
Contributor

@DrownFish19 DrownFish19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

name: str | None = None,
) -> Tensor:
"""
Inplace version of ``baddbmm`` API, the output Tensor will be inplaced with input ``x``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the output Tensor will be inplaced with input input ? not x?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I referred to the addmm_ function before.

"""
Inplace version of ``addmm`` API, the output Tensor will be inplaced with input ``x``.
"""

Should the output Tensor be inplace updated to input or x?
Or should I change the comments for the functions addmm and baddmm to

"""
, the output Tensor will be inplaced with input ``input ``.
"""

Copy link
Contributor

@jeff41404 jeff41404 Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inplace means that the output needs to be written to a certain input(using the space of a certain input to save memory usage), which is determined by the yaml configuration. look to the configuration of addmm, when use addmm_, its output is written to a input parameter of input, the document of addmm_ needs to be modified correctly. so does baddbmm_ .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have modified them. Thank you.

Comment on lines 891 to 898
set_tests_properties(test_baddbmm_op PROPERTIES TIMEOUT 120)
set_tests_properties(test_baddbmm_op1 PROPERTIES TIMEOUT 120)
set_tests_properties(test_baddbmm_op2 PROPERTIES TIMEOUT 120)
set_tests_properties(test_baddbmm_op3 PROPERTIES TIMEOUT 120)
set_tests_properties(test_baddbmm_op4 PROPERTIES TIMEOUT 120)
set_tests_properties(test_baddbmm_op5 PROPERTIES TIMEOUT 120)
set_tests_properties(test_baddbmm_op6 PROPERTIES TIMEOUT 120)
set_tests_properties(test_baddbmm_op7 PROPERTIES TIMEOUT 120)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need 8 unit test files? Can we merge them all into test_baddbmm_op.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have merged the tests into a single file.
Since I was not familiar with the timeout settings for tests before, I couldn't pass the tests in the cloud environment. Therefore, I split them into multiple files during debugging.

	modified:   test/legacy_test/test_baddbmm_op.py
	modified:   test/legacy_test/CMakeLists.txt
	deleted:    test/legacy_test/test_baddbmm_op1.py
	deleted:    test/legacy_test/test_baddbmm_op2.py
	deleted:    test/legacy_test/test_baddbmm_op3.py
	deleted:    test/legacy_test/test_baddbmm_op4.py
	deleted:    test/legacy_test/test_baddbmm_op5.py
	deleted:    test/legacy_test/test_baddbmm_op6.py
	deleted:    test/legacy_test/test_baddbmm_op7.py
@Qin-sx Qin-sx force-pushed the develop_baddbmm_format branch from 1133b87 to 11b3f45 Compare February 7, 2025 14:22
@luotao1 luotao1 added the API label Feb 8, 2025
@@ -888,6 +888,8 @@ set_tests_properties(test_callback_wandb PROPERTIES TIMEOUT 60)
set_tests_properties(test_jit_save_load PROPERTIES TIMEOUT 100)
set_tests_properties(test_pool2d_op PROPERTIES TIMEOUT 120)

set_tests_properties(test_baddbmm_op PROPERTIES TIMEOUT 900)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image 从实际测试看,仅需21s,这里设置900s太多了吧。设置50s也够了。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

但是感觉这个测试很不稳定,之前我拆分成多个测试时,有时候每个测试都会超过默认的20s。我猜测是不是多个测试或者多个任务,同时在同一个服务器上运行,会互相影响。要不我先改为500s可以吗?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

要不我先改为500s可以吗

可以再重新构建下 coverage 流水线,看新一次测试跑多久,然后看改成多少秒。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

收到,那我先改为100秒吧。因为之前我在PR-CI-Coverage中也遇到过其他测试超时的情况,需要手动重新构建。我还是感觉测试环境偶尔会不稳定,导致测试耗时增加。

	modified:   test/legacy_test/CMakeLists.txt
	modified:   python/paddle/tensor/math.py
Copy link
Contributor

@jeff41404 jeff41404 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@sunzhongkai588 sunzhongkai588 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for docs。

文档渲染 CI 出了些问题,等合入后在官网再看看效果

@luotao1 luotao1 merged commit 02abd69 into PaddlePaddle:develop Feb 11, 2025
31 checks passed
@luotao1 luotao1 changed the title 【Hackathon 8th No.2】为 Paddle 新增 baddbmm API 【Hackathon 8th No.2】为 Paddle 新增 baddbmm API -part Feb 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants