[Happy Open Source] Paddle Tensor Standardization Phase 2, API support for 0-size Tensor No.13-17: paddle.mean, paddle.sum, paddle.prod, paddle.var, paddle.std #71504
Open
cangtianhuang wants to merge 15 commits into PaddlePaddle:develop from cangtianhuang:support-0-size
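Your PR was submitted successfully. Thank you for contributing to this open-source project!
The Python-side change degraded the API's performance...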
PR Category
Operator Mechanism
PR Types
Bug fixes
Description
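This PR modifies the code and kernels behind the reduce methods paddle.mean, paddle.sum, paddle.prod, paddle.var, and paddle.std so that they handle 0-size Tensors. On the C++ side, several forward kernels are modified, but the backward kernels are left unchanged. On the Python side, the var method receives one small tweak. Expected values: ("mean", float("nan")), ("prod", 1), ("std", float("nan")), ("sum", 0), ("var", float("nan")).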
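As a rough illustration of the rule these forward kernels follow, the sketch below mimics it in plain Python: when the input has zero elements, the already-inferred output shape is simply filled with a per-op constant. The names FILL_VALUE, reduced_shape, numel, and zero_size_reduce are hypothetical and exist only for this example; the real logic lives in the C++ phi kernels and the shape-inference code, and paddle.var / paddle.std additionally go through a Python-side path.

```python
import math

# Fill constant used for each reduction on a 0-size input
# (values taken from the expected results listed in the PR description).
FILL_VALUE = {
    "mean": math.nan,
    "sum": 0.0,
    "prod": 1.0,
    "var": math.nan,
    "std": math.nan,
}


def numel(shape):
    """Number of elements; math.prod([]) == 1, so a 0-D tensor has one element."""
    return math.prod(shape)


def reduced_shape(shape, axis=None, keepdim=False):
    """Shape after reducing `shape` over `axis` (None means all axes)."""
    ndim = len(shape)
    if axis is None:
        axes = set(range(ndim))
    else:
        axis_list = [axis] if isinstance(axis, int) else list(axis)
        axes = {a % ndim for a in axis_list}  # normalize negative axes
    if keepdim:
        return [1 if i in axes else d for i, d in enumerate(shape)]
    return [d for i, d in enumerate(shape) if i not in axes]


def zero_size_reduce(op, shape, axis=None, keepdim=False):
    """Return (out_shape, flat_values) for a 0-size input, else None."""
    if numel(shape) != 0:
        return None  # the regular reduction path would run instead
    out_shape = reduced_shape(shape, axis, keepdim)
    # Fill every output element with the op's constant and return early.
    return out_shape, [FILL_VALUE[op]] * numel(out_shape)
```

For instance, zero_size_reduce("sum", [2, 0, 4], axis=1) yields shape [2, 4] holding eight zeros, whereas reducing [0, 3] over axis=1 yields an empty output of shape [0], so the fill has nothing to write.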
The specific changes are as follows:
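1.
paddle.mean
The C++ forward kernels in the phi library are modified for multiple devices, including cpu, kps, onednn, and xpu. To maximize kernel reuse, the change is concentrated in the underlying MeanRawKernel. Handling for 0-size Tensors is added in paddle\phi\kernels\cpu\reduce_mean_kernel.cc, paddle\phi\kernels\kps\reduce_kernel.cu, paddle\phi\kernels\onednn\reduce_mean_kernel.cc, and paddle\phi\kernels\xpu\reduce_mean_kernel.cc. Because the output shape has already been correctly inferred by SumRawInferMeta and stored in DenseTensor* out, no extra inference logic is needed: the kernel calls FullKernel directly, fills the output with NaN, and returns. In addition, the onednn FullKernel (in paddle\phi\kernels\onednn\full_kernel.cc) previously did not support the phi::dtype::bfloat16 type, so that type has been added to its registration.
2.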
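paddle.sum
Following the same approach as paddle.mean, the C++ forward kernels in the phi library are modified, with the change concentrated in SumRawKernel. Handling for 0-size Tensors is added in paddle\phi\kernels\cpu\reduce_sum_kernel.cc, paddle\phi\kernels\kps\reduce_kernel.cu, paddle\phi\kernels\onednn\reduce_sum_kernel.cc, and paddle\phi\kernels\xpu\reduce_sum_kernel.cc: the kernel calls FullKernel directly, fills the output with 0, and returns. #70379 added a fair amount of 0-size Tensor inference logic to SumRawKernel; I consider it unnecessary and have removed the redundant part.
3.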
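paddle.prod
Following the same approach as paddle.mean, the C++ forward kernels in the phi library are modified, with the change concentrated in ProdKernel. Handling for 0-size Tensors is added in paddle\phi\kernels\cpu\prod_kernel.cc, paddle\phi\kernels\kps\reduce_kernel.cu, and paddle\phi\kernels\xpu\prod_kernel.cc (onednn has no ProdKernel): the kernel calls FullKernel directly, fills the output with 1, and returns.
4.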
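paddle.var
On the C++ side, paddle.var has no separate per-device operator; it is executed by VarianceKernel, which in turn calls Mean, Subtract, Multiply, and MeanKernel. Handling for 0-size Tensors is therefore added directly in paddle\phi\kernels\reduce_variance_kernel.cc, which calls FullKernel to fill the output with NaN and returns.
On the Python side, in python\paddle\tensor\stat.py, the goal is to keep dynamic-graph and static-graph behavior consistent without degrading performance. The n > one_const branch used in the unbiased case is tweaked so that n is left unchanged when n<=1. With unbiased = True, n=1 cannot be made unbiased, and n=0 makes out/0 yield the expected NaN. With unbiased = False, n=0 likewise makes out/0 yield the expected NaN. Shape and value therefore always meet expectations.
5.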
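paddle.std
On the Python side, paddle.std calls paddle.var and takes the square root. For a 0-size Tensor, the square root of NaN is still NaN, so the code is unchanged.
In testing, paddle.mean, paddle.sum, paddle.prod, paddle.var, and paddle.std pass the “0-size tensor API support for 0-size Tensor No.13-17” tests. They also behave as expected on several 0-size Tensor shapes, including [], [0,], [0, 3], [2, 0, 4].
Unit tests:

Self-written tests: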
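Independently of the PR's own tests, the divisor rule described for paddle.var in item 4 can be traced by hand. What follows is a minimal pure-Python sketch under stated assumptions, not the code in stat.py and not the author's test script: ieee_div, var_divisor, ref_var, and ref_std are hypothetical helpers, and ieee_div only emulates the floating-point result of dividing by zero (0.0/0.0 is NaN), because plain Python raises ZeroDivisionError instead.

```python
import math


def ieee_div(num, den):
    """Float division that returns IEEE 754 results for a zero denominator."""
    if den != 0:
        return num / den
    if num == 0 or math.isnan(num):
        return math.nan  # 0/0 is NaN
    return math.copysign(math.inf, num)  # x/0 is +/-inf for nonzero x


def var_divisor(n, unbiased=True):
    """Divisor applied to the summed squared deviations.

    Assumed rule, following the PR description: subtract 1 only when
    unbiased and n > 1; otherwise leave n unchanged (including n == 0).
    """
    return n - 1 if unbiased and n > 1 else n


def ref_var(values, unbiased=True):
    """Reference variance of a flat list of floats."""
    n = len(values)
    mean = ieee_div(sum(values), n)
    sq_sum = sum((v - mean) ** 2 for v in values)  # 0 for an empty list
    return ieee_div(sq_sum, var_divisor(n, unbiased))


def ref_std(values, unbiased=True):
    """Reference standard deviation: sqrt of the variance (sqrt(NaN) is NaN)."""
    return math.sqrt(ref_var(values, unbiased))
```

With an empty input the divisor stays 0 for both unbiased settings, so the result is 0/0, that is NaN, and ref_std propagates it. With a single element the divisor stays 1 and the variance is 0.0.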
