Cinn trivalop fuse #7

Open
wants to merge 1,060 commits into base: test_new_group_schedule_tile
1,060 commits
92f33f9
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
feifei-111 Mar 22, 2024
a7c64ae
DistModel supports feed of list (#62945)
zhiqiu Mar 22, 2024
ea6782a
update
feifei-111 Mar 24, 2024
d7768a7
API improvement: nn.functional.group_norm usability improvement (#62672)
NKNaN Mar 25, 2024
7af6629
Merge remote-tracking branch 'xiongkun/cinn-trivalop-fuse' into cinn-…
2742195759 Mar 25, 2024
bf1e66f
fix
2742195759 Mar 25, 2024
4768ff6
[OneDNN][PIR] conv elementwise add mkldnn fuse pass (#62713)
zhanglirong1999 Mar 25, 2024
7750ec4
Update errors.cc (#62924)
co63oc Mar 25, 2024
6261015
[Allocator] add new allocator strategy (#62638)
wanghuancoder Mar 25, 2024
6b3f90e
[PIR] A-13 Adapt expand test_errors (#62849)
ooooo-create Mar 25, 2024
129c651
[Inference] auto_mixed_precision_pass supports sparse tensor (#62656)
ming1753 Mar 25, 2024
cb41920
fix code format
2742195759 Mar 25, 2024
d39da6e
Fix enable_host_event_recorder_hook declare (#62921)
co63oc Mar 25, 2024
ac0a57c
【Error Message No.5】paddle/pir/include/* (#62851)
enkilee Mar 25, 2024
00f12db
【Error Message No. 4】 paddle/fluid/pir/transforms/* fix errors (#62840)
enkilee Mar 25, 2024
75f7be5
Update docs of _register_backward_hook (#62926)
MarioLulab Mar 25, 2024
acaf9f5
move port to phi/common/ (#62943)
changeyoung98 Mar 25, 2024
dc9af81
[CINN] support flash attention infer symbol (#62919)
phlrain Mar 25, 2024
a9843c0
fix code format
2742195759 Mar 25, 2024
a34b0a0
add insert broadcast for logical ops (#62985)
zyfncg Mar 25, 2024
d37bd8b
【Error Message No. 34】 fix `CHECK_*` in `paddle/pir` (#62886)
jinyouzhi Mar 25, 2024
285e444
fix small dimensions reduce (#62954)
BiynXu Mar 25, 2024
4836971
[Dy2St] Move `TypeHintTransformer` ahead of `IfElseTransformer` (#62947)
SigureMo Mar 25, 2024
0422de0
update the shape [1] instruction to 0D tensor (#62875)
ooooo-create Mar 25, 2024
177772a
remove unittest
2742195759 Mar 25, 2024
e5e4003
[Prim][PIR]Set rsqrt as primitive op (#62858)
cyber-pioneer Mar 25, 2024
03d85f7
update
feifei-111 Mar 25, 2024
a0439ff
update
feifei-111 Mar 25, 2024
4fce3e6
update (#77)
feifei-111 Mar 25, 2024
b31b61c
Improve the performance of fused api add_double_grad (#62474)
YibinLiu666 Mar 25, 2024
e372701
Revise LayerNorm English documentation (#62928)
1want2sleep Mar 25, 2024
e504f06
[PIR] [DynamicShape] Add infer sym op for pd.conv3d pd.randint pd.ass…
zhangbopd Mar 25, 2024
fba58f5
update
feifei-111 Mar 25, 2024
cbf4df9
update
feifei-111 Mar 25, 2024
b28cbe8
【pir】add ir name for save (#62977)
xiaoguoguo626807 Mar 25, 2024
7d9b987
Implement the composition of maximum_double_grad (#62343)
YibinLiu666 Mar 25, 2024
a7d5ea9
[CINN] replace struct Group with OpLoweringGroup in lower_cinn_fusion…
ZelinMa557 Mar 25, 2024
dddc198
fix
2742195759 Mar 25, 2024
f905ff2
【Hackathon 6th No.24】Enhance paddle.quantile/nanquantile functionality -part (#62937)
Asthestarsfalll Mar 25, 2024
d648bc7
support skip_check_meta in eval mode of Pipeline (#63001)
haohongxiang Mar 26, 2024
ee570d3
【Error Message No. 27】paddle/cinn/lang/* (#62973)
shuaihehe Mar 26, 2024
f211563
add dist attribute for mutable attribute. (#62897)
winter-wang Mar 26, 2024
76c4514
update
feifei-111 Mar 26, 2024
49070c6
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
feifei-111 Mar 26, 2024
e2e7d98
update rsqrt in decomp (#62999)
cyber-pioneer Mar 26, 2024
365efb4
support_auto_trigger_cmake (#62994)
xuxinyi389 Mar 26, 2024
b0d1ab1
[PIR+CINN]Fix reshape_op nullptr error (#62956)
yulangz Mar 26, 2024
66a4faa
add to whitelist (#62972)
Eddie-Wang1120 Mar 26, 2024
a2bc7b2
update
feifei-111 Mar 26, 2024
9d03d90
update
feifei-111 Mar 26, 2024
c3f5747
[PIR]Store Python data in Operation (#62750)
YuanRisheng Mar 26, 2024
fec0b3d
[CINN / PIR] Cinn trivalop fuse (#62088)
2742195759 Mar 26, 2024
fc3a764
update
feifei-111 Mar 26, 2024
f5a609c
Implement the composition of pow_double_grad (#62338)
YibinLiu666 Mar 26, 2024
b7514c7
optimize composite_double_backward_api.h (#63011)
HydrogenSulfate Mar 26, 2024
6d998d5
use pow instead of elementiwse_pow (#63009)
HydrogenSulfate Mar 26, 2024
f997af4
add R + R reduce
2742195759 Mar 26, 2024
79830a2
fix
2742195759 Mar 26, 2024
8600cba
fix comment in last pr62897. (#63019)
winter-wang Mar 26, 2024
434d641
fix llama postprocess unittest (#63006)
BiynXu Mar 26, 2024
15aad1f
update
feifei-111 Mar 26, 2024
169afa0
[DRR] Add DataType/DataLayoutAttr interface for ResultPattern and add…
yuanlehome Mar 26, 2024
11ba107
【PIR Dist Op Reg No.15】 reg push_dense (#62505)
enkilee Mar 26, 2024
e882803
[PIR][Inference] Add set_optimization_level api (#62885)
bukejiyu Mar 26, 2024
03d28f8
[Dy2St] Increase `test_resnet_amp` ut time to 360s (#62942)
SigureMo Mar 26, 2024
eb6d7b5
[PIR+CINN]Support multi-thread Pre-Compile for Lowering FusionOp (#62…
Aurelius84 Mar 26, 2024
c188d7f
update
feifei-111 Mar 26, 2024
3788887
fix decomp rule (#63020)
cyber-pioneer Mar 26, 2024
f32ce8b
[Inference] Process instance_norm/layer_norm/group_norm input/output …
yuanlehome Mar 26, 2024
b1f0385
new test (#63003)
6clc Mar 26, 2024
564e10d
cinn(op): fix slice symbolic shape (#62997)
6clc Mar 26, 2024
2ff096e
fix bug of symbol expr for group_op is invalid (#63024)
zyfncg Mar 26, 2024
94b6cf4
fix
2742195759 Mar 26, 2024
84a7446
Fix test_fused_weight_only_linear_pass.py (#63038)
yuanlehome Mar 27, 2024
064a998
bug fix for stride_slice when strides < 0 on XPU (#62923)
zhangyk0314 Mar 27, 2024
fa2f03d
update
feifei-111 Mar 27, 2024
6eaa38b
Fix paddle_gtest_main_new dependency (#62969)
co63oc Mar 27, 2024
b2e114f
[PIR+CINN]Open 17 UT for with_cinn=True (#63031)
Aurelius84 Mar 27, 2024
be3cc76
fix fused_conv2d_add_act cutlass kernel dilations check (#63023)
zhink Mar 27, 2024
a63f17c
[CINN]change full with tensor to expand (#63035)
phlrain Mar 27, 2024
9c0cb6c
[Paddle-trt]Convert add trt build phase operator to trt layer log (#6…
lizexu123 Mar 27, 2024
62088cd
Fix _GENERETOR_ _GENERATOR_ (#63037)
co63oc Mar 27, 2024
7b4aa07
fix
2742195759 Mar 27, 2024
1dff8f8
[CINN]shape inference for logsumexp logcumsumexp linspace logspace mi…
Xinyu302 Mar 27, 2024
0f72f40
update
feifei-111 Mar 27, 2024
e8de414
update
feifei-111 Mar 27, 2024
0ac1d11
【PIR OpTest Fix No.33】fix fused_conv2d_add_act (#63005)
MayYouBeProsperous Mar 27, 2024
a4be483
update
feifei-111 Mar 27, 2024
6e6a853
[CINN] Optimize implement of substituting dim expr for broadcast (#63…
zyfncg Mar 27, 2024
d1714d3
【PIR OpTest Fix No.34】 fix test_rank_attention_op (#62900)
CJ77Qi Mar 27, 2024
bd1726c
Modify Logic of SinkTrivialPattern & Node Removal
Mar 27, 2024
5757630
fix (#62965)
enkilee Mar 27, 2024
43a6734
fix
2742195759 Mar 27, 2024
d20ac18
Merge pull request #78 from Fridge003/cinn
2742195759 Mar 27, 2024
326719e
fix infinite loop in trivial fusion
Mar 27, 2024
2060813
fix some bugs
2742195759 Mar 27, 2024
5e55de4
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
2742195759 Mar 27, 2024
664b32f
block group_cluster library in Cmake (#63045)
Mar 27, 2024
2559f5e
Merge pull request #79 from Fridge003/cinn
2742195759 Mar 27, 2024
c607672
fix
2742195759 Mar 27, 2024
f59b508
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
2742195759 Mar 27, 2024
da85c88
fix
2742195759 Mar 27, 2024
62e90a6
update
feifei-111 Mar 27, 2024
f140f1e
[CINN]add Tril(u)Indices shape inference (#63000)
Xinyu302 Mar 27, 2024
377e829
update pr template (#60652)
XieYunshen Mar 27, 2024
e869ea6
fix compile
feifei-111 Mar 27, 2024
a6c6ef7
[CINN]Try to fix build cinn pass (#63047)
phlrain Mar 27, 2024
230cc6a
update
feifei-111 Mar 27, 2024
62e8395
[backends] fix `error_msg` transfer symbol (#63063)
gouzil Mar 28, 2024
bab4534
fix (#63046)
shuaihehe Mar 28, 2024
48e293a
【Error Message No. 31 Part1】fix `CHECK_*` in `paddle/cinn/runtime/` -…
jinyouzhi Mar 28, 2024
9d8b6be
【Error Message No. 31 Part2】fix CHECK_* in paddle/cinn/utils -part (#…
jinyouzhi Mar 28, 2024
eae0281
fix
2742195759 Mar 28, 2024
5fd3319
update
feifei-111 Mar 28, 2024
a60edc0
merge
2742195759 Mar 28, 2024
6a222e4
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
2742195759 Mar 28, 2024
d5863bf
[XPU] AdamW: fp16 for moment1/moment2 (#62688)
houj04 Mar 28, 2024
43df84d
support inserting broadcast for bitwise_and op in cinn (#63058)
zyfncg Mar 28, 2024
b2ac7e1
fix
2742195759 Mar 28, 2024
1d3f608
update
feifei-111 Mar 28, 2024
ede92ce
fix conf
feifei-111 Mar 28, 2024
e4a42f2
merge
2742195759 Mar 28, 2024
9e4f762
support pir apply optimizer in distributed scenario. (#63052)
winter-wang Mar 28, 2024
94a627a
fix
2742195759 Mar 28, 2024
7139309
optimize kunlun200 ci test (#63066)
risemeup1 Mar 28, 2024
34f1fb0
[Prim] Replace math operations with scale (#62916)
HydrogenSulfate Mar 28, 2024
812e616
[CINN] Add symbol info when print group (#63057)
zyfncg Mar 28, 2024
b339294
[CINN Performance] Add CreateSingleOpFallbackToPhiPass (#63060)
jiahy0825 Mar 28, 2024
54cc5e2
[PIR+CINN]Refactor lower_cinn_fusion_op_pass logic (#63050)
Aurelius84 Mar 28, 2024
b1b0726
[CINN] [Test] Set FLAGS_nvrtc_compile_to_cubin=True (#62588)
jiahy0825 Mar 28, 2024
602d2ba
gn decomp rule supports rank 3 (#63056)
cyber-pioneer Mar 28, 2024
8f0e009
update
feifei-111 Mar 28, 2024
4a2138f
fix
2742195759 Mar 28, 2024
67f5f81
merge
2742195759 Mar 28, 2024
75a3f48
[DRR][Inference] Fix a drr rewrite bug, Adjust the order of basic pas…
yuanlehome Mar 28, 2024
c178bda
[PIR+CINN]Fix pd_to_cinn_pass CombineOp verify problem (#63083)
Aurelius84 Mar 28, 2024
e05764a
support flash attention with sparse mask (#62029)
GuoxiaWang Mar 28, 2024
3431f35
fix pir auto parallel bug in mutable attribue. (#63073)
winter-wang Mar 28, 2024
1993810
[CINN Performance] Adjust Spatial Tile Config (#63086)
jiahy0825 Mar 29, 2024
c6891f0
[CINN] Fix bug of cinn pass order (#63095)
zyfncg Mar 29, 2024
fd92d62
update
feifei-111 Mar 29, 2024
08f56c2
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
feifei-111 Mar 29, 2024
70cc347
[pir+auto parallel] add reshard op for input when needed (#63072)
zhiqiu Mar 29, 2024
0ece064
cinn(op): add tril op (#63027)
6clc Mar 29, 2024
a1f42ca
update
feifei-111 Mar 29, 2024
9bd6996
[XPU][PHI Kernels] fused_rotary_position_embedding optimize (#62846)
HarperCy Mar 29, 2024
d8a4c06
fixfix
2742195759 Mar 29, 2024
9c48d52
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
2742195759 Mar 29, 2024
a5e96c6
support expandop for dynamic shape (#80)
Fridge003 Mar 29, 2024
29d88c2
[cmake] support MacOS arm liblapack (#63093)
gouzil Mar 29, 2024
5d8602d
update
feifei-111 Mar 29, 2024
8213876
Fix totaly totally etc, test=document_fix (#63102)
co63oc Mar 29, 2024
d19f29b
[AutoParallel]Refine ShardOptimizer (#62933)
zhangbo9674 Mar 29, 2024
28920ca
[pir+auto parallel] translate reshard_op into comm and compute op (#6…
zhiqiu Mar 29, 2024
7386a65
logging for axes union map (#81)
Fridge003 Mar 29, 2024
351ed7d
[DRR] Support pd_op.scale/pd_op.slice/builtin.slice creation and fix …
yuanlehome Mar 29, 2024
65fae7c
tile and tile_grad support bf16 for xpu (#63075)
zhangyk0314 Mar 29, 2024
0a2e7b6
add complex support for allgather,diag,eye,gather,lookup_table_v2 (#6…
zbt78 Mar 29, 2024
ed19f42
【complex op No.7】add complex support for Log/log10/log2/log1p (#62448)
zbt78 Mar 29, 2024
0ee3369
update
feifei-111 Mar 29, 2024
50983ec
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
feifei-111 Mar 29, 2024
29b2306
refine pir convert_np_dtype_to_dtype_ (#63085)
wanghuancoder Mar 29, 2024
d8f934a
[PIR][oneDNN] Add matmul_activation_fuse_pass (#62901)
LLee233 Mar 29, 2024
1be75ad
PIR supports XPU devices (#63078)
zhink Mar 29, 2024
f23d41e
【Inference PIR】add add_norm_fuse_pass (#63043)
bukejiyu Mar 29, 2024
b69c3dc
[common][PIR] fix `SimplifyErrorTypeFormat` formatting error (#63106)
gouzil Mar 29, 2024
0240c26
[CINN]Delete duplicate calls for index simplification (#63068)
BiynXu Mar 30, 2024
78c6e2e
fix fix
2742195759 Mar 30, 2024
f6a24c1
merge
2742195759 Mar 30, 2024
94e0bd5
Delete .vim_config.yaml
2742195759 Mar 30, 2024
8f7a35d
Delete test/ir/pir/cinn/inference/test_llama_forward.py
2742195759 Mar 30, 2024
73eabd3
fix ckae
2742195759 Mar 30, 2024
25af6e4
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
2742195759 Mar 30, 2024
4eb09f5
Revert "Delete test/ir/pir/cinn/inference/test_llama_forward.py"
2742195759 Mar 30, 2024
3ee478e
polish code (#63087)
cyber-pioneer Mar 30, 2024
120e9bd
[CodeStyle][ruff] fix v0.3.3 UP032 (#63111)
gouzil Apr 1, 2024
aaceaa5
[CINN]Optimize compilation time (#63123)
BiynXu Apr 1, 2024
980f6f8
[Dy2St][PIR] Replace output with inplace source (#63040)
SigureMo Apr 1, 2024
7628d18
【AutoParallel】Transform BASE strategy between `dist.Strategy` and `fl…
heavyrain-lzy Apr 1, 2024
169f782
Fix Symetric Symmetric (#63139)
co63oc Apr 1, 2024
f6492d5
use by_pass instead of set_output when directly set input arg to outp…
HydrogenSulfate Apr 1, 2024
ccd2d91
[HACKATHON 6th] Fix clang-12 support (#63133)
silverling Apr 1, 2024
60503cf
fix typos (#82)
Fridge003 Apr 1, 2024
2f3d469
[Macro] Increase macro constant MAX_RANK_SUPPORTED (#63061)
HydrogenSulfate Apr 1, 2024
149e543
Support yaml (#63112)
xuxinyi389 Apr 1, 2024
1d18c95
filter generate_shape op at lowering (#84)
Fridge003 Apr 1, 2024
52984e3
improve the performance of divide_double_grad (#62533)
YibinLiu666 Apr 1, 2024
a128eca
Fix Sequential English documentation (#63128)
wufei2 Apr 1, 2024
f280f8e
[Inference] Pir support input/output hook (#63101)
Sunny-bot1 Apr 1, 2024
b55d55b
update
feifei-111 Apr 1, 2024
54e012d
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
feifei-111 Apr 1, 2024
72188ec
support op which numel is less than 32 into cinn (#63076)
zyfncg Apr 1, 2024
c2bf8d4
fix auto backward bug (#63113)
winter-wang Apr 1, 2024
31174be
Revise paddle.summary English documentation (#63121)
smallpoxscattered Apr 1, 2024
30195b9
update
feifei-111 Apr 1, 2024
aed2d92
[PIR] support `matrix_norm` and fix backward redundant cast (#62958)
gouzil Apr 1, 2024
577d694
fix bug of lower group with broadcast branch (#63166)
zyfncg Apr 2, 2024
374fec1
Optimize the performance for fused_linear_param_grad_add when bias an…
Xreki Apr 2, 2024
642e41b
[Prim][PIR] group_norm decomp rule supports rank 3,4,5 and NHWC (#63136)
cyber-pioneer Apr 2, 2024
64bc84d
fix pylayer duplicable tensor input bug (#63155)
wanghuancoder Apr 2, 2024
abbfef3
[common] fix #63106 Incorrect segmentation (#63144)
gouzil Apr 2, 2024
1459312
[PIR][DynamicShape] Add symbolic shape infer for interpolate ops (#63…
zhangbopd Apr 2, 2024
f04e0d2
feat(custom device): enable memory event and stat record for custom d…
zhaohaixu Apr 2, 2024
993e06b
Fix spece spec, etc (#63092)
co63oc Apr 2, 2024
cc20882
Fix unity_build_rule.cmake files (#63147)
co63oc Apr 2, 2024
e3443bb
conv2d(by cutlass) supports tf32 (#63074)
zhink Apr 2, 2024
b032545
change new cluster flag to true
Apr 2, 2024
85c963d
Merge pull request #86 from Fridge003/cinn_tmp
2742195759 Apr 2, 2024
6780a03
【AutoParallel】Fix 'hang' using Sharding in unified model (#63049)
heavyrain-lzy Apr 2, 2024
2f0a842
fix pr63007 (#63170)
yuanlehome Apr 2, 2024
eaffd41
update
feifei-111 Apr 2, 2024
8497826
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
feifei-111 Apr 2, 2024
a8a79b5
update
feifei-111 Apr 2, 2024
7a1c458
update
feifei-111 Apr 2, 2024
24095af
update
feifei-111 Apr 2, 2024
d31573d
Rename operators/mkldnn operators/onednn (#63162)
co63oc Apr 2, 2024
5c15379
cinn(debug): fix tril op (#63169)
6clc Apr 2, 2024
b5a2bfa
update
feifei-111 Apr 2, 2024
33d42a3
rewrite broadcast logic
Apr 2, 2024
c1f5c39
[PIR inference]update add_rms_norm pass (#63154)
bukejiyu Apr 2, 2024
c5f73f6
[fix][dataloader] use file descripor instead of file system (#62696)
xysheng-baidu Apr 2, 2024
92f49a6
[Prim] Add stack_double_grad (#63161)
HydrogenSulfate Apr 2, 2024
5e406e8
[BUG FIX] Fix fused_weight_only_linesr_pass and ut (#63164)
yuanlehome Apr 2, 2024
2c234db
Merge pull request #85 from Fridge003/cinn
2742195759 Apr 2, 2024
09fe854
[Dy2St][PIR] Enable PIR ut `test_container` (#63182)
SigureMo Apr 2, 2024
aa27869
fix
2742195759 Apr 2, 2024
ccac768
support shape compute op into cinn (#63177)
zyfncg Apr 2, 2024
c69071d
update
feifei-111 Apr 2, 2024
f2d6696
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
feifei-111 Apr 2, 2024
5dfe454
support_clang_12 (#63152)
risemeup1 Apr 2, 2024
a1f5cdb
[Dy2St][PIR] Add `restore_out` in PIR `sot_call` (#63190)
SigureMo Apr 2, 2024
18b728d
add infer_symbol_shape for yield_store and remove tricky code of yiel…
zyfncg Apr 3, 2024
47ba282
[CINN]Fix bug of reshape infer symbol shape (#63175)
zyfncg Apr 3, 2024
620880a
[CINN] optimize symbol shape dim expr substitute in lower_cinn_fusion…
ZelinMa557 Apr 3, 2024
402c88d
【Hackathon 6th Fundable Projects No.3】part Remove fluid/operators/amp…
co63oc Apr 3, 2024
01140c7
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
feifei-111 Apr 3, 2024
971ddb4
API improvement for nn.initializer.XavierNormal and nn.initializer.Xa…
NKNaN Apr 3, 2024
c8471a5
[PIR][oneDNN] Add reshape_transpose_matmul_fuse_pass (#62998)
LLee233 Apr 3, 2024
2c0bbd3
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…
2742195759 Apr 3, 2024
52f9890
fix
2742195759 Apr 3, 2024
b996f0f
fix trivalop
2742195759 Apr 3, 2024
9974a2c
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
feifei-111 Apr 3, 2024
1626620
fix third_party
feifei-111 Apr 3, 2024
4050183
fix
2742195759 Apr 3, 2024
03e16d4
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
2742195759 Apr 3, 2024
9662a87
fix
2742195759 Apr 3, 2024
a6dca0d
fix third party
feifei-111 Apr 3, 2024
cde01f6
update
feifei-111 Apr 3, 2024
9a42158
fix
2742195759 Apr 5, 2024
51bbee3
Merge branch 'cinn-trivalop-fuse' of https://github.com/2742195759/Pa…
2742195759 Apr 5, 2024
28 changes: 0 additions & 28 deletions .flake8

This file was deleted.

12 changes: 8 additions & 4 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,9 +1,13 @@
<!-- TemplateReference: https://github.com/PaddlePaddle/Paddle/wiki/PULL-REQUEST-TEMPLATE--REFERENCE -->
<!-- Demo: https://github.com/PaddlePaddle/Paddle/pull/24810 -->
### PR types
<!-- One of [ New features | Bug fixes | Function optimization | Performance optimization | Breaking changes | Others ] -->

### PR changes
<!-- One of [ OPs | APIs | Docs | Others ] -->
### PR Category
<!-- One of [ User Experience | Execute Infrastructure | Operator Mechanism | CINN | Custom Device | Performance Optimization | Distributed Strategy | Parameter Server | Communication Library | Auto Parallel | Inference | Environment Adaptation | Others ] -->


### PR Types
<!-- One of [ New features | Bug fixes | Improvements | Performance | BC Breaking | Deprecations | Docs | Devs | Not User Facing | Security | Deprecations | Others ] -->


### Description
<!-- Describe what you’ve done -->
9 changes: 2 additions & 7 deletions .pre-commit-config.yaml
@@ -36,7 +36,7 @@ repos:
# Exclude some unit test files that require tabs.
exclude: |
(?x)^(
test/dygraph_to_static/test_legacy_error.py
test/dygraph_to_static/test_error.py
)$
- repo: local
hooks:
@@ -56,13 +56,8 @@ repos:
hooks:
- id: black
files: (.*\.(py|pyi|bzl)|BUILD|.*\.BUILD|WORKSPACE)$
- repo: https://github.com/PyCQA/flake8
rev: 5.0.4
hooks:
- id: flake8
args: ["--config=.flake8"]
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.2.0
rev: v0.3.0
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix, --no-cache]
32 changes: 16 additions & 16 deletions CMakeLists.txt
@@ -63,9 +63,11 @@ option(WITH_ONNXRUNTIME "Compile PaddlePaddle with ONNXRUNTIME" OFF)
option(WITH_CUSPARSELT "Compile PaddlePaddle with CUSPARSELT" OFF)
option(WITH_SETUP_INSTALL "Compile PaddlePaddle with setup.py" OFF)
option(WITH_SHARED_PHI "Compile PaddlePaddle with SHARED LIB of PHI" ON)
option(CINN_ONLY "Compile CINN only in Paddle" OFF)
option(CINN_WITH_CUDNN "Compile CINN with CUDNN support" ON)

option(WITH_PIP_CUDA_LIBRARIES
"Paddle uses the CUDA library provided by NVIDIA" OFF)
option(WITH_NIGHTLY_BUILD
"Compile nightly paddle whl package of the develop branch" OFF)
find_package(Git REQUIRED)

# config GIT_URL with github mirrors to speed up dependent repos clone
@@ -97,11 +99,16 @@ endif()

if(WITH_GPU AND NOT APPLE)
#(Note risemeup1): The cudart dynamic library libcudart.so is used by set CUDA_USE_STATIC_CUDA_RUNTIME and CMAKE_CUDA_FLAGS
if(LINUX)
if(CMAKE_SYSTEM_NAME STREQUAL "Linux" AND CMAKE_SYSTEM_PROCESSOR STREQUAL
"x86_64")
set(CUDA_USE_STATIC_CUDA_RUNTIME
OFF
CACHE BOOL "" FORCE)
set(CMAKE_CUDA_FLAGS "--cudart shared")
if(WITH_PIP_CUDA_LIBRARIES)
#(Note risemeup1): Flag 'WITH_PIP_CUDA_LIBRARIES' will be used in dynamic_loader.cc to search for CUDA-related .so files through the Python libraries provided by NVIDIA.
add_definitions(-DWITH_PIP_CUDA_LIBRARIES)
endif()
endif()
enable_language(CUDA)
message(STATUS "CUDA compiler: ${CMAKE_CUDA_COMPILER}, version: "
@@ -135,7 +142,10 @@ endif()
if(WIN32)
option(MSVC_STATIC_CRT "use static C Runtime library by default" ON)
message("Build static library of PHI")
set(CMAKE_SUPPRESS_REGENERATION ON)
# (Note xuxinyi04): If CMAKE_SUPPRESS_REGENERATION is OFF, which is default, then CMake adds a
# special target on which all other targets depend that checks the build system and optionally
# re-runs CMake to regenerate the build system when the target specification source changes.
set(CMAKE_SUPPRESS_REGENERATION OFF)
set(CMAKE_STATIC_LIBRARY_PREFIX lib)
set(WITH_SHARED_PHI
OFF
@@ -233,6 +243,8 @@ if(WIN32)
"${${flag_var}} /ignore:4049 /ignore:4217 /ignore:4006 /ignore:4221")
if(MSVC_STATIC_CRT)
set(${flag_var} "${${flag_var}} /NODEFAULTLIB:MSVCRT.LIB")
else()
set(${flag_var} "${${flag_var}} /NODEFAULTLIB:LIBCMT.LIB")
endif()
endforeach()

@@ -618,18 +630,6 @@ if(WITH_CINN)

include(cmake/cinn.cmake)
add_definitions(-DPADDLE_WITH_CINN)

if(CINN_ONLY)
add_definitions(-DCINN_WITH_ONLY)
if(WITH_PYTHON)
add_subdirectory(python)
endif()
add_subdirectory(test)
if(NOT WITH_GFLAGS)
add_subdirectory(paddle/utils)
endif()
return()
endif()
endif()

#------------- cinn cmake config end --------------
5 changes: 3 additions & 2 deletions cmake/ccache.cmake
@@ -11,8 +11,9 @@ if(NOT WIN32)
# show statistics summary of ccache
message("ccache version\t\t\t " ${ccache_version} "\n"
${cache_directory})
set_property(GLOBAL PROPERTY RULE_LAUNCH_COMPILE ${CCACHE_PATH})
set_property(GLOBAL PROPERTY RULE_LAUNCH_LINK ${CCACHE_PATH})
set(CMAKE_C_COMPILER_LAUNCHER ${CCACHE_PATH})
set(CMAKE_CXX_COMPILER_LAUNCHER ${CCACHE_PATH})
set(CMAKE_CUDA_COMPILER_LAUNCHER ${CCACHE_PATH})
endif()
elseif("${CMAKE_GENERATOR}" STREQUAL "Ninja")
# (Note:zhouwei25) Only Ninja Generator can support sccache now
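The ccache.cmake hunk above replaces the global `RULE_LAUNCH_COMPILE`/`RULE_LAUNCH_LINK` properties with the per-language `CMAKE_<LANG>_COMPILER_LAUNCHER` variables. A minimal, self-contained sketch of that pattern (the `find_program` lookup is illustrative and not part of the diff):

```cmake
# Minimal sketch: wire a compiler cache into CMake via the
# per-language launcher variables, the approach this diff adopts.
find_program(CCACHE_PATH ccache)
if(CCACHE_PATH)
  # Preferred over the global RULE_LAUNCH_COMPILE property: the
  # launcher variables are honored by the Makefile and Ninja
  # generators and can be set independently per language.
  set(CMAKE_C_COMPILER_LAUNCHER ${CCACHE_PATH})
  set(CMAKE_CXX_COMPILER_LAUNCHER ${CCACHE_PATH})
  set(CMAKE_CUDA_COMPILER_LAUNCHER ${CCACHE_PATH})
endif()
```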
40 changes: 16 additions & 24 deletions cmake/cinn.cmake
@@ -164,13 +164,13 @@ cinn_cc_library(
isl
ginac
pybind
group_cluster
cinn_op_dialect
${jitify_deps})
add_dependencies(cinnapi GEN_LLVM_RUNTIME_IR_HEADER ZLIB::ZLIB)
add_dependencies(cinnapi GEN_LLVM_RUNTIME_IR_HEADER ${core_deps})
if(NOT CINN_ONLY)
target_link_libraries(cinnapi op_dialect pir phi)
add_dependencies(cinnapi op_dialect pir phi)
endif()
target_link_libraries(cinnapi op_dialect pir phi)
add_dependencies(cinnapi op_dialect pir phi)

target_link_libraries(cinnapi ${PYTHON_LIBRARIES})

@@ -183,11 +183,6 @@ if(WITH_MKL)
endif()
endif()

if(CINN_ONLY)
target_link_libraries(cinnapi common)
add_dependencies(cinnapi common)
endif()

if(WITH_GPU)
target_link_libraries(
cinnapi
@@ -227,15 +222,17 @@ function(gen_cinncore LINKTYPE)
schedule_desc_proto
absl
isl
ginac)
ginac
pybind
group_cluster
cinn_op_dialect
${jitify_deps})
add_dependencies(${CINNCORE_TARGET} GEN_LLVM_RUNTIME_IR_HEADER ZLIB::ZLIB)
add_dependencies(${CINNCORE_TARGET} GEN_LLVM_RUNTIME_IR_HEADER ${core_deps})
if(NOT CINN_ONLY)
target_link_libraries(${CINNCORE_TARGET} op_dialect pir phi)
add_dependencies(${CINNCORE_TARGET} op_dialect pir phi)
endif()
target_link_libraries(${CINNCORE_TARGET} op_dialect pir phi)
add_dependencies(${CINNCORE_TARGET} op_dialect pir phi)

add_dependencies(${CINNCORE_TARGET} pybind)
# add_dependencies(${CINNCORE_TARGET} pybind)
target_link_libraries(${CINNCORE_TARGET} ${PYTHON_LIBRARIES})

if(WITH_MKL)
@@ -247,11 +244,6 @@ function(gen_cinncore LINKTYPE)
endif()
endif()

if(CINN_ONLY)
target_link_libraries(${CINNCORE_TARGET} common)
add_dependencies(${CINNCORE_TARGET} common)
endif()

if(WITH_GPU)
target_link_libraries(
${CINNCORE_TARGET}
@@ -261,16 +253,16 @@
${CUBLAS}
${CUDNN}
${CURAND}
${CUSOLVER}
${jitify_deps})
${CUSOLVER})
# ${jitify_deps})
if(NVTX_FOUND)
target_link_libraries(${CINNCORE_TARGET} ${CUDA_NVTX_LIB})
endif()
endif()

if(WITH_CUTLASS)
target_link_libraries(cinnapi cutlass)
add_dependencies(cinnapi cutlass)
target_link_libraries(${CINNCORE_TARGET} cutlass)
add_dependencies(${CINNCORE_TARGET} cutlass)
endif()
endfunction()

4 changes: 2 additions & 2 deletions cmake/coveralls.cmake
@@ -60,8 +60,8 @@ endfunction()

if(WITH_COVERAGE)
if(WITH_INCREMENTAL_COVERAGE)
# if *.h changed, generate coverage report totaly.
# if pybind.cc changed, generate coverage report totaly.
# if *.h changed, generate coverage report totally.
# if pybind.cc changed, generate coverage report totally.
# Because if pybind.cc add '-g -O0 -fprofile-arcs -ftest-coverage' only, some testcase will fail.
if((NOT ("$ENV{PADDLE_GIT_DIFF_H_FILE}" STREQUAL ""))
OR ("$ENV{PADDLE_GIT_DIFF_CC_FILE}" MATCHES "pybind.cc"))
2 changes: 1 addition & 1 deletion cmake/coverallsGcovJsons.cmake
@@ -248,7 +248,7 @@ foreach(GCOV_FILE ${GCOV_FILES})
# Instead of trying to parse the source from the
# gcov file, simply read the file contents from the source file.
# (Parsing it from the gcov is hard because C-code uses ; in many places
# which also happens to be the same as the CMake list delimeter).
# which also happens to be the same as the CMake list delimiter).
file(READ ${GCOV_SRC_PATH} GCOV_FILE_SOURCE)

string(REPLACE "\\" "\\\\" GCOV_FILE_SOURCE "${GCOV_FILE_SOURCE}")
2 changes: 1 addition & 1 deletion cmake/cuda.cmake
@@ -294,7 +294,7 @@ select_nvcc_arch_flags(NVCC_FLAGS_EXTRA NVCC_ARCH_BIN)
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} ${NVCC_FLAGS_EXTRA}")
message(STATUS "NVCC_FLAGS_EXTRA: ${NVCC_FLAGS_EXTRA}")

# Set C++14 support
# Set C++17 support
set(CUDA_PROPAGATE_HOST_FLAGS OFF)
# Release/Debug flags set by cmake. Such as -O3 -g -DNDEBUG etc.
# So, don't set these flags here.
2 changes: 1 addition & 1 deletion cmake/experiments/cuda_module_loading_lazy.cmake
@@ -13,7 +13,7 @@
# limitations under the License.

# this file contains experimental build options for lazy cuda module loading
# cuda moduel lazy loading is supported by CUDA 11.7+
# cuda module lazy loading is supported by CUDA 11.7+
# this experiment option makes Paddle supports lazy loading before CUDA 11.7.

if(LINUX)
46 changes: 35 additions & 11 deletions cmake/phi_header.cmake → cmake/export_paddle_header.cmake
@@ -15,33 +15,57 @@
set(PADDLE_INFERENCE_INSTALL_DIR
"${CMAKE_BINARY_DIR}/paddle_inference_install_dir")

function(phi_header_path_compat TARGET_PATH)
message(STATUS "phi header path compat processing: ${TARGET_PATH}")
function(header_path_compat TARGET_PATH)
message(STATUS "header path compat processing: ${TARGET_PATH}")
file(GLOB HEADERS "${TARGET_PATH}/*" "*.h")
foreach(header ${HEADERS})
if(${header} MATCHES ".*.h$")
file(READ ${header} HEADER_CONTENT)
string(REPLACE "paddle/fluid/platform/" "paddle/phi/" HEADER_CONTENT
"${HEADER_CONTENT}")
string(REPLACE "paddle/pir/include/" "paddle/pir/" HEADER_CONTENT
"${HEADER_CONTENT}")
string(REPLACE "paddle/fluid/pir/drr/include/" "paddle/pir/drr/"
HEADER_CONTENT "${HEADER_CONTENT}")
string(REPLACE "paddle/fluid/pir/utils/" "paddle/pir/utils/"
HEADER_CONTENT "${HEADER_CONTENT}")
file(WRITE ${header} "${HEADER_CONTENT}")
message(STATUS "phi header path compat processing complete: ${header}")
message(STATUS "header path compat processing complete: ${header}")
endif()
endforeach()
endfunction()

phi_header_path_compat(${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle)
phi_header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/phi)
phi_header_path_compat(
header_path_compat(${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle)
header_path_compat(${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/phi)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/phi/api)
phi_header_path_compat(
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/phi/api/ext)
phi_header_path_compat(
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/phi/api/include)
phi_header_path_compat(
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/phi/common)
phi_header_path_compat(
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/phi/core)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/pir/core)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/pir/core/parser)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/pir/dialect/control_flow/ir
)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/pir/dialect/shape/ir)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/pir/dialect/shape/utils)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/pir/drr)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/pir/pass)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/pir/pattern_rewrite)
header_path_compat(
${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/pir/utils)

# NOTE(liuyuanle): In inference lib, no need include paddle/utils/pybind.h, so we delete this.
file(READ ${PADDLE_INFERENCE_INSTALL_DIR}/paddle/include/paddle/extension.h
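Pulled out of its diff context, the renamed `header_path_compat` helper is a glob-and-rewrite loop over installed headers. A reduced, standalone sketch of the mechanism (only two of the replace rules are shown; the install-directory path in the usage line is illustrative):

```cmake
# Reduced sketch of the header_path_compat pattern: glob headers
# under a directory and rewrite internal source-tree include
# prefixes to match the installed layout.
function(header_path_compat TARGET_PATH)
  file(GLOB HEADERS "${TARGET_PATH}/*.h")
  foreach(header ${HEADERS})
    file(READ ${header} HEADER_CONTENT)
    # Map fluid/platform and pir/include prefixes onto the
    # flattened paths shipped in the inference package.
    string(REPLACE "paddle/fluid/platform/" "paddle/phi/" HEADER_CONTENT
           "${HEADER_CONTENT}")
    string(REPLACE "paddle/pir/include/" "paddle/pir/" HEADER_CONTENT
           "${HEADER_CONTENT}")
    file(WRITE ${header} "${HEADER_CONTENT}")
  endforeach()
endfunction()

# Example invocation (hypothetical install prefix):
# header_path_compat("${CMAKE_BINARY_DIR}/install/include/paddle")
```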
6 changes: 6 additions & 0 deletions cmake/external/cccl.cmake
@@ -15,12 +15,18 @@ set(CCCL_INCLUDE_DIR ${CCCL_SOURCE_DIR})
message("CCCL_INCLUDE_DIR is ${CCCL_INCLUDE_DIR}")
include_directories(${CCCL_INCLUDE_DIR})

file(TO_NATIVE_PATH ${PADDLE_SOURCE_DIR}/patches/cccl/util_device.cuh.patch
native_src)
set(CCCL_PATCH_COMMAND git checkout -- . && git checkout ${CCCL_TAG} && patch
-p1 -Nd ${CCCL_SOURCE_DIR} < ${native_src})

ExternalProject_Add(
extern_cccl
${EXTERNAL_PROJECT_LOG_ARGS}
SOURCE_DIR ${CCCL_SOURCE_DIR}
PREFIX ${CCCL_PREFIX_DIR}
UPDATE_COMMAND ""
PATCH_COMMAND ${CCCL_PATCH_COMMAND}
CONFIGURE_COMMAND ""
BUILD_COMMAND ""
INSTALL_COMMAND ""
4 changes: 3 additions & 1 deletion cmake/external/dirent.cmake
@@ -27,7 +27,9 @@ if((NOT DEFINED DIRENT_NAME) OR (NOT DEFINED DIRENT_URL))
set(DIRENT_URL
"${GIT_URL}/tronkko/dirent/archive/refs/tags/1.23.2.tar.gz"
CACHE STRING "" FORCE)
set(DIRENT_CACHE_FILENAME "1.23.2.tar.gz")
set(DIRENT_CACHE_FILENAME
"1.23.2.tar.gz"
CACHE STRING "" FORCE)
endif()

message(STATUS "DIRENT_NAME: ${DIRENT_NAME}, DIRENT_URL: ${DIRENT_URL}")