【LLM Inference-0】Add Split MoE Op && Add Group MoE #69687
base: develop
Conversation
Your PR was submitted successfully. Thank you for your contribution to this open source project!
PR Category
Inference
PR Types
Others
Description
card-71500
1. Add split MoE operators, which make precision alignment easier; the grouped softmax operation is merged into a single moe_dispatch operator, and precision has been verified to align.
2. Add support for the group_moe feature and a new group_moe interface (see the gating sketch after this list).
3. Add precision unit tests for the split operators under group MoE.
4. Add unit tests comparing the split operators against the previous baseline under different precisions.
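
For context, grouped-softmax gating normalizes router scores within each expert group before experts are selected and tokens are dispatched. The snippet below is a minimal NumPy sketch of that general technique only; it assumes experts are split evenly into `num_groups` groups and each token is routed to `top_k` experts. It is not the implementation of the moe_dispatch or group_moe operators added in this PR, whose actual signatures are not shown here.

```python
# Minimal NumPy sketch of grouped-softmax gating for group MoE routing.
# NOTE: illustrative only; `num_groups` and `top_k` are assumed parameters,
# not the real operator signature from this PR.
import numpy as np

def group_softmax_topk(gate_logits, num_groups, top_k):
    """Softmax within each expert group, then pick top-k experts per token.

    gate_logits: [num_tokens, num_experts] raw router scores.
    num_groups:  experts are split evenly into this many groups.
    top_k:       number of experts each token is routed to.
    """
    num_tokens, num_experts = gate_logits.shape
    assert num_experts % num_groups == 0
    group_size = num_experts // num_groups

    # Softmax normalized independently inside each group of experts.
    grouped = gate_logits.reshape(num_tokens, num_groups, group_size)
    grouped = grouped - grouped.max(axis=-1, keepdims=True)
    probs = np.exp(grouped)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    probs = probs.reshape(num_tokens, num_experts)

    # Top-k expert ids and their group-normalized weights per token.
    topk_ids = np.argsort(-probs, axis=-1)[:, :top_k]
    topk_weights = np.take_along_axis(probs, topk_ids, axis=-1)
    return topk_ids, topk_weights

# Example: 4 tokens, 8 experts in 2 groups, each token routed to 2 experts.
logits = np.random.randn(4, 8).astype(np.float32)
ids, weights = group_softmax_topk(logits, num_groups=2, top_k=2)
print(ids.shape, weights.shape)  # (4, 2) (4, 2)
```

Folding this normalization into the dispatch step (as the PR's merged moe_dispatch operator reportedly does) avoids a separate grouped-softmax kernel launch and gives a single point against which precision can be compared with the baseline.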