Is there a concrete evaluation script for the MSAgent-Bench benchmark? #502

YinSonglin1997 · 2024-06-24T07:04:12Z

Initial Checks

I have searched GitHub for a duplicate issue and I'm sure this is something new
I have read and followed the docs & demos and still think this is a bug
I am confident that the issue is with modelscope-agent (not my code, or another library in the ecosystem)

What happened + What you expected to happen

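The MSAgent-Bench Chinese Agent dataset page on ModelScope says the evaluation covers four dimensions:
1. Plugin call accuracy: whether the value following api_name is identified correctly
2. Plugin URL accuracy: whether the url address is correct
3. Plugin parameter accuracy: whether the arguments under parameters are correct
4. Overall plugin accuracy: whether the generated function call is completely correct, with the whole JSON in a format that can be loaded
How exactly are these implemented? Is there a script I can use as a reference?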
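
For illustration only, the four dimensions above could be computed roughly as follows. This is a minimal sketch, not the official MSAgent-Bench script: it assumes each prediction and each reference is a JSON string with top-level keys `api_name`, `url`, and `parameters`, that correctness means exact match, and that "overall" means the prediction loads as JSON and all three fields match. The names `parse_call`, `score_pair`, and `evaluate` are hypothetical.

```python
import json
from typing import Optional

FIELDS = ("api_name", "url", "parameters")


def parse_call(text: str) -> Optional[dict]:
    """Return the call as a dict, or None if it is not a loadable JSON object."""
    try:
        obj = json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return None
    return obj if isinstance(obj, dict) else None


def score_pair(pred_text: str, ref_text: str) -> dict:
    """Score one prediction against its reference on the four dimensions."""
    ref = parse_call(ref_text)
    if ref is None:
        raise ValueError("reference call must be a valid JSON object")
    pred = parse_call(pred_text)
    if pred is None:
        # Unloadable output counts as wrong on every dimension (assumption).
        return {name: False for name in FIELDS + ("overall",)}
    # Exact-match comparison per field (assumption); dict equality ignores key order.
    scores = {name: name in pred and pred[name] == ref.get(name) for name in FIELDS}
    scores["overall"] = all(scores.values())
    return scores


def evaluate(pairs: list) -> dict:
    """Average each dimension over a list of (prediction, reference) string pairs."""
    totals = {name: 0 for name in FIELDS + ("overall",)}
    for pred_text, ref_text in pairs:
        for name, ok in score_pair(pred_text, ref_text).items():
            totals[name] += int(ok)
    n = len(pairs)
    return {name: (count / n if n else 0.0) for name, count in totals.items()}
```

Calling `evaluate` with a list of `(prediction, reference)` string pairs returns one accuracy per dimension. Exact match is the strictest reading of the dataset description; a looser parameter comparison (for example, ignoring whitespace in values) would need to be substituted if that is what the benchmark intends.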

Versions / Dependencies

modelscope 1.12.0
modelscope-agent 0.4.1
modelscope_studio 0.3.0
Python 3.10.14

Reproduction script

Issue Severity

None

YinSonglin1997 added the bug label Jun 24, 2024

YinSonglin1997 assigned suluyana and zzhangpurdue Jun 24, 2024

zzhangpurdue assigned lcl6679292 and unassigned zzhangpurdue and suluyana Jun 24, 2024
