Is there a concrete evaluation script for the MSAgent-Bench benchmark? #502

Open
3 tasks done
YinSonglin1997 opened this issue Jun 24, 2024 · 0 comments
Labels
bug Something isn't working

Initial Checks

  • I have searched GitHub for a duplicate issue and I'm sure this is something new
  • I have read and followed the docs & demos and still think this is a bug
  • I am confident that the issue is with modelscope-agent (not my code, or another library in the ecosystem)

What happened + What you expected to happen

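The MSAgent-Bench Chinese Agent dataset page on ModelScope says that the experimental evaluation covers four dimensions:
1. Plugin call accuracy: whether the value after api_name is identified correctly
2. Plugin URL accuracy: whether the url address is correct
3. Plugin parameter accuracy: whether the arguments under parameters are correct
4. Overall plugin accuracy: whether the generated function call is completely correct and the whole JSON can be loaded
How exactly are these metrics implemented? Is there a script I can use as a reference?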
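
To make the question concrete, here is a minimal sketch of how I imagine the four metrics could be computed. This is not an official script. It assumes each prediction is a JSON string with the top-level keys `api_name`, `url`, and `parameters`, and that correctness means exact equality with the reference. The function names `score_call` and `evaluate` are hypothetical.

```python
import json

METRICS = ("api_name", "url", "parameters", "overall")


def score_call(pred_text, gold):
    """Compare one predicted function call (JSON string) with a gold dict.

    Assumption: exact match on each field counts as correct.
    """
    try:
        pred = json.loads(pred_text)
    except (json.JSONDecodeError, TypeError):
        pred = None
    # Output that is not loadable JSON (or not an object) fails every metric.
    if not isinstance(pred, dict):
        return {m: False for m in METRICS}

    name_ok = pred.get("api_name") == gold.get("api_name")
    url_ok = pred.get("url") == gold.get("url")
    params_ok = pred.get("parameters") == gold.get("parameters")
    return {
        "api_name": name_ok,
        "url": url_ok,
        "parameters": params_ok,
        # Overall: the JSON loads and all three fields match.
        "overall": name_ok and url_ok and params_ok,
    }


def evaluate(samples):
    """samples: list of (pred_text, gold_dict) pairs -> accuracy per metric."""
    totals = {m: 0 for m in METRICS}
    for pred_text, gold in samples:
        for metric, ok in score_call(pred_text, gold).items():
            totals[metric] += int(ok)
    n = len(samples)
    return {m: (totals[m] / n if n else 0.0) for m in METRICS}


if __name__ == "__main__":
    # Hypothetical example data, not taken from the dataset.
    gold = {
        "api_name": "weather",
        "url": "http://example.com/weather",
        "parameters": {"city": "Beijing"},
    }
    samples = [
        (json.dumps(gold), gold),
        ("not valid json", gold),
        (json.dumps({**gold, "parameters": {"city": "Shanghai"}}), gold),
    ]
    print(evaluate(samples))
```

If the official evaluation instead matches `api_name` by string search or scores `parameters` key by key, the comparison lines in `score_call` would need to change accordingly.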

Versions / Dependencies

modelscope 1.12.0
modelscope-agent 0.4.1
modelscope_studio 0.3.0
Python 3.10.14

Reproduction script

Issue Severity

None

YinSonglin1997 added the bug (Something isn't working) label on Jun 24, 2024