-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
benchdnn: graph: extend --dt to support specifying tensor id #2331
Conversation
086c6dd
to
ab80ed6
Compare
ab80ed6
to
539767a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few comments, please incorporate as you see fit, thanks!
539767a
to
143ce80
Compare
Thank you @ranukund . Suggestions are incorporated now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thank you!
make test |
3e42fa0
to
7d3f6c3
Compare
tests/benchdnn/graph/bench_graph.cpp
Outdated
BENCHDNN_PRINT(0, "%s\n", | ||
"Error: --dt is specified twice with different styles."); | ||
SAFE_V(FAIL); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move it into parse_dt
since it has both objects and can check their state to figure out if both options were passed. Afterall, it's a parser obligation to make sure the input is valid (when it can).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it's moved to the parser function now.
2fd06e0
to
4ee753e
Compare
rewrite sdpa-compressed-kv-int4-gs32.json for it.
4ee753e
to
575e29e
Compare
make test |
We introduced
--dt
in 461388e to support floating-point data type rewrite for the whole graph.In this PR, the knob is further extended to support data type rewrite for specified tensor IDs, in a format of
--dt=ID0:DT0+ID1:DT1...
. It provides more flexibility to test a json file with different data types, especially for the various zps/scales data types for quantizations which were excluded previously in whole graph data type rewrite. Due to the flexibility of this knob and the restriction of op definition in the library, users need to be cautious when rewriting the data type of a specific tensor in a graph as it may lead to graph construction failures in the library during testing (eg. incompatible input and output tensor data types).An example as below:
benchdnn --graph --dt=0:s8+2:s8+6:s8+8:s8 --case=complex_fusion/mha/sdpa-compressed-kv-int4-gs32.json
It can be used to test int8 compressed KV graph by rewriting rewrite the int4 compressed KV graph (0 is the key tensor, 2 is the zps pf key tensor, 6 is the value tensor, and 8 is the zps of values tensor).