Update Output Comparison for pytest #146
Conversation
@@ -28,6 +28,7 @@
 *.safetensors
 *.gguf
 *.vmfb
+iree_tests/onnx/node/generated/test_*/iree_output_*.npy
The `iree_tests/` folder has its own `.gitignore` file to keep the top-level file cleaner: https://github.com/nod-ai/SHARK-TestSuite/blob/main/iree_tests/.gitignore
        # Requires: from numpy import load
        #           from numpy.testing import assert_allclose

        # TODO: add support for comparison of dtypes not supported by numpy, using iree-run-module
        # numerical error
        self.test_numerical_accuracy()

    def test_numerical_accuracy(self):
        num_iree_output_files = len(list(self.test_cwd.glob("iree_output_*.npy")))
        num_output_files = len(list(self.test_cwd.glob("output_*.npy")))
        if num_iree_output_files != num_output_files:
            raise AssertionError(
                f"Number of golden outputs ({num_output_files}) and IREE outputs "
                f"({num_iree_output_files}) don't match"
            )

        for i in range(num_output_files):
            iree_output = load(self.test_cwd / f"iree_output_{i}.npy")
            golden_output = load(self.test_cwd / f"output_{i}.npy")
            assert_allclose(iree_output, golden_output, atol=self.atol, rtol=self.rtol, equal_nan=False)
I'd like to continue using `--expected_output`, at least until we can prove that comparison in C++ is no longer sufficient.
- We have data types that numpy does not support. For those we can use binary files instead of .npy.

This style of testing aims to have as thin a test runner as possible, leaning mostly on the native tools themselves. Right now all the test runner does is
- discover test cases
- run native tools (`iree-compile`, `iree-run-module`) with flags
- check the return codes
By having a thin test runner with a narrow set of responsibilities, other test runner implementations are possible and results are easier to reproduce outside of the test environment.
- Someone could write a ctest (or Bazel, or something else) test runner that uses these tools. Those runners may not have as direct access to Python utilities like numpy. We should also be able to run tests on systems without Python (Android, the web, etc.).
- Test case reproducers are just commands to run. This change makes that more complicated: "to reproduce, run `iree-run-module ...` then run this numpy code".
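The thin-runner flow described above can be sketched roughly as follows. This is an illustration, not the suite's actual runner: the input/output file names and the exact flags are placeholders, since real test cases carry their own flagfiles.

```python
import subprocess
from pathlib import Path

def run_test_case(test_dir: Path) -> None:
    """Sketch of a thin test runner: invoke the native tools and check
    only their return codes. File names and flags are placeholders."""
    # 1. Compile the test case with iree-compile.
    compile_cmd = [
        "iree-compile",
        str(test_dir / "model.mlir"),
        "-o", str(test_dir / "model.vmfb"),
    ]
    # 2. Run it with iree-run-module; output comparison happens inside
    #    the tool itself (e.g. via --expected_output), not in Python.
    run_cmd = ["iree-run-module", f"--module={test_dir / 'model.vmfb'}"]
    for cmd in (compile_cmd, run_cmd):
        # 3. The only check the runner makes is the process return code.
        result = subprocess.run(cmd)
        if result.returncode != 0:
            raise AssertionError(f"Command failed: {' '.join(cmd)}")
```

Because each step is just a command invocation, the reproducer for a failure is the command line itself.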
How about we first see if we can modify https://github.com/openxla/iree/blob/main/runtime/src/iree/tooling/comparison.cc to be more permissive with numpy data type mismatches, or switch the expected outputs from numpy to binary files?
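If the suite did switch expected outputs from .npy to raw binary files, the comparison could look roughly like the sketch below. The `.bin` naming and the dtype-per-file convention are assumptions, not an agreed format.

```python
import numpy as np

def compare_binary_output(actual_path, expected_path, dtype,
                          atol=1e-5, rtol=1e-6):
    """Compare two raw binary outputs as flat arrays of a given dtype.

    Raw binary files carry no shape/dtype metadata (unlike .npy), so the
    dtype must come from the test case itself, e.g. from its flagfile.
    """
    actual = np.fromfile(actual_path, dtype=dtype)
    expected = np.fromfile(expected_path, dtype=dtype)
    np.testing.assert_allclose(actual, expected, atol=atol, rtol=rtol,
                               equal_nan=False)
```

For dtypes numpy lacks (e.g. bfloat16), the raw bytes could still be compared bit-exactly as `uint16`, though a tolerance-based comparison would need an upcast first.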
        self.atol = 1e-05
        self.rtol = 1e-06
We can put comparison thresholds in the flagfiles themselves, rather than make that a property of the test runner.
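One way a runner could pick per-test thresholds up from a flagfile is sketched below. The `--atol=`/`--rtol=` flag names are hypothetical; the actual flagfile schema would need to be agreed on.

```python
from pathlib import Path

def read_thresholds(flagfile: Path, default_atol=1e-5, default_rtol=1e-6):
    """Parse hypothetical --atol=/--rtol= entries out of a test flagfile,
    falling back to runner-wide defaults when a test does not set them."""
    atol, rtol = default_atol, default_rtol
    for line in flagfile.read_text().splitlines():
        line = line.strip()
        if line.startswith("--atol="):
            atol = float(line.split("=", 1)[1])
        elif line.startswith("--rtol="):
            rtol = float(line.split("=", 1)[1])
    return atol, rtol
```

This keeps tolerances next to the test case they describe, so a ctest or Bazel runner reading the same flagfile would apply the same thresholds.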
Alternate approach to keep using expected output added here: #212
Closing in favor of #212
Progress on iree-org/iree#16674