Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement baselines as a fixture and with simple rebase support #1732

Merged
merged 10 commits into from
Feb 7, 2025

Conversation

uartie
Copy link
Contributor

@uartie uartie commented Jan 29, 2025

What does this PR do?

Implement a Baseline fixture to manage test case baselines via reference files instead of directly in the test code.

This allows for easier reference rebasing by simply passing the custom pytest option "--rebase" on the command-line to update the reference with the current test result.

It will also enable us to remove "baseline" from the test case signatures/names so they don't generate new signatures every time the baseline is updated or due to floating point precision differences from run-to-run.

A static/fixed test case signature ensures we can track the historical results of the test case in various tools/CI.


This is the first pull-request of multiple series to refactor how baselines are managed. This PR is meant to specifically address the test cases that used "baseline" in the test case signature, first. Future series will focus on moving the hard-coded baseline constants that exist in code in other test files.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@uartie uartie requested a review from regisss as a code owner January 29, 2025 19:51
@uartie uartie force-pushed the baseline-fixture branch 3 times, most recently from 2f7d969 to ab608b7 Compare January 30, 2025 16:28
@jiminha
Copy link
Collaborator

jiminha commented Jan 30, 2025

@uartie This looks great! I like that the baseline is separated from the test script itself, and the --rebase option seems really convenient for updating the baseline. I noticed that test_examples.py already uses baselines/*.json. Would your code be able to walkthrough those as well?

@jiminha
Copy link
Collaborator

jiminha commented Jan 30, 2025

@regisss We are working on upgrading our test framework to enable further automation. This is one of the first one, and more changes will come. Could you review this?

@uartie
Copy link
Contributor Author

uartie commented Jan 30, 2025

@uartie This looks great! I like that the baseline is separated from the test script itself, and the --rebase option seems really convenient for updating the baseline. I noticed that test_examples.py already uses baselines/*.json. Would your code be able to walkthrough those as well?

The test_examples.py baseline json files are a combination of test configuration parameters and test result references. I will explore ways to separate and integrate those with the new baseline fixture in a future patch series.

The immediate goal, in this series, is to fix test case signature/name variability via the new baseline fixture support. This will ensure we can manage separate test case manifest files, externally, easier.

@uartie uartie force-pushed the baseline-fixture branch 2 times, most recently from fa89345 to 26f194b Compare February 3, 2025 22:17
@jiminha jiminha requested review from libinta and jiminha February 4, 2025 21:42
@libinta
Copy link
Collaborator

libinta commented Feb 5, 2025

@uartie can you provide the test log with this change?

Fixture parameters in test cases are not emitted in the
test signature (i.e. nodeid).

Define the token as a fixture so that it does not generate
an arbitrary representation in the test signature.

Signed-off-by: U. Artie Eoff <[email protected]>
Implement a Baseline fixture to manage test case baselines
via reference files instead of directly in the test code.

This allows for easier reference rebasing by simply passing
the custom pytest option "--rebase" on the command-line to
replace the reference with the current test result.

It will also enable us to remove "baseline" from the test case
signatures (i.e. nodeid) so they don't generate new signatures
every time the baseline is updated or due to floating point
precision differences from run-to-run.

A static/fixed test case signature ensures we can track
the historical results of the test case in various tools/CI.

Signed-off-by: U. Artie Eoff <[email protected]>
Use the new baseline fixture to validate test results.

Signed-off-by: U. Artie Eoff <[email protected]>
Use the new baseline fixture to validate test results.

Also, remove dead code (e.g. token not used).

Signed-off-by: U. Artie Eoff <[email protected]>
Use the new baseline fixture to validate test results.

Signed-off-by: U. Artie Eoff <[email protected]>
Use the new baseline fixture to validate test results.

Signed-off-by: U. Artie Eoff <[email protected]>
Use the new baseline fixture to validate test results.

Signed-off-by: U. Artie Eoff <[email protected]>
Use the new baseline fixture to validate test results.

Signed-off-by: U. Artie Eoff <[email protected]>
Use the new baseline fixture to validate test results.

Signed-off-by: U. Artie Eoff <[email protected]>
Use the new baseline fixture to validate test results.

Signed-off-by: U. Artie Eoff <[email protected]>
@libinta libinta added the run-test Run CI for PRs from external contributors label Feb 6, 2025
@uartie
Copy link
Contributor Author

uartie commented Feb 6, 2025

@uartie can you provide the test log with this change?

@libinta The test results were the same with and without this PR.

+ python -m pytest tests/test_fsdp_examples.py -v -s --token=***
============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-7.4.4, pluggy-1.5.0 -- /usr/bin/python
collected 2 items
=================== 1 failed, 1 passed in 586.73s (0:09:46) ====================
+ python -m pytest tests/test_text_generation_example.py tests/test_encoder_decoder.py -v -s --token=***
============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-7.4.4, pluggy-1.5.0 -- /usr/bin/python
collected 66 items
================== 8 failed, 58 passed in 26648.24s (7:24:08) ==================
+ python -m pytest tests/test_image_to_text_example.py -v -s --token=***
============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-7.4.4, pluggy-1.5.0 -- /usr/bin/python
collected 12 items
================== 2 failed, 10 passed in 2001.76s (0:33:21) ===================
+ python -m pytest tests/test_openclip_vqa.py -v -s
============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-7.4.4, pluggy-1.5.0 -- /usr/bin/python
collected 2 items
======================== 2 passed in 200.33s (0:03:20) =========================
+ python -m pytest tests/test_sentence_transformers.py
============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-7.4.4, pluggy-1.5.0
collected 13 items
================== 13 passed, 1 warning in 180.71s (0:03:00) ===================
+ python -m pytest tests/test_pipeline.py -v -s --token=***
============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-7.4.4, pluggy-1.5.0 -- /usr/bin/python
collected 5 items
=================== 5 passed, 1 warning in 164.29s (0:02:44) ===================

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Collaborator

@regisss regisss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome PR 🚀
LGTM.

I left one comment but that doesn't prevent from merging the PR now.

tests/test_pipeline.py Show resolved Hide resolved
@regisss regisss merged commit 9e882f2 into huggingface:main Feb 7, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-test Run CI for PRs from external contributors transformers_future
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants