Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split tests to core/providers/task-sdk/integration/system #43979

Merged
merged 1 commit into from
Nov 15, 2024

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Nov 13, 2024

The tests execution was traditionally using single "breeze testing tests" command and you could select which test to run via TEST_TYPE.

However the recent move of providers and adding task_sdk necessitates splitting the tests commands into separate commands for core, providers, task_sdk, helm, integration and system.

This is done via introducing "TEST_GROUP" - which determines which group of tests is being executed, and dedicated testing command for each of the groups - with "db" and "non-db" variants where applicable.

Cleanup and small refactoring has been done to make it easier to reason about parameters passed down from the command line to docker and in-container pytest command.

Related: #42632


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Nov 14, 2024

OK. I moved it here from #43965 .

That one should be ready for review. I got it green, all the tests are nicely split between core and providers. I will be adding a little more diagnostics (Right now it is not obvious if the successful tests are doing what they are supposed to do) and running it for all versions of Python etc. to check that everythong is as expected + I will add runnig system tests - but other than that, I think it is in the shape that is pretty much ready to merge (and for sure ready to review).

Leter we can run a number optimizations, but the way it is and after I check it for all typos and run in multiple versions, we should be ready to go. I am quite away next week (in Ireland) - and will be less available again in the next week, but I am confident it's pretty stable (but will be able to fix things when on/off and with @gopidesupavan @romsharon98 @shahar1 @amoghrajesh @tirkarthi @ashb @kaxil @pierrejeambrun @vincbeck @o-nikolas and others who contributed recently, I think this way of running tests should be easier to reason about and fix.

Free screenshots to show the current state:

The "core" and "providers" tests now run in completely separate now - following the 'Stage 1` of #42632 (comment):

image

List of tests commands:

image

One thing I am not sure here is naming. Should it be:

  • integration-providers-tests

or

  • providers-integration-tests

Both have it's rationale, it's mostly about whether we want to group them by "type" or "what sub-system they are working on". Would love to have other's opinion on that.

Any and all comments are welcome! Yes. It's huge, but a lot of it is auto-generated and you can just mark it as "read" and it will go away.

@potiuk potiuk mentioned this pull request Nov 14, 2024
@ashb
Copy link
Member

ashb commented Nov 14, 2024

Just so I understand the plan, this splits out the running of tests, but doesn't change to code location which will be a future PR?

@ashb
Copy link
Member

ashb commented Nov 14, 2024

Interface question should the db/non-db be a flag that operates like a filter, so something like breeze testing core-tests --non-db etc?

Of particular note, I imagine that soon (as part of TaskSDK work) all the db unit tests for the providers will go away/become non-db tests. I don't know if that changes your thinking at all

@ashb
Copy link
Member

ashb commented Nov 14, 2024

+6,413 / −3,614 😱

Oh, this is mostly in the SVG images isn't it

Dockerfile.ci Show resolved Hide resolved
@vincbeck
Copy link
Contributor

Interface question should the db/non-db be a flag that operates like a filter, so something like breeze testing core-tests --non-db etc?

Of particular note, I imagine that soon (as part of TaskSDK work) all the db unit tests for the providers will go away/become non-db tests. I don't know if that changes your thinking at all

I agree, a flag to filter db/non-db tests would make more sense to me instead of having separate commands

Copy link
Member

@gopidesupavan gopidesupavan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1

This is a fantastic simplification and a significant step forward for the upcoming planned stages. Thanks, Jarek, for the great optimization!

Now that each test variation has its own groups, I believe it will be much easier to make changes or extend functionality for any specific variation as needed in the future.

@potiuk potiuk force-pushed the split-tests-to-core-providers-task-sdk branch 3 times, most recently from 901b2e9 to 8dd4384 Compare November 15, 2024 00:58
@potiuk potiuk added full tests needed We need to run full set of tests for this PR to merge all versions If set, the CI build will be forced to use all versions of Python/K8S/DBs labels Nov 15, 2024
@potiuk potiuk force-pushed the split-tests-to-core-providers-task-sdk branch 2 times, most recently from 9c5778d to a4b392e Compare November 15, 2024 01:01
@potiuk
Copy link
Member Author

potiuk commented Nov 15, 2024

The whole set of test commands is simplified now:

Screenshot 2024-11-15 at 02 21 55

I also updated docs and examples and contributing docs.

I also reviewed and updated docs and fixed and simplified how sytem tests are run.

There is no more --system SYSTEM or pytest.mark.system("SYSTEM") but simply --system and pytest.mark.system. Also the example dags in "tests/system" and "providers/tests/system" are automatically marked with the "system" marker.

So tests shoudl be run:

In venv/inside breeze:

pytest --system providers/tests/system/google/cloud/bigquery/example_bigquery_queries.py

via Breeze - there is one command only:

breeze testing system-tests  providers/tests/system/google/cloud/bigquery/example_bigquery_queries.py

cc: @ahidalgob @kosteev @pankajkoti @fdemiane @sc250072

nce we merge it you will have to update the dashboards

The tests execution was traditionally using single "breeze testing
tests" command and you could select which test to run via TEST_TYPE.

However the recent move of providers and adding task_sdk necessitates
splitting the tests commands into separate commands for core, providers,
task_sdk, helm, integration and system.

This is done via introducing "TEST_GROUP" - which determines which
group of tests is being executed, and dedicated testing command for each
of the groups - with "db" and "non-db" variants where applicable.

Cleanup and small refactoring has been done to make it easier to
reason about parameters passed down from the command line to
docker and in-container pytest command.

Related: #42632
@potiuk potiuk force-pushed the split-tests-to-core-providers-task-sdk branch from a4b392e to 1250f9e Compare November 15, 2024 02:08
@potiuk potiuk merged commit 36e267d into main Nov 15, 2024
123 checks passed
@potiuk
Copy link
Member Author

potiuk commented Nov 15, 2024

And merged!

hardeybisey pushed a commit to hardeybisey/airflow that referenced this pull request Nov 15, 2024
The tests execution was traditionally using single "breeze testing
tests" command and you could select which test to run via TEST_TYPE.

However the recent move of providers and adding task_sdk necessitates
splitting the tests commands into separate commands for core, providers,
task_sdk, helm, integration and system.

This is done via introducing "TEST_GROUP" - which determines which
group of tests is being executed, and dedicated testing command for each
of the groups - with "db" and "non-db" variants where applicable.

Cleanup and small refactoring has been done to make it easier to
reason about parameters passed down from the command line to
docker and in-container pytest command.

Related: apache#42632
@kosteev
Copy link
Contributor

kosteev commented Nov 15, 2024

Thanks for heads up, Jarek.
We will update System Tests Dashboard for google provider package with regard to these changes.

potiuk added a commit to potiuk/airflow that referenced this pull request Nov 16, 2024
When apache#43979 there was a typo where inputs passed as test-groups
were not passed to unit tests in "special-tests" case - because
the "needs.build-info.outputs" were used instead.

Unfortunately this is not caught by GitHub parsing the workflows,
it will only signal it by having annotations of errors on the
affected actions - missing needs.build-info entries are simply
replaced by empty string.
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 16, 2024
The apache#43979 refactoring of tests caused unnecessary database reset
attempts in tests that did not require it or had no database set.

This caused unnecessary `airflow db reset` in collection-only tests
with removed non-ARM packages, but also it caused the error printed
in non-DB tests:

```
Resetting the DB

[2024-11-16T03:50:37.812+0000] {cli_parser.py:67} ERROR - Failed to load CLI commands from executor: LocalExecutor
Traceback (most recent call last):
  File "/opt/airflow/airflow/cli/cli_parser.py", line 64, in <module>
    executor, _ = ExecutorLoader.import_executor_cls(executor_name)
  File "/opt/airflow/airflow/executors/executor_loader.py", line 285, in import_executor_cls
    return _import_and_validate(executor_name.module_path), executor_name.connector_source
  File "/opt/airflow/airflow/executors/executor_loader.py", line 282, in _import_and_validate
    cls.validate_database_executor_compatibility(executor)
  File "/opt/airflow/airflow/executors/executor_loader.py", line 327, in validate_database_executor_compatibility
    if engine.dialect.name == "sqlite":
AttributeError: 'NoneType' object has no attribute 'dialect'
[2024-11-16T03:50:37.813+0000] {cli_parser.py:68} ERROR - Ensure all dependencies are met and try again. If using a Celery based executor install a 3.3.0+ version of the Celery provider. If using a Kubernetes executor, install a 7.4.0+ version of the CNCF provider
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/opt/airflow/airflow/__main__.py", line 62, in main
    args.func(args)
  File "/opt/airflow/airflow/cli/cli_config.py", line 49, in command
    return func(*args, **kwargs)
  File "/opt/airflow/airflow/utils/providers_configuration_loader.py", line 55, in wrapped_function
    return func(*args, **kwargs)
  File "/opt/airflow/airflow/cli/commands/db_command.py", line 63, in resetdb
    print(f"DB: {settings.engine.url!r}")
AttributeError: 'NoneType' object has no attribute 'url'

Database has been reset
```

The fix is to add `--no-db-reset` in collection tests and force
db_reset = False in case `skip_db_tests` is set to True.
potiuk added a commit that referenced this pull request Nov 16, 2024
When #43979 there was a typo where inputs passed as test-groups
were not passed to unit tests in "special-tests" case - because
the "needs.build-info.outputs" were used instead.

Unfortunately this is not caught by GitHub parsing the workflows,
it will only signal it by having annotations of errors on the
affected actions - missing needs.build-info entries are simply
replaced by empty string.
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 16, 2024
The apache#43979 refactoring of tests caused unnecessary database reset
attempts in tests that did not require it or had no database set.

This caused unnecessary `airflow db reset` in collection-only tests
with removed non-ARM packages, but also it caused the error printed
in non-DB tests:

```
Resetting the DB

[2024-11-16T03:50:37.812+0000] {cli_parser.py:67} ERROR - Failed to load CLI commands from executor: LocalExecutor
Traceback (most recent call last):
  File "/opt/airflow/airflow/cli/cli_parser.py", line 64, in <module>
    executor, _ = ExecutorLoader.import_executor_cls(executor_name)
  File "/opt/airflow/airflow/executors/executor_loader.py", line 285, in import_executor_cls
    return _import_and_validate(executor_name.module_path), executor_name.connector_source
  File "/opt/airflow/airflow/executors/executor_loader.py", line 282, in _import_and_validate
    cls.validate_database_executor_compatibility(executor)
  File "/opt/airflow/airflow/executors/executor_loader.py", line 327, in validate_database_executor_compatibility
    if engine.dialect.name == "sqlite":
AttributeError: 'NoneType' object has no attribute 'dialect'
[2024-11-16T03:50:37.813+0000] {cli_parser.py:68} ERROR - Ensure all dependencies are met and try again. If using a Celery based executor install a 3.3.0+ version of the Celery provider. If using a Kubernetes executor, install a 7.4.0+ version of the CNCF provider
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/opt/airflow/airflow/__main__.py", line 62, in main
    args.func(args)
  File "/opt/airflow/airflow/cli/cli_config.py", line 49, in command
    return func(*args, **kwargs)
  File "/opt/airflow/airflow/utils/providers_configuration_loader.py", line 55, in wrapped_function
    return func(*args, **kwargs)
  File "/opt/airflow/airflow/cli/commands/db_command.py", line 63, in resetdb
    print(f"DB: {settings.engine.url!r}")
AttributeError: 'NoneType' object has no attribute 'url'

Database has been reset
```

The fix is to add `--no-db-reset` in collection tests and force
db_reset = False in case `skip_db_tests` is set to True.
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 16, 2024
The apache#43979 introduced unit_tests.sh script that did not properly
propagate error codes (no -e/+e setting).

This PR fixes it - also fixes an edge case where the -e setting
is not restored in run breeze commmand script.
potiuk added a commit to potiuk/airflow that referenced this pull request Nov 16, 2024
The apache#43979 introduced unit_tests.sh script that did not properly
propagate error codes (no -e/+e setting).

This PR fixes it - also fixes an edge case where the -e setting
is not restored in run breeze commmand script.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
all versions If set, the CI build will be forced to use all versions of Python/K8S/DBs area:dev-tools full tests needed We need to run full set of tests for this PR to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants