
TEST: test coverage for sklearnex SPMD ifaces #1777

Merged
merged 71 commits into uxlfoundation:main on Aug 1, 2024

Conversation

@ethanglaser (Contributor) commented on Mar 26, 2024

Description

Scope:

  • Introduces initial validation of SPMD algorithms via mpi-pytest
  • Identifies and creates follow-up tasks for issues found in initial testing

For each sklearnex SPMD algorithm, individual manually created tests and parametrized synthetic tests validate model attributes and prediction results against the batch implementations (a rough sketch of this pattern is included after the list of known issues below).

Known issues to address in follow-up:

  • KMeans centers not aligned after init
  • Number of KMeans iterations does not align
  • PCA falls back if n_components is 'mle' or an integer
  • PCA fails if n_rows_rank < n_cols < n_rows
  • Forest results are not aligned
  • LinReg fails if n_rows_rank < n_cols < n_rows
  • LogReg model coefficients do not align
  • LogReg fails if n_rows_rank < n_cols < n_rows
  • kNN needs all classes represented on each process initially
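
For illustration, a rough sketch of this SPMD-vs-batch validation pattern (a sketch only: EmpiricalCovariance_SPMD / EmpiricalCovariance_Batch stand for an SPMD estimator and its batch counterpart, and _generate_synthetic_data / _get_local_chunk are hypothetical helpers, not the utilities added in this PR):

import pytest
from numpy.testing import assert_allclose

@pytest.mark.mpi
@pytest.mark.parametrize("n_samples", [200, 1000])
def test_covariance_spmd_matches_batch_sketch(n_samples):
    # hypothetical helpers: build the full synthetic dataset on every rank,
    # then keep only this rank's row chunk for the SPMD estimator
    data = _generate_synthetic_data(n_samples, n_features=8)
    local_data = _get_local_chunk(data)
    spmd_result = EmpiricalCovariance_SPMD().fit(local_data)
    batch_result = EmpiricalCovariance_Batch().fit(data)
    # attributes computed collectively should match the batch fit
    assert_allclose(spmd_result.covariance_, batch_result.covariance_)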

@ethanglaser (Contributor Author):
/intelci: run

@ethanglaser (Contributor Author):
/intelci: run

@samir-nasibli changed the title from "CI: Add mpi pytest validation" to "TST: Add mpi pytest validation" on Apr 3, 2024
@ethanglaser changed the title from "TST: Add mpi pytest validation" to "TEST: Add mpi pytest validation" on Apr 17, 2024
@ethanglaser (Contributor Author):
Update job with infra branch: http://intel-ci.intel.com/ef02681e-c97c-f1ba-a9d0-a4bf010d0e2e

@samir-nasibli (Contributor) left a comment:
Overall looks good to me. Thank you!

Two questions:

  1. I don't see skips in the test suites for the known issues; is it possible to mark the defects by disabling the related test cases?
  2. Does it work when the number of processes used to run the test is < 2?
     Does it make sense to add @pytest.mark.mpi(min_size=2)?

# fit the SPMD estimator on this rank's local data chunk and compare
# against a batch fit on the full dataset
spmd_result = EmpiricalCovariance_SPMD().fit(local_dpt_data)
batch_result = EmpiricalCovariance_Batch().fit(data)

assert_allclose(spmd_result.covariance_, batch_result.covariance_)
Inline review comment (Contributor):
I see that you are mixing NumPy's assert_allclose with your own _spmd_assert_allclose. Why not use the latter everywhere?

@ethanglaser (Contributor Author):
Certain attributes and results require it and others do not. For example, with linear regression, SPMD inference runs on each process's chunk of the data (whereas batch runs on the entire dataset), so the SPMD assert is used for predictions, while the coefficients are the same on every process and use the regular assert_allclose.
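
A minimal sketch of what such a per-rank assert could look like (an assumption about the approach, not the actual _spmd_assert_allclose in this PR), assuming the batch result corresponds to an even row split across ranks:

import numpy as np
from mpi4py import MPI
from numpy.testing import assert_allclose

def _spmd_assert_allclose_sketch(spmd_local_result, batch_full_result, **kwargs):
    # each rank compares its local chunk of the SPMD output against the
    # matching row slice of the batch output
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    expected = np.array_split(np.asarray(batch_full_result), size)[rank]
    assert_allclose(np.asarray(spmd_local_result), expected, **kwargs)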

@samir-nasibli (Contributor) left a comment:
Also, it is not clear how these tests are meant to be run.
https://pytest-mpi.readthedocs.io/en/latest/usage.html#

I think it makes sense to add a README / update the docs describing how to run these tests.

@olegkkruglov (Contributor):
I also see that different dtypes are not tested here. Was this parametrization skipped on purpose?

@ethanglaser (Contributor Author):
> I also see that different dtypes are not tested here. Was this parametrization skipped on purpose?

Not on purpose - I will add it in. Good point.
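
For reference, a minimal sketch of what the dtype parametrization could look like (the helper, the tolerance values, and _run_spmd_vs_batch_check are illustrative assumptions, not this PR's code):

import numpy as np
import pytest

@pytest.mark.parametrize("dtype", [np.float64, np.float32])
def test_spmd_dtype_sketch(dtype):
    # hypothetical helper; cast the synthetic data to the requested dtype
    data = _generate_synthetic_data(1000, n_features=8).astype(dtype)
    # float32 typically needs looser comparison tolerances than float64
    rtol = 1e-4 if dtype == np.float32 else 1e-7
    _run_spmd_vs_batch_check(data, rtol=rtol)  # hypothetical SPMD-vs-batch comparison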

@ethanglaser (Contributor Author):
> Also, it is not clear how these tests are meant to be run. https://pytest-mpi.readthedocs.io/en/latest/usage.html#
>
> I think it makes sense to add a README / update the docs describing how to run these tests.

Currently this is only added to internal CI, since that is where we have GPU validation - see infra PR 712. Not sure if it makes sense to document it in the public repo.

@ethanglaser (Contributor Author):
> 1. I don't see skips in the test suites for the known issues; is it possible to mark the defects by disabling the related test cases?

There are pytest.skip() calls within the test functions - mostly because the skips depend on the parameters passed to the function (except for forest).

> Does it work when the number of processes used to run the test is < 2?
> Does it make sense to add @pytest.mark.mpi(min_size=2)?

Infra sets up the tests to run with 4 processes, but with 1 process it would just run batch GPU, so I don't see why it would not work. All tests are set up to skip if MPI or GPU is not present.
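
For illustration, a minimal sketch of the marker and the skip behavior discussed here (the _mpi_and_gpu_available flag, the test body, and the parameter values are placeholders, not this PR's actual code):

import pytest

@pytest.mark.mpi(min_size=2)  # pytest-mpi marker suggested above: require at least 2 ranks
@pytest.mark.skipif(
    not _mpi_and_gpu_available,  # hypothetical guard flag: skip when MPI or GPU is missing
    reason="requires an MPI launch and a GPU queue",
)
@pytest.mark.parametrize("n_components", [None, 2, "mle"])
def test_pca_spmd_sketch(n_components):
    if n_components == "mle":
        # parameter-dependent skip for a known issue from the PR description
        pytest.skip("PCA falls back if n_components is 'mle'")
    ...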

@samir-nasibli (Contributor):
> Currently this is only added to internal CI, since that is where we have GPU validation - see infra PR 712. Not sure if it makes sense to document it in the public repo.

We are adding a testing module to the public repository. It seems to me it makes sense to add a doc with info on how to launch the tests, including links to mpi-pytest.
This could be done in this PR or in a separate PR. We have a checkbox in our PR template for adding/updating docs.

@samir-nasibli (Contributor):
> 1. I don't see skips in the test suites for the known issues; is it possible to mark the defects by disabling the related test cases?

> There are pytest.skip() calls within the test functions - mostly because the skips depend on the parameters passed to the function (except for forest).

Yes, I saw that before. My question came up because it is not everywhere. For example, for KMeans I did not find a skip for the issue specified in the description.

@samir-nasibli (Contributor) left a comment:
Looks good to me. Just minor comments, which could probably be addressed in separate PRs.
Assuming green CI.

@ethanglaser (Contributor Author):
> Yes, I saw that before. My question came up because it is not everywhere. For example, for KMeans I did not find a skip for the issue specified in the description.

True - in some cases it's not handled by pytest.skip(); instead, parts of the test are commented out. For KMeans, for instance, the iteration check and the init check are commented out, because I still want to keep some validation. If there is a good way to handle this with pytest.skip instead, I am open to other ideas.
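
One possible alternative, sketched here only as an idea (the helpers and the _kmeans_iter_counts_known_to_differ flag are hypothetical, and this is not what the PR does): run the checks that already pass, then mark only the known-issue check as an expected failure instead of commenting it out.

import pytest

def test_kmeans_spmd_sketch():
    spmd_model, batch_model = _fit_spmd_and_batch_kmeans()   # hypothetical helper
    _assert_unaffected_attributes(spmd_model, batch_model)   # checks that already pass
    if _kmeans_iter_counts_known_to_differ:                  # hypothetical flag for the tracked defect
        pytest.xfail("KMeans iteration counts do not align (known issue)")
    assert spmd_model.n_iter_ == batch_model.n_iter_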

> We are adding a testing module to the public repository. It seems to me it makes sense to add a doc with info on how to launch the tests, including links to mpi-pytest. This could be done in this PR or in a separate PR. We have a checkbox in our PR template for adding/updating docs.

I will update the SPMD docs task to include this - we currently do not have any documentation of our SPMD interfaces at all, so I am not sure it can be added yet.

Thanks for the reviews. Still working out some CI fixes with the introduction of float32, but I will share the job once it's cleaner.

@samir-nasibli (Contributor):
I see some test failures on the latest CI job provided.
I will not dismiss my approval, but please attach a green CI link before the merge. Feel free to request a review again if needed.

@ethanglaser merged commit 1e645b8 into uxlfoundation:main on Aug 1, 2024
17 checks passed
ethanglaser added a commit that referenced this pull request Aug 1, 2024
* Add simple spmd pytest for basic stats

* add reason to skipif

* blacked

* isorted

* import revisions

* adding common spmd test functionality

* add simple covariance mpi pytest

* add linreg and auto data tests

* first draft complete

* follow-ups, TODOs, formatting

* fix knn manual tests

* deselect pca issues

* address minor CI fails

* improvements to _spmd_support.py

* get_local_tensor cleanup

* black

* add random state to statistic data gen

* manual to gold

* add logreg skip

* isort

* oops

* address some comments

* relative imports for _utils_spmd

* underscore prefix for test functions

* add underscore prefix to variable

* black

* revert relative import, use sklearnex

* revert unordered logic back to original

* black

* add dataframes support to testing

* black formatting

* black + minor fix

* oops

* skips for latest fails

* address comments, move _as_numpy usage to utils

* trying logreg gpu vs spmd instead of cpu

* minor follow up to logreg

* basicstats API upd changes, logreg thresholds

* oops

* oops neighbors

* loosen logreg threshold

* cleanup

* minor restorations and formatting

* add dtype parameter

* dtype uniform revision

* float32 threshold updates pt 1

* float32 threshold updates pt 2

* float32 threshold updates pt 3

* black

* skip neighbors check for float32

* final neighbors threshold update

* Update sklearnex/tests/_utils_spmd.py

Co-authored-by: olegkkruglov <[email protected]>

---------

Co-authored-by: olegkkruglov <[email protected]>
Labels: testing (Tests for sklearnex/daal4py/onedal4py & patching sklearn)
5 participants