Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TEST: using get_dataframes_and_queues instead of get_queues in onedal4py testing #1909

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

samir-nasibli
Copy link
Contributor

@samir-nasibli samir-nasibli commented Jul 2, 2024

Description

using get_dataframes_and_queues instead of get_queues in onedal4py testing.
Enabled only for covariance, incremental covarience, incremental pca (just for verifying the primitive work)

Justification

onedal4py API and SPMD APIs have ability to work with numpy and sycl_queue provided. Currently we are testing them with numpy with None queue or with dpctl/dpnp with None queue param. This PR updates get_dataframes_and_queues testing primitive, enhancement includes cases numpy + sycl_queue.

TODO

Separately enable it for all cases where get_queues used + SPMD iface (#1777 )

Should be

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes, if necessary (updated in # - add PR number)
  • The unit tests pass successfully.
  • I have run it locally and tested the changes extensively.
  • I have resolved any merge conflicts that might occur with the base branch.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details)
  • I have added a respective label(s) to PR if I have a permission for that.

@samir-nasibli
Copy link
Contributor Author

/intelci: run

@samir-nasibli
Copy link
Contributor Author

/intelci: run

@samir-nasibli samir-nasibli added the testing Tests for sklearnex/daal4py/onedal4py & patching sklearn label Jul 4, 2024
@samir-nasibli
Copy link
Contributor Author

/intelci: run

@samir-nasibli
Copy link
Contributor Author

/intelci: run

@samir-nasibli
Copy link
Contributor Author

/intelci: run

@samir-nasibli samir-nasibli marked this pull request as ready for review July 26, 2024 07:23
@samir-nasibli
Copy link
Contributor Author

/intelci: run

Copy link
Contributor

@icfaust icfaust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since np_sycl isn't a dataframe type unlike array_api, pandas, numpy, dpctl, or dpnp, I'd ask that you write comments in the code in the instances where np_sycl is used to give future developers the reasoning why its there/ flag it for special observation. I'd also add some comments into get_dataframes_and_queues. I had been thinking why we just don't extend numpy to include queues. Did you run that and see how much it increases CI runtimes?

@samir-nasibli
Copy link
Contributor Author

Since np_sycl isn't a dataframe type unlike array_api, pandas, numpy, dpctl, or dpnp, I'd ask that you write comments in the code in the instances where np_sycl is used to give future developers the reasoning why its there/ flag it for special observation. I'd also add some comments into get_dataframes_and_queues. I had been thinking why we just don't extend numpy to include queues. Did you run that and see how much it increases CI runtimes?

Yeah, make sense to share statistics of the run. Also definitely docs needed to be updated. Thank you!

@icfaust
Copy link
Contributor

icfaust commented Oct 7, 2024

This will need to be rebased for 2025.0

@samir-nasibli
Copy link
Contributor Author

Some updates are required. Moving back to the draft.

@samir-nasibli samir-nasibli marked this pull request as draft October 7, 2024 09:18
@syakov-intel
Copy link

@samir-nasibli what needs to be done to get out of Draft state?

@samir-nasibli samir-nasibli marked this pull request as ready for review December 10, 2024 14:57
Comment on lines +100 to +104
if "np_sycl" in dataframe_filter_:
# sycl queue param is not None.
# Designed for interfaces that utilize NumPy inputs with a DPCTL queue,
# enabling offloading to specific SYCL devices.
dataframes_and_queues.extend(get_df_and_q("numpy_and_queue"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@icfaust

  1. what do you think about the naming and this approach specifically? Does it make sense to leave only numpy_and_queue instead np_sycl?

  2. Maybe it is better to use param: numpy_with_queue=True

I mean:

get_dataframes_and_queues("numpy", numpy_and_queue=True))

instead of:

get_dataframes_and_queues("numpy,np_sycl"))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the second option could be generalized for all dataframes that doesn't support sycl_queue, but when sycl_queue is provided

@samir-nasibli
Copy link
Contributor Author

/intelci: run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Tests for sklearnex/daal4py/onedal4py & patching sklearn
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants