[Stress Testing] - Runners #4127

Closed
noklam opened this issue Aug 29, 2024 · 0 comments · Fixed by #4210

noklam commented Aug 29, 2024

Description

The goal of this issue is to stress test the different runners.

In Kedro, there are 3 runners:

  • SequentialRunner
  • ThreadRunner (I/O bound, usually used for Spark pipelines)
  • ParallelRunner (CPU bound)

The test should measure:

Each test should target the runner it exercises; for example, ThreadRunner needs to be tested with an I/O-heavy pipeline.
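As a rough illustration of why the workload has to match the runner, here is a minimal sketch using only the Python standard library. It does not use Kedro's runners: `io_bound_task`, `run_sequential`, and `run_threaded` are hypothetical stand-ins, with a thread pool playing the role that ThreadRunner would play for an I/O-heavy pipeline.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def io_bound_task(delay: float = 0.05) -> float:
    """Hypothetical stand-in for a node that mostly waits on I/O."""
    time.sleep(delay)
    return delay


def run_sequential(n_tasks: int, delay: float = 0.05) -> float:
    """Run the tasks one after another and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n_tasks):
        io_bound_task(delay)
    return time.perf_counter() - start


def run_threaded(n_tasks: int, delay: float = 0.05, max_workers: int = 8) -> float:
    """Run the tasks in a thread pool and return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Consume the iterator so every task finishes before timing stops.
        list(pool.map(io_bound_task, [delay] * n_tasks))
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 8
    print(f"sequential: {run_sequential(n):.3f}s")
    print(f"threaded:   {run_threaded(n):.3f}s")
```

A real stress test would swap these stand-ins for actual pipelines run through each runner; the point here is only that an I/O-bound workload is what makes the threaded variant measurably different.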

Context

#3957 (comment)

Component stress test:

  • The main goal is to benchmark the performance of individual components, which will tell us whether refactoring work has a positive or negative impact. Currently we only check that tests pass, so we have no idea whether a change slows things down. We have done this in the past, but usually on an ad-hoc basis; we should run it regularly (or at least once per release).

The direction is simple: we want to measure how execution time changes against the number of entries. We would start with Datasets and Catalog, as this fits in with the DataCatalog2.0 work and will be immediately useful.
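One possible shape for such a measurement, sketched with the standard library only. `time_against_size` and `build_fake_catalog` are hypothetical names, and the dict built here is a placeholder, not Kedro's DataCatalog; a real benchmark would pass in an operation that builds or queries an actual catalog.

```python
import time
from typing import Callable, Iterable, List, Tuple


def time_against_size(
    operation: Callable[[int], object],
    sizes: Iterable[int],
    repeats: int = 3,
) -> List[Tuple[int, float]]:
    """Return (n_entries, best elapsed seconds) for each size."""
    results = []
    for n in sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            operation(n)
            # Keep the fastest repeat to reduce noise from the machine.
            best = min(best, time.perf_counter() - start)
        results.append((n, best))
    return results


def build_fake_catalog(n_entries: int) -> dict:
    """Hypothetical stand-in: a plain dict with n_entries entries."""
    return {f"dataset_{i}": {"type": "MemoryDataset"} for i in range(n_entries)}


if __name__ == "__main__":
    for n, seconds in time_against_size(build_fake_catalog, [10, 100, 1_000, 10_000]):
        print(f"{n:>6} entries: {seconds:.6f}s")
```

Recording these pairs per release would make it possible to compare how the curve shifts after a refactor.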

This can address:

This should work with the setup decided in #4128.
