Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Github Actions workflow to trigger pipeline performance test #4231

Merged
merged 12 commits into from
Oct 25, 2024

Conversation

lrcouto
Copy link
Contributor

@lrcouto lrcouto commented Oct 15, 2024

Description

Adds a workflow to trigger a pipeline performance test on the performance test repo if the user adds the "performance" label to the pull request they wish to test.

The test will run a comparison in elapsed time between the branch with the performance label against the latest kedro release.

Development notes

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

@lrcouto lrcouto requested review from noklam and ankatiyar October 15, 2024 16:05
@lrcouto lrcouto marked this pull request as ready for review October 15, 2024 16:05
@lrcouto lrcouto requested a review from merelcht as a code owner October 15, 2024 16:05
@ankatiyar
Copy link
Contributor

My initial thought is that this should be triggered not on every push or based on the commit message because someone might want to trigger this after making all the changes so making a random commit just to trigger a pipeline test seems counterintuitive.
Alternatives - Adding a "performance" label to the PR: https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#pull_request
WDYT?

Signed-off-by: Laura Couto <[email protected]>
@lrcouto
Copy link
Contributor Author

lrcouto commented Oct 15, 2024

My initial thought is that this should be triggered not on every push or based on the commit message because someone might want to trigger this after making all the changes so making a random commit just to trigger a pipeline test seems counterintuitive. Alternatives - Adding a "performance" label to the PR: https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/events-that-trigger-workflows#pull_request WDYT?

I like the idea!

Signed-off-by: Laura Couto <[email protected]>
@noklam
Copy link
Contributor

noklam commented Oct 15, 2024

Agree on above, I like the idea about label better. How can we trigger it multiple times with label? Perhaps is it possible to trigger the workflow manually with parameter (commit hash/branch name?)

run: |
echo "Commit message contains 'PERFORMANCE TEST'. Triggering performance test."

- name: Trigger performance test workflow in test project
Copy link
Contributor

@ankatiyar ankatiyar Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I think you can use the checkout action to clone the performance test repo - https://github.com/actions/checkout
and then call kedro run

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been addressed?

@noklam noklam linked an issue Oct 16, 2024 that may be closed by this pull request
@lrcouto
Copy link
Contributor Author

lrcouto commented Oct 16, 2024

Agree on above, I like the idea about label better. How can we trigger it multiple times with label? Perhaps is it possible to trigger the workflow manually with parameter (commit hash/branch name?)

Maybe we can have an alternative way to trigger from the test repo itself with a specific branch.

Signed-off-by: Laura Couto <[email protected]>
@lrcouto
Copy link
Contributor Author

lrcouto commented Oct 16, 2024

I've changed the trigger method to use a label, and created a "performance" label to test it out. It will trigger when the label is added, and you can repeat the test by removing the label and adding it again or running the job again.

Now for a little bit of a philosophical question - My intention was to use commits not only to trigger the test itself but also to set those hook/save/load delay parameters that the test has. If we use a label instead of the commit to trigger the run, how could we set the delays? Maybe we could have both methods available, so you can use the label for a simple test but a commit if you want to manually alter the delays?

This is the test branch I'm using, by the way: #4233

@merelcht
Copy link
Member

I've changed the trigger method to use a label, and created a "performance" label to test it out. It will trigger when the label is added, and you can repeat the test by removing the label and adding it again or running the job again.

I really like the label trigger!

Now for a little bit of a philosophical question - My intention was to use commits not only to trigger the test itself but also to set those hook/save/load delay parameters that the test has. If we use a label instead of the commit to trigger the run, how could we set the delays? Maybe we could have both methods available, so you can use the label for a simple test but a commit if you want to manually alter the delays?

This is the test branch I'm using, by the way: #4233

@lrcouto How do you set those parameters in the commit? And what are the default now?

@lrcouto
Copy link
Contributor Author

lrcouto commented Oct 17, 2024

@lrcouto How do you set those parameters in the commit? And what are the default now?

The default right now was no delays. My initial idea was passing the parameter line on the commit message and sending it on the POST payload, same as it's being done with the branch name right now. But I'm looking for alternatives that would work with the label solution.

@lrcouto
Copy link
Contributor Author

lrcouto commented Oct 17, 2024

@lrcouto How do you set those parameters in the commit? And what are the default now?

The default right now was no delays. My initial idea was passing the parameter line on the commit message and sending it on the POST payload, same as it's being done with the branch name right now. But I'm looking for alternatives that would work with the label solution.

I thought of two possible alternatives for this issue that we could use. What I would like to do is allow the user to pass all of the parameters to kedro run, so not only you can use the delay values specific to the test project but you could also, for example, set it to run with a specific runner.

The first option would be using PR comments to pass parameters and run the workflow. It's easy to use but the disadvantage is that we would pollute the PRs with parameter comments, which makes me not like this option very much.

The second would be to have a manually dispatched workflow on the test repo, in which you could go and input the parameters. Then you could use the performance label one for a quick run with all default values, or you could do something with more customization by going to the test repo itself and running it manually. I'm leaning more towards this one.

Let me know what you think!

@merelcht
Copy link
Member

Maybe for a first iteration just triggering the test as is, is fine and we can create a follow up ticket to explore parameters? #4210 is already meant to check the different runners, so my question is whether it makes sense to also have that as a changing factor for this setup?

@noklam
Copy link
Contributor

noklam commented Oct 18, 2024

Agree with @merelcht, I think having parameters in commit are also a bit overkill and quite easy to break. I would prefer a sensible default (that are non-zero, otherwise it doesn't shows anything). If there are more needs to extend the parameters, we can always clone the repository and run with source code directly.

@lrcouto
Copy link
Contributor Author

lrcouto commented Oct 18, 2024

We can leave this as it is for now, and extend it in the future then.

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

@lrcouto lrcouto requested review from merelcht and ankatiyar October 18, 2024 14:38
@ankatiyar
Copy link
Contributor

I do think you could checkout the other repo in this workflow and time and run it so one could see the results in the PR itself instead of triggering the workflow on the repo that contains the project where we would have to look through the Actions tab to see what the results are.

Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me besides some comments left by @ankatiyar that haven't been addressed.

run: |
echo "Commit message contains 'PERFORMANCE TEST'. Triggering performance test."

- name: Trigger performance test workflow in test project
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been addressed?

@lrcouto
Copy link
Contributor Author

lrcouto commented Oct 24, 2024

I do think you could checkout the other repo in this workflow and time and run it so one could see the results in the PR itself instead of triggering the workflow on the repo that contains the project where we would have to look through the Actions tab to see what the results are.

Yeah I think that would be more convenient. I'm rewriting the workflow to work in this way.

Copy link
Contributor

@ankatiyar ankatiyar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @lrcouto 💯


- name: Set up Python 3.11
if: github.event.action == 'labeled' && contains(github.event.label.name, 'performance')
uses: actions/setup-python@v4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be v5 I think there's some deprecations in v4

@lrcouto lrcouto merged commit a5d9bb4 into main Oct 25, 2024
28 checks passed
@lrcouto lrcouto deleted the add-pipe-performance-workflow branch October 25, 2024 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants