62221 Parallelise the performance tests #8190
…ance tests as they do currently.
@swissspidy @desrosj @sirreal @joemcgill What are your thoughts on this approach? The PR still needs some tweaks but the bulk of it is ready. Latest results here: https://github.com/WordPress/wordpress-develop/actions/runs/13012122443
IIRC @dmsnell looked into that before and there can actually be quite some fluctuation, even depending on the time of day.
Thanks for the note, @swissspidy. I did in fact observe consistent variation when attempting to answer the question, “Does this release exhibit the same response times for the home page as the old release does?” That variation occurred on the order of hours, and while I’m sure there was an explanation, it was on an isolated test runner off GitHub and I didn’t dive into figuring out what caused it. The impact of this variation was massive, however, significantly higher than that of any code changes, so my recommendation for performance testing is to run tests over the course of at least six to ten hours and interleave runs of the control and test groups. That way we avoid this long-running bias in case our tests happen to run near the point where the variations appear or disappear. For CI jobs where the results are more like suggestions or questions about performance than official or reliable reporting, I don’t see why this should be a blocker. The tests can be re-run. In fact, at some point I think I also considered running the tests concurrently since they only maxed out a single CPU core. Running on two or more cores would end up being a more realistic test environment, even though it introduces more uncertainty.
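The interleaving recommended above can be illustrated with a minimal Python sketch. It only builds a run order that alternates between groups, so that slow drift in the environment affects both groups roughly equally. The function name `interleaved_schedule` and the group labels are hypothetical and do not come from the WordPress test tooling.

```python
import random


def interleaved_schedule(groups, rounds, shuffle_within_round=False, seed=None):
    """Return a run order that visits every group exactly once per round.

    Hypothetical helper for illustration only; not part of any real tooling.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(rounds):
        order = list(groups)
        if shuffle_within_round:
            # Optionally randomise the order inside each round so that one
            # group does not always run first.
            rng.shuffle(order)
        schedule.extend(order)
    return schedule


# Three rounds alternating between the control and test groups.
schedule = interleaved_schedule(["control", "test"], rounds=3)
print(schedule)
```

Spreading the rounds across several hours, as suggested in the comment, would be up to whatever drives the schedule; the sketch only fixes the ordering.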
Thanks for the info Dennis! In that case I'll carry on with this PR (as it still needs work) because it seems that running the current/before/base tests in parallel won't worsen the overall concern about the accuracy of the comparisons relative to what we have now. There's a chance that they'll actually be more accurate because all three tests will start from a fresh installation, but looking back at the workflow runs on this PR it doesn't seem to make a measurable difference. I'll continue looking at the numbers.
# Conflicts: # .github/workflows/end-to-end-tests.yml # .github/workflows/phpunit-tests.yml # .github/workflows/test-build-processes.yml
Originally, we only had two runs: "current" and "base", where the baseline is only run on commits to trunk (e.g., However, your observations about the way these are already potentially affecting each other by reusing the same DB are valid, i.e.:
In most cases, these tests are just being used as a spot check for folks reviewing the code if they suspect there might be a performance concern with a PR before it's committed, and for that use case, speeding up these runs seems much more valuable than isolating environmental side-effects. So running the "before" tests in parallel at minimum seems like a good idea. For the base tests, I suspect that any commits that consistently lead to a major regression would still be visible in the dashboard if these are run in parallel, so it's probably worth the experiment to parallelize those as well. I'm ok with us giving it a try and observing whether it negatively affects our ability to identify performance regressions during a release.
Committed in https://core.trac.wordpress.org/changeset/59749
This change introduces a job matrix for the "current", "before", and "base" performance tests to replace the current behaviour of running them sequentially in a single job.
This speeds up the overall performance workflow run by 18-20 minutes 🕐 .
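As a rough illustration of why a job matrix shortens the run, the sketch below compares wall-clock time for sequential and parallel execution. It assumes the jobs are independent and ignores per-job setup overhead and runner queuing. The durations are placeholders chosen for the example, not measurements from this workflow.

```python
def sequential_minutes(durations):
    """Wall-clock time when jobs run one after another: the sum of durations."""
    return sum(durations.values())


def parallel_minutes(durations):
    """Wall-clock time when jobs run concurrently: the longest single job."""
    return max(durations.values())


# Placeholder durations in minutes (assumed values, not real measurements).
placeholder = {"current": 10, "before": 10, "base": 10}

saved = sequential_minutes(placeholder) - parallel_minutes(placeholder)
print(f"Estimated minutes saved: {saved}")
```

In practice each matrix job repeats the environment setup, so the real saving is smaller than this idealised difference.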
Reasoning
Notes
Future enhancements
Todo
Trac ticket: https://core.trac.wordpress.org/ticket/62221