
benchmark| Potential solution for performance regressions #3473

Conversation

@doublethefish (Contributor) commented Apr 3, 2020

Steps

  • (na) Add yourself to CONTRIBUTORS if you are a new contributor.
  • (na) Add a ChangeLog entry describing what your PR does.
  • (na) If it's a new feature or an important bug fix, add a What's New entry in doc/whatsnew/<current release.rst>.
  • Write a good description on what the PR does.

Description

These commits generate comparable performance data for a run of benchmarked code.

This is a potential first step towards establishing performance regression tests. Later work would be to actually perform the comparison against the target branch somehow.

We use pytest-benchmark here, rather than asv, because pytest-benchmark was easier to integrate into the existing tests.

Use-case

A use-case for this functionality is:

  1. Write a good benchmark test (a minimal sketch is shown after this list):
  • See tests/benchmark/test_baseline_benchmarks.py for an example.
  2. Run tox -e benchmark:
  • This generates benchmark .json files in the .tox directory.
  3. Apply a change that you suspect makes the code faster.
  4. Re-run tox -e benchmark:
  • This generates another benchmark .json file, probably 0002-something.json.
  5. Compare stats:
  • In .tox/, run py.test-benchmark compare 0001 0002. The numbers are used as a glob to find matching result files; use the numbers that refer to the tox -e benchmark runs you want to compare.
  6. Look for statistically significant differences. You will need to use your common sense here.
  7. Profit.
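
To make step 1 concrete, here is a minimal pytest-benchmark test in the same spirit as the benchmarks in this PR. It is an illustrative sketch only, not code from the PR: the test name, the run_pylint helper, the --disable=all argument, and the use of pytest's tmp_path fixture are all assumptions; the real benchmarks live in tests/benchmark/test_baseline_benchmarks.py.

```python
# Illustrative sketch only; the real benchmarks are in
# tests/benchmark/test_baseline_benchmarks.py.
import pytest

from pylint.lint import Run


def run_pylint(args):
    """Run pylint once, swallowing the SystemExit it raises by default."""
    try:
        Run(args)
    except SystemExit:
        pass


@pytest.mark.benchmark(group="baseline")
def test_baseline_many_empty_files(benchmark, tmp_path):
    """Time a full pylint run over a batch of empty modules."""
    files = [tmp_path / "empty_{}.py".format(i) for i in range(20)]
    for path in files:
        path.write_text("")

    # pytest-benchmark calls the target repeatedly and records the timings
    # that end up in the .json files read by `py.test-benchmark compare`.
    benchmark(run_pylint, ["--disable=all"] + [str(p) for p in files])
```

Running tox -e benchmark then picks such a test up along with the others and writes the numbered .json results file that the compare step in point 5 consumes.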

Baseline benchmark for pylint

The baseline benchmark in this PR attempts to benchmark basic runs (a rough sketch of the kind of checker used follows the list):

  • with no files
  • a single empty file
  • lots of empty files
  • a single empty checker
  • many empty checkers
  • a checker that mimics doing expensive work
  • each run both single-process and across multiple workers
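
For illustration, the "expensive work" case might use a checker along these lines. This is a rough sketch rather than the code in this PR (the class name, message id, and sleep duration are made up), and it assumes the pylint 2.x checker API that was current at the time (BaseChecker, IRawChecker, __implements__); the real implementations are in tests/benchmark/test_baseline_benchmarks.py.

```python
# Rough sketch of a checker that mimics expensive per-module work;
# see tests/benchmark/test_baseline_benchmarks.py for the real thing.
import time

from pylint.checkers import BaseChecker
from pylint.interfaces import IRawChecker


class SleepingChecker(BaseChecker):
    """Mimics an expensive checker by sleeping for every module processed."""

    __implements__ = (IRawChecker,)
    name = "sleeper"
    msgs = {
        "R9999": ("Benchmark message", "sleeper-message", "Used only by the benchmarks.")
    }

    def process_module(self, node):
        time.sleep(0.01)  # stand-in for expensive per-module work
```

A benchmark would then register the checker on the linter (for example with linter.register_checker(SleepingChecker(linter))) before timing a run, both with -j1 and with multiple workers.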

Type of Changes

  • 🔨 Refactoring (support of)
  • 📜 Docs (new performance metrics)

Related Issue

@coveralls commented Apr 3, 2020

Coverage remained the same at 90.449% when pulling ce02b98 on doublethefish:chore/1954/Add_performance_regressions into be5a61b on PyCQA:master.

@Pierre-Sassoulas (Member) left a comment
This is a much-needed addition to be able to improve performance. I know it's a draft, but I reviewed some things; let me know what you think.

Review threads (resolved): tests/benchmark/test_baseline_benchmarks.py; tests/test_functional.py (outdated); tests/unittest_checker_similar.py (outdated); tests/test_self.py (outdated)
@doublethefish (Contributor, Author) commented

PR #3498 is a slice of this PR.

@doublethefish doublethefish changed the title Chore/1954/add performance regressions benchmark| Potential solution for performance regressions Apr 21, 2020
@doublethefish doublethefish force-pushed the chore/1954/Add_performance_regressions branch 2 times, most recently from 8c57c1c to 2337556 on April 21, 2020 09:29
Here we establish baseline benchmarks for the system when used in a minimal way.

Here we just confirm that -j1 vs -jN gives some boost in performance in simple situations, establishing a baseline for other benchmarks.
@doublethefish doublethefish force-pushed the chore/1954/Add_performance_regressions branch from 2337556 to 5b23a24 on April 21, 2020 10:24
@doublethefish (Contributor, Author) commented

Having used this code in anger, this PR has been updated to include only the most useful change, and I think it should go in.

I have removed some of the noise. The files that have been removed muddied the waters:

  • The changes to tests/test_self.py and tests/test_functional.py were a "let's see what happens" commit, and it is very debatable whether those changes are useful.
    • This is because of how fast most of those tests are, and because we're not aiming for high-performance computing.
    • There are performance wins to be made, especially in test_functional.py, but those should be targeted with specific benchmarks.
  • Whilst the changes to tests/unittest_checker_similar.py are moderately useful, they can wait until we establish that this is the right way to do benchmarking.

@doublethefish doublethefish marked this pull request as ready for review April 21, 2020 11:20