Create benchmark harness using a stubbed HTTP client #9239
Conversation
- Can you add a README to the scripts/performance directory covering:
  - How to run the current benchmarks
  - How to add a new benchmark, including a definition for each of the JSON keys (a JSON schema would be nice too, but okay to skip for now).
- What is the plan for the existing S3 benchmark scripts? If this has feature parity, can we delete them?
scripts/performance/run-benchmarks
Outdated
""" | ||
A generic stubbed HTTP client. | ||
""" | ||
def setup(self): |
Can you describe the choice of the interface for this? It looks like it's similar to a unittest.TestCase. Why use this instead of a standard __init__?
I wanted the client object to be reusable across different iterations / command benchmarks. Using setup and teardown allows us to set up and reset the state of the client between each invocation.
In particular, it prevents the case where the user supplies more stubbed responses than there are actual requests made, leaving the responses list non-empty, so the next execution would reuse stale stubs from the previous one.
The alternative would be reinstantiating a new client for each iteration/execution, which seems redundant.
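To illustrate the idea, here is a minimal sketch of that lifecycle; the class and method names below are hypothetical and not the PR's actual implementation:

```python
from collections import deque


class StubbedHTTPClient:
    """Illustrative stubbed HTTP client with setup/teardown lifecycle hooks."""

    def setup(self):
        # Reset the response queue before each benchmark execution, so stubs
        # left over from a previous run can never leak into this one.
        self._responses = deque()

    def teardown(self):
        # Drop any unused stubs once the execution finishes.
        self._responses.clear()

    def add_response(self, status_code=200, headers=None, body=b''):
        # Queue a canned response for the next send() call.
        self._responses.append((status_code, headers or {}, body))

    def send(self, request):
        # Return the next queued response instead of doing real network I/O.
        return self._responses.popleft()


# One client instance is reused across iterations; setup()/teardown() reset
# its state each time instead of constructing a new client per iteration.
client = StubbedHTTPClient()
for _ in range(3):
    client.setup()
    client.add_response(body=b'{}')
    print(client.send(request=None))
    client.teardown()
```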
scripts/performance/run-benchmarks
Outdated
current_time = time.time()

# Save all the data into a CSV file.
f.write(
nit: Could you use the csv library here? That might help to make sure the data file has descriptive headers. The README should have an accompanying doc/data dictionary to describe the values.
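For reference, a minimal sketch of that suggestion using csv.DictWriter; the field names here are made up for illustration and are not the benchmark's actual columns:

```python
import csv

# Hypothetical per-sample columns; the actual benchmark defines its own.
FIELDNAMES = ['timestamp', 'memory_bytes', 'cpu_percent']

samples = [
    {'timestamp': 1700000000.0, 'memory_bytes': 10485760, 'cpu_percent': 12.5},
    {'timestamp': 1700000000.5, 'memory_bytes': 11534336, 'cpu_percent': 14.0},
]

with open('samples.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
    writer.writeheader()   # descriptive header row to anchor the data dictionary
    writer.writerows(samples)
```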
The original scripts used CSV files for writing / reading, presumably so that long-running s3transfer commands didn't have to hold all of the collected samples in memory.
In the next revision, I ended up removing the file I/O for benchmarking and summarizing entirely, since we need to load all samples into memory anyway to compute percentiles. That should also save some disk-operation overhead.
I have not observed the number of samples exceed 9,000 in my testing.
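A rough sketch of the in-memory summarization approach; the sample values and the nearest-rank percentile helper below are illustrative, not the PR's code:

```python
import math


def percentile(samples, p):
    # Nearest-rank percentile over an in-memory list of samples.
    ordered = sorted(samples)
    k = max(0, math.ceil(p / 100 * len(ordered)) - 1)
    return ordered[k]


memory_samples = [10.1, 12.4, 11.8, 15.2, 9.9]  # illustrative measurements
summary = {
    'mean': sum(memory_samples) / len(memory_samples),
    'p50': percentile(memory_samples, 50),
    'p95': percentile(memory_samples, 95),
    'max': max(memory_samples),
}
print(summary)
```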
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##           benchmarks   #9239   +/-   ##
============================================
  Coverage            ?       0
============================================
  Files               ?       0
  Lines               ?       0
  Branches            ?       0
============================================
  Hits                ?       0
  Misses              ?       0
  Partials            ?       0

☔ View full report in Codecov by Sentry.
Description of changes:
Description of tests:
`./scripts/performance/run-benchmarks --result-dir ~/Desktop/results --num-iterations 1`
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.