Add options for statistical tracking of success #20
Purpose
Vox Media, LLC has been making use of a modified version of the breakers gem, in our primary internet-facing web application, for about four years now. This PR submission is a subset of our modifications, as described in #19.
These modifications improve the efficiency of breakers, and on busy systems they improve it dramatically, at the cost of some accuracy. The efficiency-vs-accuracy tradeoff can be dialed to any ratio desired.
The underlying assumption is that success is the "99% case," and that it's worth optimizing for the 99% case.
Changes
Specifically, when using breakers with this PR's modifications, three new options may be specified in Breakers::Service configuration:
- Each successful Faraday request sends INCR to redis only a fraction p of the time, but it increments the stored value by 1/p, rather than sending a separate increment of 1 with each request. For example, with p == 0.05, the client sends an INCR 20, but it randomly sends it 1 time out of 20 successful requests. This would cut write traffic by a factor of 20.
- The middleware sends a ZRANGE check to redis only once every s seconds, in any given ruby process, rather than a separate check with every Faraday request.
- A minimum number of errors e must be observed before an outage is reported. [...] p.
The names of these configuration options are:
- p is success_sample_per and defaults to 1.0 (no sampling)
- s is seconds_between_outage_checks and defaults to 0 (always check for existing outage)
- e is min_errors and defaults to 1 (always check for creating outage)
Those default values ensure exactly the same behavior for Breakers::Service as in existing versions of the gem, so this PR is 100% backwards compatible with existing installations.
(Vox Media happens to have set those values at 0.1, 10, and 100, respectively, for the past several years.)
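To make the three options concrete, here is a minimal sketch of the behavior described above. It is not the code in this PR: the SampledTracker class and its method names are hypothetical, and plain in-memory counters stand in for the redis INCR and ZRANGE round trips. Only the option names (success_sample_per, seconds_between_outage_checks, min_errors) and their defaults come from the description. The sampling works because each success adds 1/p with probability p, so the expected increment per success is still 1.

```ruby
# Hypothetical illustration only; the gem's real implementation talks to redis.
class SampledTracker
  attr_reader :successes, :writes, :outage_checks

  def initialize(success_sample_per: 1.0, seconds_between_outage_checks: 0, min_errors: 1,
                 rng: Random.new,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @success_sample_per = success_sample_per
    @seconds_between_outage_checks = seconds_between_outage_checks
    @min_errors = min_errors
    @rng = rng
    @clock = clock
    @successes = 0      # stands in for the success counter stored in redis
    @writes = 0         # number of simulated INCR round trips
    @outage_checks = 0  # number of simulated ZRANGE round trips
    @last_check_at = nil
  end

  # Record one successful request. With probability p, write 1/p in a single
  # operation; otherwise write nothing at all.
  def record_success
    return unless @rng.rand < @success_sample_per

    @successes += (1.0 / @success_sample_per).round
    @writes += 1
  end

  # Returns true if the simulated store was consulted, false if the check
  # was skipped because the previous one happened less than s seconds ago.
  def check_for_outage
    now = @clock.call
    return false if @last_check_at && (now - @last_check_at) < @seconds_between_outage_checks

    @last_check_at = now
    @outage_checks += 1
    true
  end

  # An outage may only be reported once at least min_errors errors were seen.
  def outage_reportable?(error_count)
    error_count >= @min_errors
  end
end

tracker = SampledTracker.new(success_sample_per: 0.05)
100_000.times { tracker.record_success }
puts "writes: #{tracker.writes}, estimated successes: #{tracker.successes}"
```

With the defaults (1.0, 0, and 1) every success is written, every check is performed, and a single error is enough, which mirrors the backwards-compatibility claim above. With a smaller success_sample_per, the stored count becomes an estimate whose variance grows as p shrinks.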
Tests
New tests have been added for each of these three features to spec/integration_spec.rb.
I've verified that the tests pass when run on ruby 2.7.x (and bundler 2.x), and on faraday from 0.11.0 to 1.1.0. I haven't attempted to test against other versions, but I'd be surprised if it doesn't work on ruby 2.3+ and the existing faraday version spec. (If you'd be interested, I'd be happy to submit a separate PR to take advantage of GitHub Actions, Docker, and the appraisal gem to run the tests against a matrix of versions.)
Notes
We've also pushed our work to the now-public repository https://github.com/voxmedia/breakers. This PR is a curated subset of that work. If you're interested, our primary web app is currently running at the version in that repo tagged v1.0.1. Questions or requests for changes are welcome! We'd like to thank the Department of Veterans Affairs for open-sourcing this work.
I'm submitting this PR under the terms of the existing LICENSE.md in this repository. The authors of this work are Jamie McCarthy (@jamiemccarthy), Steve McKinney (@stephenmckinney), and Blake Thomson (@thomsbg).