Background
Hello! Two of my (now-former) colleagues at Vox Media, @stephenmckinney and @thomsbg, and I have added some functionality to breakers which we would like to contribute back to this project. Before submitting a PR, I wanted to check with you whether this work would be considered valuable.
Because Vox Media wanted to use breakers on a high-traffic connection between two of our applications, we wanted to avoid excess traffic to Redis and to avoid declaring an outage for minor glitches. Our changes let us optimize for the 99.9% case where both applications are functioning correctly.
We have been running our custom version for about four years now, and it runs so reliably that we are barely even aware of it anymore!
Current vs. desired behavior
The current behavior is:
Each successful Faraday request sends an INCR to Redis, to track the exact number of successes.
The middleware sends a ZRANGE to Redis prior to every request, to check whether there is an outage.
If traffic is low, a single failure can trigger an outage.
When a plugin is notified about an error due to an exception, that exception is not passed along.
The desired behavior (which we've implemented) is to add three configurable parameters p, s, and e:
An INCR is sent for only a fraction p of successful requests, but it increments the stored value by 1/p. So if p is 0.05 (5%), the client increments the counter by 20, but does so at random for only about 1 in 20 successful requests, cutting write traffic by a factor of 20 while keeping the expected count the same.
A ZRANGE check is made at most once every s seconds in any given Ruby process. This is most useful when a client makes multiple Faraday requests in rapid succession.
A minimum number of errors e must be observed before an outage is reported.
Leaving these options at the default values of 1.0, 0, and 1 respectively preserves the existing behavior. (Vox Media happens to be running them at 0.1, 10, and 100.)
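To make the three parameters concrete, here is a minimal sketch of how they could behave. It is not the code we run: every class and method name below is hypothetical, an in-memory object stands in for Redis, and the random source and clock are injectable only so the behavior can be checked deterministically.

```ruby
# Hypothetical sketch; none of these names come from breakers itself.

# In-memory stand-in for the Redis success counter.
class FakeCounterStore
  attr_reader :value, :writes

  def initialize
    @value = 0
    @writes = 0
  end

  def incrby(amount)
    @value += amount
    @writes += 1
  end
end

# p: write only a fraction `sample_rate` of successes, scaled up by 1/p so the
# stored total stays correct on average. A rate of 1.0 writes every success.
class SampledSuccessCounter
  def initialize(store, sample_rate: 1.0, rng: Random.new)
    @store = store
    @sample_rate = sample_rate
    @rng = rng
  end

  def record_success
    return unless @rng.rand < @sample_rate

    @store.incrby((1.0 / @sample_rate).round)
  end
end

# s: run the (expensive) outage lookup at most once every `interval_seconds`
# per process and reuse the cached answer in between. An interval of 0 looks
# up on every call.
class ThrottledOutageCheck
  def initialize(interval_seconds: 0,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 &lookup)
    @interval = interval_seconds
    @clock = clock
    @lookup = lookup
    @checked_at = nil
    @cached = false
  end

  def outage?
    now = @clock.call
    if @checked_at.nil? || now - @checked_at >= @interval
      @cached = @lookup.call
      @checked_at = now
    end
    @cached
  end
end

# e: an extra gate applied before any outage is reported. With the default of
# 1, a single error is enough to pass the gate, as in the existing behavior.
def minimum_errors_met?(error_count, minimum_errors: 1)
  error_count >= minimum_errors
end
```

With sample_rate: 0.05, roughly one success in 20 reaches the store and each write adds 20, so the total tracks the true count in expectation while individual readings become noisier. That trade-off is acceptable here because the count only feeds an error-rate estimate.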
Additionally,
The exception that triggered an error, if any, is passed to plugin.on_error.
This has helped us understand the nature of the connection difficulties between our applications.
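As an illustration of what a plugin could do with the extra information, here is a small hypothetical plugin. The on_error signature shown is an assumption made for this sketch (the exception as an optional trailing argument); the actual argument list would be settled in the PR.

```ruby
# Hypothetical plugin; the on_error signature here is assumed, not taken
# from the breakers API.
class ExceptionLoggingPlugin
  attr_reader :messages

  def initialize
    @messages = []
  end

  # `exception` defaults to nil so callers that report an error response
  # without a raised exception keep working.
  def on_error(service, request_env, response_env, exception = nil)
    detail = if exception
               "#{exception.class}: #{exception.message}"
             else
               "no exception (error response)"
             end
    @messages << "#{service} failed: #{detail}"
  end
end

plugin = ExceptionLoggingPlugin.new
plugin.on_error("content-api", nil, nil, IOError.new("connection reset"))
plugin.on_error("content-api", nil, nil)
puts plugin.messages
```

Making the exception optional keeps the change backward compatible for code that calls on_error without one.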
Usefulness
I believe that these configuration options would be useful to most users of breakers, since they allow the middleware to scale more efficiently as a project's network traffic increases, and they help users debug failures more quickly.
Authorship and rights
I've obtained approval from the legal department at Vox Media to contribute this work into the public domain, per this project's license and the required legal notice. I have also obtained approval from my co-contributors @stephenmckinney and @thomsbg.
Let us know!
Please reply to this issue to let us know whether this is a contribution that would be considered helpful, and any thoughts you may have. In particular, if you would find only 1 or 2 of these features appropriate, we don't have to submit all four, or we can separate them into multiple PRs.
Thank you!
Good morning! It's been about a month since I submitted this issue, and since I haven't heard any feedback, I'd like to go ahead and submit our work. It may be easier to just see the code. I plan to submit one PR for the first three behaviors described above, because they are related, and a second, smaller one for the fourth behavior. Feedback is still welcome, but if I haven't heard back, I'll submit them next week.