Allow AccessibleByteArrayOutputStream to shrink #346
base: main
Conversation
AccessibleByteArrayOutputStream currently can only grow or stay the same size. This could be a problem if it is used for a rare large item, but then used only for small items - the buffer would stay large forever, consuming memory. Since an application may hold many of these buffers via their use particularly in Cipher, the memory impact could be significant.

This simple shrinkage algorithm shrinks the buffer if it has been 'too large' N times in a row. N starts at 1, and grows exponentially up to a limit of 1024 as re-growths are triggered. N starts low to enable reacting quickly, but as more growths are observed it increases to avoid churn. It is capped to avoid degrading into 'never-shrink' behaviour over the long term.

Other ideas were considered, like keeping track of the last N used sizes and making decisions based on those values, but this algorithm has a tiny footprint (2 ints, could be shorts) and low computation overhead.
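The consecutive-oversize heuristic described above can be sketched as follows. This is an illustrative reconstruction, not the PR's actual code: the class name, the "too large" test (needing less than half the capacity), and the reset-on-normal-use behaviour are all assumptions layered on the description.

```java
// Hypothetical sketch of the adaptive shrink heuristic; names and the
// "too large" definition are assumptions, not the PR's implementation.
final class AdaptiveShrinkSketch {
    private static final int MAX_OVERSIZED_THRESHOLD = 1024;

    private int timesOversized = 0;
    private int timesOversizedThreshold = 1; // starts low to react quickly

    // Called after each use with the size actually needed.
    // Returns true if the caller should shrink the buffer.
    boolean recordUse(final int neededSize, final int currentCapacity) {
        if (neededSize > currentCapacity) {
            // Growth: make future shrinking harder, up to a cap.
            timesOversizedThreshold =
                Math.min(MAX_OVERSIZED_THRESHOLD, timesOversizedThreshold * 2);
            timesOversized = 0;
            return false;
        }
        if (neededSize < currentCapacity / 2) { // "too large" heuristic (assumed)
            timesOversized++;
            if (timesOversized >= timesOversizedThreshold) {
                timesOversized = 0;
                return true; // shrink
            }
        } else {
            timesOversized = 0; // a right-sized use breaks the streak
        }
        return false;
    }

    int threshold() {
        return timesOversizedThreshold;
    }
}
```

A workload that grows the buffer once and then issues small requests would trigger a shrink after the threshold is met, while repeated growths push the threshold toward 1024 and make shrinking progressively rarer.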
I like the idea of supporting shrinkage. I think we should iterate more on the algorithm. Was there impetus behind your decision to look into this? If there was profiling data for us to examine that would help immensely in proving out the assumptions around how this data structure gets used in production.
src/com/amazon/corretto/crypto/provider/AccessibleByteArrayOutputStream.java
private int timesOversized;
private int timesOversizedThreshold;
What would be the initial value of the threshold? These need to be initialized somewhere.
Here it would start at 0, which is fine by me, but also any small value would work (1, 2, 4).
Actually 0 is a mistake here, since I'm doubling it to grow it. It must have been set to 1 in my testing and somehow I messed it up when committing. I'll fix it if we decide to use it.
@@ -122,5 +137,8 @@ private void growCapacity(final int newCapacity, final boolean doNotAllocateMore
    final byte[] toZeroize = buf;
    buf = tmp;
    Arrays.fill(toZeroize, 0, count, (byte) 0);

    // Every time we need to grow, make it harder to shrink in the future (up to a limit).
    timesOversizedThreshold = Math.min(MAX_OVERSIZED_THRESHOLD, timesOversized * 2);
I'm not following the logic here on how this threshold adapts to our workload. Seems like it only ever gets harder over time. That can make sense depending on the call pattern, but for something like ACCP we have to be extremely general purpose and assume nothing about those patterns.
It's true that it only gets harder over time. So really the only difference between workloads is how fast it tends to the limit, and how far it gets. My expectation is that some workloads will never reach it, some will reach it slowly and some will reach it so quickly that they might as well have started there. I agree it needs a bit more exploration, I'll add some thoughts to your top-level comment.
There's a bug here too - timesOversized should be timesOversizedThreshold. I had that fixed locally, I must have missed committing my last few changes, sorry.
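For clarity, the corrected growth-time update (doubling the threshold itself rather than the counter) might look like this. This is a hypothetical sketch, not the committed fix; the `Math.max(1, ...)` guard against the 0-initialization mentioned earlier is an assumption about how both bugs would be resolved together.

```java
// Hypothetical sketch of the corrected update; the max(1, ...) guard
// against a 0 starting value is an assumption, not the committed code.
final class ThresholdUpdate {
    static final int MAX_OVERSIZED_THRESHOLD = 1024;

    // Called on every buffer growth: double the shrink threshold, capped,
    // so that repeated growths make future shrinking progressively harder.
    static int onGrow(final int timesOversizedThreshold) {
        return Math.min(MAX_OVERSIZED_THRESHOLD,
                        Math.max(1, timesOversizedThreshold) * 2);
    }
}
```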
Thanks @geedo0 for looking at it.
A service I work with switched to re-using their Cipher instances as ThreadLocals. They have a few thousand threads which all occasionally touch crypto, so they now have a few thousand Cipher instances. They saw memory issues because every one of these held onto a 1MB buffer. I haven't confirmed it, but from our knowledge of the service this 1MB buffer is only needed for the largest customer requests, and the typical payload is much smaller. Then I saw the 'TODO: implement shrinkage' in this file and figured why not.
I can look into collecting the payload size distribution in the service I mentioned. Continuing on from the other comment on the algorithm itself - a simpler algorithm ('shrink1') would be a fixed threshold. So if a buffer is too large 1000 times in a row, we shrink it. That should handle the worst case for the status quo, which is a single huge payload growing the buffer forever. I don't mind using this if we find it easier to understand and/or the benefits of the more complex algorithm are dubious.
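The 'shrink1' fixed-threshold variant is simple enough to sketch in a few lines. The class name and the 1000 value are illustrative, matching the example in the comment rather than any committed code:

```java
// Illustrative sketch of the fixed-threshold 'shrink1' idea: shrink once the
// buffer has been oversized SHRINK_THRESHOLD consecutive uses. The threshold
// value of 1000 is taken from the comment above, not from committed code.
final class FixedThresholdShrinkSketch {
    static final int SHRINK_THRESHOLD = 1000;

    private int consecutiveOversized = 0;

    // Called after each use; returns true when the buffer should shrink.
    boolean recordUse(final boolean wasOversized) {
        if (!wasOversized) {
            consecutiveOversized = 0; // any right-sized use breaks the streak
            return false;
        }
        if (++consecutiveOversized >= SHRINK_THRESHOLD) {
            consecutiveOversized = 0;
            return true;
        }
        return false;
    }
}
```

Compared with the adaptive version, this has no moving parts to reason about: the worst case is one shrink per 1000 uses, at the cost of reacting 1000 uses late to a genuinely shrunk workload.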
Overall, I like the idea. Having said that, since the "shrinking" policy is use-case dependent, it would be nice to introduce a system property with which a user can control this behavior.
Also, the shrinking policy may not have to be as sophisticated as the one proposed in this PR: if a team has metrics around their payload sizes, then shrinking could be as simple as setting the buffer to some fixed size.
Having a way to tune and control this behavior would be great, since some customers might be impacted once we change this approach.
Laying down the necessary plumbing so that we can add different shrinking policies would be awesome.
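A system-property knob could look roughly like this. The property name and default value here are invented for illustration and are not part of ACCP:

```java
// Illustrative only: the property name and the 16384-byte default are
// assumptions for this sketch, not an ACCP configuration surface.
final class ShrinkConfig {
    private static final int DEFAULT_MAX_RETAINED_BYTES = 16384;

    // Reads the (hypothetical) override, falling back to the default when
    // the property is absent or malformed.
    static int maxRetainedCapacity() {
        final String prop =
            System.getProperty("com.example.accp.maxRetainedBufferBytes");
        if (prop == null) {
            return DEFAULT_MAX_RETAINED_BYTES;
        }
        try {
            return Integer.parseInt(prop);
        } catch (final NumberFormatException e) {
            return DEFAULT_MAX_RETAINED_BYTES;
        }
    }
}
```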
I have updated to allow alternative shrink strategies. I have not implemented anything user-selectable yet, though it should be easy to add. While I understand the desire to allow configuration, I think that by far the most important choice is the default - I expect almost no users to ever give this enough thought to want to change it.
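One possible shape for the pluggable-strategy seam is sketched below. The interface name and method signature are hypothetical and not necessarily what the updated PR uses:

```java
// Hypothetical strategy seam; the real PR's interface may differ.
interface ShrinkStrategy {
    // Called after each use. Returns the capacity the buffer should keep;
    // returning currentCapacity means "do not shrink".
    int nextCapacity(int currentCapacity, int usedSize);
}

// The pre-PR status quo expressed as a strategy: never shrink.
final class NeverShrink implements ShrinkStrategy {
    @Override
    public int nextCapacity(final int currentCapacity, final int usedSize) {
        return currentCapacity;
    }
}
```

Expressing the existing behavior as the trivial strategy keeps the default risk-free while leaving room for the adaptive or fixed-threshold policies behind the same interface.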
I like it so far and I'm warm to merging this in as long as we can all agree that it's a net positive with minimal risk of regression. I do have two hesitations though:
1. There's no obviously 'ideal' threshold value, and we don't yet have data to validate the one chosen.
2. There's a risk of thrashing, where a workload repeatedly shrinks and re-grows the buffer, paying allocation and zeroization costs it didn't pay before.
That said, I think these two are small risks, but a lack of imagination has been the root of many nasty bugs.
Thanks @geedo0. With a fixed threshold, naturally there is no 'ideal' value, but it's true that we could use more data. It's worth noting that as N increases, we get closer and closer to the existing 'never shrink' strategy. This is partly why I did not include any possibility to reduce the threshold in the adaptive strategy - it's simple to reason about a threshold which never decreases, and we can be sure of the worst case. A way to decrease the threshold could make the strategy more responsive, but also harder to analyze, and it may have a worse worst case. I suppose, though, that with a way to reduce the threshold we could also allow a higher maximum threshold, since we know it would adapt as necessary. The fixed value is a compromise between responsiveness and worst-case overhead. I could try a two-way adaptive policy and see how scrutable it is.

Risk of thrashing - yes, this is the key concern. The only extra cost users pay compared to before (aside from the hopefully negligible logic overhead) is when we shrink and re-grow. The maximum rate this can happen, regardless of workload, is once every N uses, where N is 1024 right now.
It's times like this I wish we could add instrumentation to ACCP, but that's a heated topic and not on the table for the moment. I'll try to garner more input from the team as I don't want to act unilaterally here. In the meantime, if you want to get this closer to merge ready you'll need to run
Note: this is a draft PR to get feedback on the idea in general. I can add tests etc. if the approach is approved. Let me know if you'd prefer me to create an issue to discuss.
Here are some example progressions; each column shows the state of the algorithm/buffer at that step, and the action, if any, taken after that input (e.g. g(128) means grow the buffer to 128; s means shrink).
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.