Default buffer type to 'heap' for 9.0 #16500
base: main
Conversation
Force-pushed a611f37 to 2741fb1
Force-pushed 54d1315 to 6f4878e
docs/static/config-details.asciidoc
Outdated
@@ -126,7 +126,7 @@ to provide better performance, especially when interacting with the network stac
Under heavy load, namely large number of connections and large messages, the direct memory space can be exhausted and lead to Out of Memory (OOM) errors in off-heap space.

An off-heap OOM is difficult to debug, so {ls} provides a `pipeline.buffer.type` setting in <<logstash-settings-file>> that lets you control where to allocate memory buffers for plugins that use them.
-Currently it is set to `direct` by default, but you can change it to `heap` to use Java heap space instead, which will be become the default in the future.
+Currently it is set to `heap` by default, but you can change it to `direct` to use direct memory space instead.
It is important to note that the Java heap sizing requirements will be impacted by this change since
allocations that previously resided on the direct memory will use heap instead.
Performance-wise there shouldn't be a noticeable impact, since while direct memory IO is faster, Logstash Event objects produced by these plugins end up being allocated on the Java Heap, incurring the cost of copying from direct memory to heap memory regardless of the setting.
These must be changed as well, and the same goes for:
* When you set `pipeline.buffer.type` to `heap`, consider incrementing the Java heap by the
amount of memory that had been reserved for direct space.
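To make the advice above concrete (the sizes below are hypothetical and only for illustration), the heap is raised in Logstash's `config/jvm.options`; for example, a node that previously ran a 4g heap alongside roughly 1g of direct memory used by input buffers might move to a 5g heap:

```
## Hypothetical sizing: previous 4g heap plus ~1g of direct memory
## that the input buffers used to consume off-heap.
-Xms5g
-Xmx5g
```

The same options can also be passed through the `LS_JAVA_OPTS` environment variable if you prefer not to edit the file.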
Please review the rest of the docs in the config-details asciidoc, as it assumes that direct memory is the default, that heap will become the default, and that the user will be changing it to heap (which doesn't make sense for someone starting fresh with 9.x).
docs/static/config-details.asciidoc
Outdated
Performance-wise there shouldn't be a noticeable impact in switching to `direct`, since while direct memory IO is faster,
Logstash Event objects produced by these plugins end up being allocated on the Java Heap,
incurring the cost of copying from direct memory to heap memory regardless of the setting.
Don't we use pooled buffers anyway? I'm under the impression that allocations of buffers may be faster in direct, but that our pool usage is a primary driver for our buffer use being performant. We also know that large buffer allocations can succeed in heap where they would fail in direct, because the GC will rearrange the existing object space to ensure that it can allocate, while the direct allocation will simply not have an appropriately-sized contiguous chunk. I would rather take a small performance cost in this case than a crashing OOM :)
Suggested change:
Performance-wise there shouldn't be a noticeable impact in switching to `direct`.
While allocating direct memory for an individual buffer is faster, these plugins use buffer pools to reduce allocations, and heap-managed allocations are significantly safer.
Logstash Event objects produced by these plugins also end up being allocated on the Java Heap, incurring the cost of copying from buffers to heap memory regardless of the setting.
We always use the Netty PooledAllocator, but instead of using direct memory buffers it uses Java heap byte[]
Hi @yaauie
While allocating direct memory for an individual buffer is faster,
It's not the allocation that's faster, it's the transfer from OS buffers to direct buffers that's faster, which usually matters when data is moved around within the OS, for example from network to filesystem, or network to network. In our case the content of the direct buffers always flows into Java heap space to inflate Logstash's events, so we lose the benefit of direct buffers, or at least it is not as dominant.
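For anyone following along, here is a minimal standalone sketch of the behavior discussed in this thread. It is not Logstash's actual code: it only assumes Netty's `netty-buffer` artifact on the classpath, and the class name is made up. A pooled allocator created with `preferDirect=false` hands out heap-backed buffers, and building an event copies the bytes onto the Java heap regardless of the buffer type.

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;

import java.nio.charset.StandardCharsets;

public class BufferTypeSketch {
    public static void main(String[] args) {
        // preferDirect=false: the pooled allocator hands out heap-backed buffers (byte[])
        PooledByteBufAllocator pooled = new PooledByteBufAllocator(false);

        ByteBuf buf = pooled.buffer(256);
        System.out.println("heap-backed: " + buf.hasArray()); // true for heap buffers

        // Regardless of where the buffer lives, event construction copies bytes onto the heap
        buf.writeBytes("{\"message\":\"hello\"}".getBytes(StandardCharsets.UTF_8));
        byte[] onHeap = new byte[buf.readableBytes()];
        buf.readBytes(onHeap); // heap copy that an Event would ultimately be built from
        buf.release();         // return the buffer to the pool
    }
}
```

With `new PooledByteBufAllocator(true)` the same pool would hand out direct buffers instead; either way the final copy into a heap `byte[]` still happens, which is the point made above.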
…er binding iff the webserver setting is enabled
…escribe the possibility to switch back to direct allocation in case the user want the previous behaviour
Force-pushed 0183fd0 to 44fdfaf
Co-authored-by: João Duarte <[email protected]>
Minor tweaks, otherwise LGTM, but I'd like @karenzone's input here as well.
Co-authored-by: João Duarte <[email protected]>
Quality Gate passed
💚 Build Succeeded
cc @andsel
Left some suggestions for your consideration. Please LMKWYT.
--
[[off-heap-buffers-allocation]]
===== Buffer Allocation types
Input plugins such as {agent}, {beats}, TCP, and HTTP will allocate buffers in Java heap memory to read events from the network.
Suggested change:
Input plugins such as {agent}, {beats}, TCP, and HTTP allocate buffers in Java heap memory to read events from the network.
Use present tense instead of future tense when possible.
This is the preferred allocation method as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.
Suggested change:
Heap memory is the preferred allocation method, as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.
For clarity and better SEO performance
Before version 9.0.0, Logstash defaulted to allocate direct memory for this purpose instead of heap. To re-enable the previous behaviour {ls} provides
Suggested change:
Before version 9.0.0, {ls} defaulted to direct memory instead of heap for this purpose. To re-enable the previous behavior {ls} provides
I rearranged the words with the intention of simplifying the sentence. Please make sure that I didn't change the meaning.
Also, Elastic standard is US spelling.
a `pipeline.buffer.type` setting in <<logstash-settings-file>> that lets you control where to allocate
memory buffers for plugins that use them.

There shouldn't be a noticeable performance impact in switching between `direct` and `heap`.
Suggested change:
Performance should not be noticeably affected if you switch between `direct` and `heap`.
While copying bytes from OS buffers to direct memory buffers is faster, Logstash Event objects produced by these plugins end up
being allocated on the Java Heap incurring the cost of copying from direct memory to heap memory, regardless of the setting.
Suggested change:
While copying bytes from OS buffers to direct memory buffers is faster, {ls} Event objects produced by these plugins are allocated on the Java Heap, incurring the cost of copying from direct memory to heap memory, regardless of the setting.
@@ -331,8 +331,8 @@
# pipeline.separate_logs: false
#
# Determine where to allocate memory buffers, for plugins that leverage them.
-# Default to direct, optionally can be switched to heap to select Java heap space.
-# pipeline.buffer.type: direct
+# Defaults to heap, optionally can be switched to direct to select direct memory space.
Suggested change:
# Defaults to heap, but can be switched to direct if you prefer using direct memory space instead.
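For readers landing on this thread later, a minimal `logstash.yml` sketch (assuming the new 9.0 default of `heap` ships as proposed in this PR) of opting back into the pre-9.0 behavior:

```yaml
# Revert to the pre-9.0 behavior of allocating plugin buffers
# in direct (off-heap) memory rather than the heap default.
pipeline.buffer.type: direct
```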
Release notes

Switch the default value of `pipeline.buffer.type` to use heap memory instead of direct memory.

What does this PR do?

Change the default value of the setting `pipeline.buffer.type` from `direct` to `heap` and update the documentation accordingly.

Why is it important/What is the impact to the user?

It's part of the work to make the memory usage of Netty-based plugins more explicit. Using heap instead of direct memory for Netty buffers makes it easier to debug memory issues, at the cost of a bigger heap size.

Checklist

[ ] I have commented my code, particularly in hard-to-understand areas
[ ] I have added tests that prove my fix is effective or that my feature works

Author's Checklist

How to test this PR locally

Related issues

`pipeline.buffer.type` to `heap` by default #16353

Use cases

As a developer of a Netty-based plugin, I want to have better insight into memory usage patterns.