Default buffer type to 'heap' for 9.0 #16500
@@ -118,30 +118,18 @@ To summarize, we have 3 categories of memory usage, where 2 can be limited by th
Keep these memory requirements in mind as you calculate your ideal memory allocation.
[[reducing-off-heap-usage]]
===== Upcoming changes to Buffer Allocation and Troubleshooting Out of Memory errors

Plugins such as {agent}, {beats}, TCP, and HTTP inputs currently default to using direct memory, as it tends to provide better performance, especially when interacting with the network stack.
Under heavy load, namely a large number of connections and large messages, the direct memory space can be exhausted, leading to Out of Memory (OOM) errors in off-heap space.
An off-heap OOM is difficult to debug, so {ls} provides a `pipeline.buffer.type` setting in <<logstash-settings-file>> that lets you control where to allocate memory buffers for plugins that use them.
Currently it is set to `direct` by default, but you can change it to `heap` to use Java heap space instead, which will become the default in the future.
When set to `heap`, buffer allocations used by plugins are configured to **prefer** the Java heap over direct memory, though direct memory allocations may still be necessary depending on the plugin.
When set to `heap`, in the event of an out-of-memory error, Logstash produces a heap dump to facilitate debugging.
||||||||||||||
It is important to note that the Java heap sizing requirements will be impacted by this change since | ||||||||||||||
allocations that previously resided on the direct memory will use heap instead. | ||||||||||||||
Performance-wise, there shouldn't be a noticeable impact: while direct memory IO is faster, Logstash Event objects produced by these plugins end up being allocated on the Java heap, incurring the cost of copying from direct memory to heap memory regardless of the setting.
[NOTE]
--
* When you set `pipeline.buffer.type` to `heap`, consider increasing the Java heap by the amount of memory that had been reserved for direct space.
--
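As a concrete illustration of the note above, the change is a one-line addition to `logstash.yml`, optionally paired with a larger heap in `jvm.options` (the heap sizes below are illustrative placeholders, not recommendations):

```yaml
# logstash.yml: prefer heap-backed buffers for plugins that use them
pipeline.buffer.type: heap
```

```text
# jvm.options: if direct buffers previously consumed ~1g, consider
# growing the heap accordingly (example values only)
-Xms5g
-Xmx5g
```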
[[off-heap-buffers-allocation]]
===== Buffer Allocation types
Input plugins such as {agent}, {beats}, TCP, and HTTP allocate buffers in Java heap memory to read events from the network.
Reviewer: Use present tense instead of future tense when possible.
This is the preferred allocation method, as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.
Reviewer: For clarity and better SEO performance.
Before version 9.0.0, Logstash defaulted to allocating direct memory for this purpose, instead of heap. To re-enable the previous behaviour, {ls} provides
a `pipeline.buffer.type` setting in <<logstash-settings-file>> that lets you control where to allocate memory buffers for plugins that use them.
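For example, restoring the pre-9.0 behaviour is a single line in `logstash.yml`:

```yaml
# logstash.yml: allocate plugin buffers in direct (off-heap) memory,
# as Logstash did by default before 9.0.0
pipeline.buffer.type: direct
```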
Performance-wise, there shouldn't be a noticeable impact in switching to `direct`: while direct memory IO is faster, Logstash Event objects produced by these plugins end up being allocated on the Java heap, incurring the cost of copying from direct memory to heap memory regardless of the setting.
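The copy cost described above can be illustrated with plain `java.nio` buffers (a simplified sketch, not Logstash's actual code path): even when a network frame lands in a direct buffer, its bytes must still be copied onto the heap before a heap-resident event object can be built.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DirectToHeapCopy {
    public static void main(String[] args) {
        byte[] payload = "hello logstash".getBytes(StandardCharsets.UTF_8);

        // A network frame typically arrives in a direct (off-heap) buffer first.
        ByteBuffer direct = ByteBuffer.allocateDirect(payload.length);
        direct.put(payload);
        direct.flip();

        // Building a heap-resident event requires copying the bytes onto the
        // heap; this copy happens whether the source buffer was direct or
        // heap-backed, which is why the setting has little performance impact.
        byte[] onHeap = new byte[direct.remaining()];
        direct.get(onHeap);

        if (!Arrays.equals(payload, onHeap)) {
            throw new AssertionError("copy mismatch");
        }
        System.out.println(new String(onHeap, StandardCharsets.UTF_8));
    }
}
```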
Reviewer (yaauie): Don't we use pooled buffers anyway? I'm under the impression that buffer allocations may be faster in direct memory, but that our pool usage is the primary driver of our buffer use being performant. We also know that large buffer allocations can succeed in heap where they would fail in direct memory, because the GC will rearrange the existing object space to ensure it can allocate, while the direct allocation will simply not have an appropriately sized contiguous chunk. I would rather take a small performance cost in this case than a crashing OOM :)

Reply: We always use the Netty PooledAllocator, but instead of using direct memory buffers it uses Java heap byte[].

Reply (andsel): Hi @yaauie, it's not the allocation that's faster, it's the transfer from OS buffers to direct buffers. That is usually needed when data is moved around in the OS, e.g. from network to filesystem or network to network. In our case the direct buffers' content always flows into Java heap space to inflate Logstash's events, so we lose the benefit of direct buffers, or at least it is not so dominant.
[[memory-size-calculation]]
===== Memory sizing