Default buffer type to 'heap' for 9.0 #16500

Open
wants to merge 9 commits into
base: main
from
4 changes: 2 additions & 2 deletions config/logstash.yml
@@ -331,8 +331,8 @@
# pipeline.separate_logs: false
#
# Determine where to allocate memory buffers, for plugins that leverage them.
# Default to direct, optionally can be switched to heap to select Java heap space.
# pipeline.buffer.type: direct
# Defaults to heap, optionally can be switched to direct to select direct memory space.
Contributor

Suggested change
# Defaults to heap, optionally can be switched to direct to select direct memory space.
# Defaults to heap, but can be switched to direct if you prefer using direct memory space instead.

# pipeline.buffer.type: heap
#
# ------------ X-Pack Settings (not applicable for OSS build)--------------
#
36 changes: 12 additions & 24 deletions docs/static/config-details.asciidoc
@@ -118,30 +118,18 @@ To summarize, we have 3 categories of memory usage, where 2 can be limited by th

Keep these memory requirements in mind as you calculate your ideal memory allocation.

[[reducing-off-heap-usage]]
===== Upcoming changes to Buffer Allocation and Troubleshooting Out of Memory errors

Plugins such as {agent}, {beats}, TCP, and HTTP inputs, currently default to using direct memory as it tends
to provide better performance, especially when interacting with the network stack.
Under heavy load, namely large number of connections and large messages, the direct memory space can be exhausted and lead to Out of Memory (OOM) errors in off-heap space.

An off-heap OOM is difficult to debug, so {ls} provides a `pipeline.buffer.type` setting in <<logstash-settings-file>> that lets you control where to allocate memory buffers for plugins that use them.
Currently it is set to `direct` by default, but you can change it to `heap` to use Java heap space instead, which will become the default in the future.
When set to `heap`, buffer allocations used by plugins are configured to **prefer** the
Java Heap instead of direct memory, as direct memory allocations may still be necessary depending on the plugin.

When set to "heap", in the event of an out-of-memory, Logstash will produce a heap dump to facilitate debugging.

It is important to note that the Java heap sizing requirements will be impacted by this change since
allocations that previously resided on the direct memory will use heap instead.

Performance-wise there shouldn't be a noticeable impact, since while direct memory IO is faster, Logstash Event objects produced by these plugins end up being allocated on the Java Heap, incurring the cost of copying from direct memory to heap memory regardless of the setting.

[NOTE]
--
* When you set `pipeline.buffer.type` to `heap`, consider incrementing the Java heap by the
amount of memory that had been reserved for direct space.
--
[[off-heap-buffers-allocation]]
===== Buffer Allocation types
Input plugins such as {agent}, {beats}, TCP, and HTTP will allocate buffers in Java heap memory to read events from the network.
Contributor

Suggested change
Input plugins such as {agent}, {beats}, TCP, and HTTP will allocate buffers in Java heap memory to read events from the network.
Input plugins such as {agent}, {beats}, TCP, and HTTP allocate buffers in Java heap memory to read events from the network.

Contributor

Use present tense instead of future tense when possible.

This is the preferred allocation method as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.
Contributor

Suggested change
This is the preferred allocation method as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.
Heap memory is the preferred allocation method, as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.

Contributor

For clarity and better SEO performance


Before version 9.0.0, Logstash defaulted to allocate direct memory for this purpose instead of heap. To re-enable the previous behaviour {ls} provides
Contributor

Suggested change
Before version 9.0.0, Logstash defaulted to allocate direct memory for this purpose instead of heap. To re-enable the previous behaviour {ls} provides
Before version 9.0.0, {ls} defaulted to direct memory instead of heap for this purpose. To re-enable the previous behavior {ls} provides

Contributor

I rearranged the words to simplify the sentence. Please make sure that I didn't change the meaning.
Also, the Elastic standard is US spelling.

a `pipeline.buffer.type` setting in <<logstash-settings-file>> that lets you control where to allocate
memory buffers for plugins that use them.

There shouldn't be a noticeable performance impact in switching between `direct` and `heap`.
Contributor

Suggested change
There shouldn't be a noticeable performance impact in switching between `direct` and `heap`.
Performance should not be noticeably affected if you switch between `direct` and `heap`.

While copying bytes from OS buffers to direct memory buffers is faster, Logstash Event objects produced by these plugins end up
being allocated on the Java Heap incurring the cost of copying from direct memory to heap memory, regardless of the setting.
Comment on lines +131 to +132
Contributor

Suggested change
While copying bytes from OS buffers to direct memory buffers is faster, Logstash Event objects produced by these plugins end up
being allocated on the Java Heap incurring the cost of copying from direct memory to heap memory, regardless of the setting.
While copying bytes from OS buffers to direct memory buffers is faster, {ls} Event objects produced by these plugins are allocated on the Java Heap, incurring the cost of copying from direct memory to heap memory, regardless of the setting.


[[memory-size-calculation]]
===== Memory sizing
4 changes: 2 additions & 2 deletions docs/static/settings-file.asciidoc
@@ -369,6 +369,6 @@ Setting this flag to `warn` is deprecated and will be removed in a future releas

| `pipeline.buffer.type`
| Determine where to allocate memory buffers, for plugins that leverage them.
Currently defaults to `direct` but can be switched to `heap` to select Java heap space, which will become the default in the future.
| `direct` Check out <<reducing-off-heap-usage>> for more info.
Defaults to `heap` but can be switched to `direct` to instruct Logstash to prefer allocation of buffers in direct memory.
| `heap` Check out <<off-heap-buffers-allocation>> for more info.
|=======================================================================
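
As a rough illustration of what the new default means for an existing `logstash.yml`, the sketch below resolves the effective value from YAML text. The helper `effective_buffer_type` and both constants are hypothetical names made up for this example; they do not reflect how Logstash actually loads its settings. The sketch also assumes the flat dotted key form used in `config/logstash.yml`.

```ruby
require "yaml"

# Assumption: default value taken from this PR (9.0 and later).
DEFAULT_BUFFER_TYPE = "heap"
ALLOWED_BUFFER_TYPES = ["direct", "heap"].freeze

# Hypothetical helper: returns the buffer type a settings file would yield.
def effective_buffer_type(yaml_text)
  # A file containing only comments parses to nil (or false on older Psych),
  # so fall back to an empty hash.
  settings = YAML.safe_load(yaml_text) || {}
  value = settings.fetch("pipeline.buffer.type", DEFAULT_BUFFER_TYPE)
  unless ALLOWED_BUFFER_TYPES.include?(value)
    raise ArgumentError, "pipeline.buffer.type must be one of #{ALLOWED_BUFFER_TYPES.inspect}"
  end
  value
end

untouched = "# pipeline.buffer.type: heap\n"   # setting left commented out
opted_out = "pipeline.buffer.type: direct\n"   # explicit switch back to direct memory

puts effective_buffer_type(untouched)
puts effective_buffer_type(opted_out)
```

Under these assumptions, a settings file that never mentions `pipeline.buffer.type` picks up `heap` after upgrading, and only files that set `direct` explicitly keep the pre-9.0 behavior.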
2 changes: 1 addition & 1 deletion logstash-core/lib/logstash/environment.rb
@@ -111,7 +111,7 @@ module Environment
Setting::String.new("keystore.classname", "org.logstash.secret.store.backend.JavaKeyStore"),
Setting::String.new("keystore.file", ::File.join(::File.join(LogStash::Environment::LOGSTASH_HOME, "config"), "logstash.keystore"), false), # will be populated on
Setting::NullableString.new("monitoring.cluster_uuid"),
Setting::String.new("pipeline.buffer.type", "direct", true, ["direct", "heap"])
Setting::String.new("pipeline.buffer.type", "heap", true, ["direct", "heap"])
# post_process
].each {|setting| SETTINGS.register(setting) }
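
To make the effect of this one-word change concrete, here is a minimal sketch built on a hypothetical `AllowedStringSetting` class. It is a stand-in written for this example, not Logstash's `Setting::String`; it only mirrors the argument order visible in the diff (name, default, strict flag, allowed values), and the validation behavior it shows is an assumption.

```ruby
# Hypothetical stand-in for illustration only; NOT Logstash's Setting::String.
class AllowedStringSetting
  attr_reader :name, :default

  def initialize(name, default, strict = true, allowed = [])
    @name = name
    @default = default
    @allowed = allowed
    @value = nil
    # Assumption: in strict mode the default itself must be an allowed value.
    validate(default) if strict
  end

  # Store an explicitly configured value after checking it.
  def set(value)
    validate(value)
    @value = value
  end

  # Explicit value if one was set, otherwise the default.
  def value
    @value.nil? ? @default : @value
  end

  private

  def validate(candidate)
    return if @allowed.include?(candidate)
    raise ArgumentError, "Invalid value \"#{candidate}\" for #{@name}; expected one of #{@allowed.inspect}"
  end
end

buffer_type = AllowedStringSetting.new("pipeline.buffer.type", "heap", true, ["direct", "heap"])
default_value = buffer_type.value    # nothing set explicitly, so the new default applies
buffer_type.set("direct")            # explicit switch back to the pre-9.0 behavior
overridden_value = buffer_type.value
```

In this sketch only the default argument differs between the old and new registration, so users who set the value explicitly would see no change.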
