
Default buffer type to 'heap' for 9.0 #16500

Open · wants to merge 9 commits into base: main

Conversation

@andsel (Contributor) commented on Oct 2, 2024

Release notes

Switch the default value of `pipeline.buffer.type` to heap memory instead of direct memory.

What does this PR do?

Change the default value of the `pipeline.buffer.type` setting from `direct` to `heap`, and update the documentation accordingly.

Why is it important/What is the impact to the user?

It's part of the work to make the memory usage of Netty-based plugins more explicit. Using heap instead of direct memory for Netty buffers makes memory issues easier to debug, at the cost of a larger heap.
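
As a minimal sketch (assuming a standard logstash.yml; the value shown is the new default introduced here), the setting looks like:

    # Allocate plugin buffers on the Java heap (new default); set to `direct` to restore the pre-9.0 behavior
    pipeline.buffer.type: heap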

Checklist

  • My code follows the style guidelines of this project
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files (and/or docker env variables)
  • [ ] I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Use cases

As a developer of a Netty-based plugin, I want better insight into memory usage patterns.

@@ -126,7 +126,7 @@ to provide better performance, especially when interacting with the network stac
Under heavy load, namely large number of connections and large messages, the direct memory space can be exhausted and lead to Out of Memory (OOM) errors in off-heap space.

An off-heap OOM is difficult to debug, so {ls} provides a `pipeline.buffer.type` setting in <<logstash-settings-file>> that lets you control where to allocate memory buffers for plugins that use them.
-Currently it is set to `direct` by default, but you can change it to `heap` to use Java heap space instead, which will be become the default in the future.
+Currently it is set to `heap` by default, but you can change it to `direct` to use direct memory space instead.
Member:

It is important to note that the Java heap sizing requirements will be impacted by this change, since allocations that previously resided in direct memory will use the heap instead.
Performance-wise there shouldn't be a noticeable impact, since while direct memory IO is faster, Logstash Event objects produced by these plugins end up being allocated on the Java Heap, incurring the cost of copying from direct memory to heap memory regardless of the setting.

These must be changed as well, same with:

* When you set `pipeline.buffer.type` to `heap`, consider incrementing the Java heap by the 
amount of memory that had been reserved for direct space. 
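
For illustration only (the file and the values below are hypothetical examples, not taken from this PR or its docs), folding a former direct-memory reservation into the heap in config/jvm.options could look like:

    # before: heap sized at 4g, with 1g reserved for direct buffers
    #   -Xms4g
    #   -Xmx4g
    #   -XX:MaxDirectMemorySize=1g
    # after switching pipeline.buffer.type to heap: add that reservation to the heap
    -Xms5g
    -Xmx5g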

@jsvd (Member) left a comment:

Please review the rest of the docs in the config-details asciidoc, as it assumes direct memory is the default, that heap will become the default, and that the user will be changing it to heap (which doesn't make sense for someone starting fresh with 9.x)

docs/static/config-details.asciidoc (outdated, resolved)
Comment on lines 132 to 134
Performance-wise there shouldn't be a noticeable impact in switching to `direct`, since while direct memory IO is faster,
Logstash Event objects produced by these plugins end up being allocated on the Java Heap,
incurring the cost of copying from direct memory to heap memory regardless of the setting.
Member:

Don't we use pooled buffers anyway? I'm under the impression that buffer allocations may be faster in direct memory, but that our pool usage is the primary driver of our buffer use being performant. We also know that large buffer allocations can succeed in heap where they would fail in direct memory, because the GC will rearrange the existing object space to make room, while the direct allocation will simply not find an appropriately-sized contiguous chunk. I would rather take a small performance cost in this case than a crashing OOM :)

Suggested change
-Performance-wise there shouldn't be a noticeable impact in switching to `direct`, since while direct memory IO is faster,
-Logstash Event objects produced by these plugins end up being allocated on the Java Heap,
-incurring the cost of copying from direct memory to heap memory regardless of the setting.
+Performance-wise there shouldn't be a noticeable impact in switching to `direct`.
+While allocating direct memory for an individual buffer is faster, these plugins use buffer pools to reduce allocations, and heap-managed allocations are significantly safer.
+Logstash Event objects produced by these plugins also end up being allocated on the Java Heap, incurring the cost of copying from buffers to heap memory regardless of the setting.

@andsel (Contributor, Author):

We always use the Netty PooledAllocator, but instead of direct memory buffers it uses Java heap byte[] arrays.
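
For context, a minimal Netty sketch (illustrative only, not the actual Logstash or plugin code) of what the allocator's `preferDirect` flag changes:

    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.PooledByteBufAllocator;

    public class AllocatorSketch {
        public static void main(String[] args) {
            // preferDirect = false: pooled buffers are backed by Java heap byte[] arrays
            PooledByteBufAllocator heapPooled = new PooledByteBufAllocator(false);
            // preferDirect = true: pooled buffers are backed by off-heap (direct) memory
            PooledByteBufAllocator directPooled = new PooledByteBufAllocator(true);

            ByteBuf onHeap = heapPooled.buffer(1024);
            ByteBuf offHeap = directPooled.buffer(1024);
            System.out.println("heap-backed array? " + onHeap.hasArray());  // true
            System.out.println("direct buffer?     " + offHeap.isDirect()); // true
            onHeap.release();
            offHeap.release();
        }
    }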

@andsel (Contributor, Author):

Hi @yaauie

> While allocating direct memory for an individual buffer is faster,

It's not the allocation that's faster; it's the transfer from OS buffers to direct buffers. That transfer is what usually matters when data is moved around within the OS, for example from network to filesystem or from network to network. In our case the content of the direct buffers always flows into Java heap space to inflate Logstash events, so we lose the benefit of direct buffers, or at least it is not as dominant.
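
To make that copy concrete, here is a hypothetical decoding sketch (not Logstash code): whether the incoming ByteBuf is direct or heap-backed, its readable bytes are copied onto the Java heap before an event can be built from them.

    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.ByteBufUtil;
    import java.nio.charset.StandardCharsets;

    class EventDecodingSketch {
        // Copies the buffer's readable bytes into a new heap byte[] and decodes them;
        // this copy happens regardless of pipeline.buffer.type.
        static String decodePayload(ByteBuf in) {
            byte[] onHeapCopy = ByteBufUtil.getBytes(in);
            return new String(onHeapCopy, StandardCharsets.UTF_8);
        }
    }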

@andsel force-pushed the feature/default_buffer_type_to_heap branch from 0183fd0 to 44fdfaf on October 14, 2024
@andsel requested review from jsvd and yaauie on October 15, 2024
@jsvd changed the title from "Default buffer type to 'heap' fo 9.0" to "Default buffer type to 'heap' for 9.0" on October 16, 2024
config/logstash.yml (outdated, resolved)
docs/static/config-details.asciidoc (4 review threads: outdated, resolved)
docs/static/settings-file.asciidoc (outdated, resolved)
@andsel requested a review from jsvd on October 17, 2024
@jsvd (Member) left a comment:

Minor tweaks, otherwise LGTM, but I'd like @karenzone's input here as well.

docs/static/config-details.asciidoc (2 review threads: outdated, resolved)
Quality Gate passed

Issues: 0 new issues, 0 fixed issues, 0 accepted issues
Measures: 0 security hotspots; no data about Coverage or Duplication

See analysis details on SonarQube

@elasticmachine (Collaborator) commented:

💚 Build Succeeded


cc @andsel

@karenzone (Contributor) left a comment:

Left some suggestions for your consideration. Please LMKWYT.

--
[[off-heap-buffers-allocation]]
===== Buffer Allocation types
Input plugins such as {agent}, {beats}, TCP, and HTTP will allocate buffers in Java heap memory to read events from the network.
Contributor:

Suggested change
-Input plugins such as {agent}, {beats}, TCP, and HTTP will allocate buffers in Java heap memory to read events from the network.
+Input plugins such as {agent}, {beats}, TCP, and HTTP allocate buffers in Java heap memory to read events from the network.

Contributor:

Use present tense instead of future tense when possible.

[[off-heap-buffers-allocation]]
===== Buffer Allocation types
Input plugins such as {agent}, {beats}, TCP, and HTTP will allocate buffers in Java heap memory to read events from the network.
This is the preferred allocation method as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.
Contributor:

Suggested change
-This is the preferred allocation method as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.
+Heap memory is the preferred allocation method, as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.

Contributor:

For clarity and better SEO performance

Input plugins such as {agent}, {beats}, TCP, and HTTP will allocate buffers in Java heap memory to read events from the network.
This is the preferred allocation method as it facilitates debugging memory usage problems (such as leaks and Out of Memory errors) through the analysis of heap dumps.

Before version 9.0.0, Logstash defaulted to allocate direct memory for this purpose instead of heap. To re-enable the previous behaviour {ls} provides
Contributor:

Suggested change
-Before version 9.0.0, Logstash defaulted to allocate direct memory for this purpose instead of heap. To re-enable the previous behaviour {ls} provides
+Before version 9.0.0, {ls} defaulted to direct memory instead of heap for this purpose. To re-enable the previous behavior {ls} provides

Contributor:

I rearranged the words with the intention of simplifying the sentence. Please make sure that I didn't change the meaning.
Also, Elastic standard is US spelling.

a `pipeline.buffer.type` setting in <<logstash-settings-file>> that lets you control where to allocate
memory buffers for plugins that use them.

There shouldn't be a noticeable performance impact in switching between `direct` and `heap`.
Contributor:

Suggested change
-There shouldn't be a noticeable performance impact in switching between `direct` and `heap`.
+Performance should not be noticeably affected if you switch between `direct` and `heap`.

Comment on lines +131 to +132
While copying bytes from OS buffers to direct memory buffers is faster, Logstash Event objects produced by these plugins end up
being allocated on the Java Heap incurring the cost of copying from direct memory to heap memory, regardless of the setting.
Contributor:

Suggested change
-While copying bytes from OS buffers to direct memory buffers is faster, Logstash Event objects produced by these plugins end up
-being allocated on the Java Heap incurring the cost of copying from direct memory to heap memory, regardless of the setting.
+While copying bytes from OS buffers to direct memory buffers is faster, {ls} Event objects produced by these plugins are allocated on the Java Heap, incurring the cost of copying from direct memory to heap memory, regardless of the setting.

@@ -331,8 +331,8 @@
# pipeline.separate_logs: false
#
# Determine where to allocate memory buffers, for plugins that leverage them.
-# Default to direct, optionally can be switched to heap to select Java heap space.
-# pipeline.buffer.type: direct
+# Defaults to heap, optionally can be switched to direct to select direct memory space.
Contributor:

Suggested change
-# Defaults to heap, optionally can be switched to direct to select direct memory space.
+# Defaults to heap, but can be switched to direct if you prefer using direct memory space instead.

Projects
None yet
Development

Successfully merging this pull request may close these issues.

Set pipeline.buffer.type to heap by default
5 participants