Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(compression): Make the compression type configurable, default: gzip #3705

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

rnishtala-sumo
Copy link
Contributor

@rnishtala-sumo rnishtala-sumo commented May 14, 2024

  • Make the compression type configurable, default: zstd
  • Regenerated golden files

Checklist

  • Changelog updated or skip changelog label added
  • Template tests added for new features

@rnishtala-sumo rnishtala-sumo requested a review from a team as a code owner May 14, 2024 19:43
@rnishtala-sumo rnishtala-sumo force-pushed the configurable-compression branch 2 times, most recently from 967456e to 5267eca Compare May 14, 2024 20:09
@sumo-drosiek
Copy link
Contributor

Does it need to be configurable? Why customers would need to change it? What are possible values?

Copy link

@swiatekm swiatekm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making it configurable makes sense in case there's a problem with it, and we need to switch back. But in that case, it makes more sense to keep the default as gzip, and first dogfood zstd properly.

@@ -432,6 +432,7 @@ The following table lists the configurable parameters of the Sumo Logic chart an
| `otellogswindows.metrics.enabled` | Enable OpenTelemetry Logs Collector for Windows Nodes metrics | `true` |
| `otellogswindows.serviceLabels` | Add custom labels to OpenTelemetry Logs Collector for Windows Nodes Service | `{}` |
| `otellogswindows.additionalDaemonSets` | OpenTelemetry Logs Collector for Windows Nodes Daemonset per node customization options. See [Best Practices](https://help.sumologic.com/docs/send-data/kubernetes/best-practices/#setting-different-resources-on-different-nodes-for-logs-collector). | `{}` |
| `metadata.compression` | Compression for logs, metrics and events | `zstd` |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about putting this at the top-level, just compression? Right now, it affects the log collector and the event collector, so not just metadata. But it doesn't affect the metrics collector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's a good point, will make this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved compression to the top level. Also added the config option to the otlphttp exporter of the metrics collector.

@rnishtala-sumo
Copy link
Contributor Author

Does it need to be configurable? Why customers would need to change it? What are possible values?

We want to be able to switch between gzip and zstd.

@sumo-drosiek
Copy link
Contributor

sumo-drosiek commented May 15, 2024

We want to be able to switch between gzip and zstd.

yeah, but does customer need to know what compression do we use?

We can use merge configuration for development/tests and not sure if we need to expose compression to users

@rnishtala-sumo
Copy link
Contributor Author

We can use merge configuration for development/tests and not sure if we need to expose compression to users

I can see some reasons we may want to expose compression as a config option:

  • Since we do expose compression as a config option in the sumologic exporter, if a user wants to set that, this option in the helm chart would allow it.
  • We may want to change it across our dogfooding environments for multiple signals

Looks like reducing egress/ingress costs is driving the push to zstd, the customer may want the same to save on cost.

Is there a negative in exposing compression to users? We could use config merge but if we're going to do this it may be error prone and will require multiple changes across logs/metrics and events

@rnishtala-sumo rnishtala-sumo force-pushed the configurable-compression branch 2 times, most recently from d2fd755 to 049c4a5 Compare May 15, 2024 19:52
@rnishtala-sumo rnishtala-sumo changed the title feat(compression): Make the compression type configurable, default: zstd feat(compression): Make the compression type configurable, default: gzip May 15, 2024
@@ -6,6 +6,9 @@ fullnameOverride: ""
## Use the same namespace as namespaceOverride in 'kube-prometheus-stack.namespaceOverride' if Prometheus setup is also enabled
namespaceOverride: ""

## Compression for logs, metrics and events:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you enumerate what the valid values for this field are, both here and in the README?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added these options - Compression types can be gzip, snappy, zstd or deflate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it could also take "" or none values, but didn't want to call these out, since we always want this enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is one thing I'm concerned with exposing compression configuration, it allows the customer to disable it.

@sumo-drosiek
Copy link
Contributor

sumo-drosiek commented May 16, 2024

Since we do expose compression as a config option in the sumologic exporter, if a user wants to set that, this option in the helm chart would allow it.
Is there a negative in exposing compression to users?

We don't want to expose everything. This is why we stop exposing full configuration. It is easy to expose something but it's hard to revert that change later. I would be rather restrictive in exposing options which are not features or aren't related to resolve issues. Compression is strictly technical topic, and I would be very careful with exposing such options. We can always expose it in the future if there will be business need for that

We may want to change it across our dogfooding environments for multiple signals

It's development and as I understand, we want to change it to zstd, so we can test the stability of it and then use as defaults in the feature?

Looks like reducing egress/ingress costs is driving the push to zstd, the customer may want the same to save on cost.

Yes, so if zstd is stable, we may just change it in helm chart and release it. Customers don't need to know that we had gzip and now we have zstd.


Technically PR looks ok, but I'm not convinced about the change itself 🙈

@rnishtala-sumo
Copy link
Contributor Author

Compression is strictly technical topic, and I would be very careful with exposing such options. We can always expose it in the future if there will be business need for that

My understanding is that this is driven by business need. Also, other projects like nginx do make compression configurable

Yes, so if zstd is stable, we may just change it in helm chart and release it. Customers don't need to know that we had gzip and now we have zstd

I think all compression options currently have the same stability.

@rnishtala-sumo rnishtala-sumo force-pushed the configurable-compression branch 2 times, most recently from 013695f to eb038e1 Compare May 16, 2024 16:14
@echlebek
Copy link
Contributor

I do think Dominik has a good point here, and if we know that zstd is the best overall, then maybe it shouldn't be configurable.

Also, I think something to consider is that a potentially more impactful configuration feature in terms of resource consumption could be the compression level. That's what is going to influence CPU use more than algorithm choice for encoding, if I'm not mistaken.

@rnishtala-sumo
Copy link
Contributor Author

ok, I'll dogfood zstd using config merge on k8s collection as the next step. If there aren't issues, will consider switching over to zstd.

@rnishtala-sumo
Copy link
Contributor Author

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request contains invalid labels. Please remove all of the following labels: ['do-not-merge/hold']

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants