[filebeat] Elasticsearch state storage for httpjson and cel inputs #41446
@belimawr @cmacknz (or whoever wants/has time to be involved)
@leehinman I'd appreciate a review here to make sure this can co-exist with Beats receivers in agent, since that would be the long-term way we plan to run agentless inputs.
}

func (s *store) get(key string, to interface{}) error {
	status, data, err := s.cli.Request("GET", fmt.Sprintf("/%s/%s/%s", s.index, docType, url.QueryEscape(key)), "", nil, nil)
We probably want to instrument this function with its own metrics, so we can observe the frequency of 429s, or just instrument it with tracing since that'll always be available where we plan to use it. There needs to be a way to tell if this is working properly.
This can be done as a follow up to this PR however.
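As a rough illustration of the kind of instrumentation suggested here, the sketch below wraps a request function and counts responses per status code, so the frequency of 429s can be read back. The `requester` type, `statusCounter`, and the fake backend are hypothetical stand-ins for this example; they are not the client used by `s.cli.Request`, and a real follow-up would more likely report through the Beats monitoring or tracing facilities.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// requester is a hypothetical stand-in for the client call behind
// s.cli.Request; only the status code matters for this sketch.
type requester func(method, path string) (status int, body []byte, err error)

// statusCounter counts responses per HTTP status code.
type statusCounter struct {
	mu     sync.Mutex
	counts map[int]int
}

func newStatusCounter() *statusCounter {
	return &statusCounter{counts: make(map[int]int)}
}

// wrap returns a requester that records every status code returned by next.
func (c *statusCounter) wrap(next requester) requester {
	return func(method, path string) (int, []byte, error) {
		status, body, err := next(method, path)
		c.mu.Lock()
		c.counts[status]++
		c.mu.Unlock()
		return status, body, err
	}
}

// count reports how many responses with the given status were seen.
func (c *statusCounter) count(status int) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[status]
}

func main() {
	// Fake backend that throttles every second call.
	calls := 0
	fake := func(method, path string) (int, []byte, error) {
		calls++
		if calls%2 == 0 {
			return http.StatusTooManyRequests, nil, nil
		}
		return http.StatusOK, []byte(`{}`), nil
	}

	c := newStatusCounter()
	do := c.wrap(fake)
	for i := 0; i < 4; i++ {
		do("GET", "/some-index/_doc/some-key")
	}
	fmt.Println("429 responses:", c.count(http.StatusTooManyRequests))
}
```

Keeping the counter outside the store keeps the `get` path unchanged, which fits the idea of adding it in a follow-up.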
LGTM, the new test looks good.
Biggest outstanding thing I see is some ability to tell how well this is working and if it is overloading ES. We can address this as a follow up.
@@ -81,7 +82,7 @@ type Filebeat struct {
 type PluginFactory func(beat.Info, *logp.Logger, StateStore) []v2.Plugin

 type StateStore interface {
-	Access() (*statestore.Store, error)
+	Access(typ string) (*statestore.Store, error)
Godoc for this parameter?
Suggest something like
// Access returns the storage registry depending on the type.
// The value of typ is expected to have been obtained from
// [cursor.InputManager.Type] and represents the input type.
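To make the role of `typ` concrete, here is a minimal sketch of how an `Access(typ)` implementation could pick a backend per input type. Everything in it is an assumption made for illustration: the `router` and `backend` types are hypothetical, and the comma-separated format of `AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES` is a guess, not something confirmed in this PR.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// backend names which storage a given input type would use (hypothetical).
type backend string

const (
	fileBackend backend = "file"
	esBackend   backend = "elasticsearch"
)

// parseInputTypes splits a list such as "httpjson,cel" into a set,
// trimming whitespace and skipping empty entries.
// Assumption: the env var holds a comma-separated list of input types.
func parseInputTypes(v string) map[string]bool {
	set := make(map[string]bool)
	for _, t := range strings.Split(v, ",") {
		if t = strings.TrimSpace(t); t != "" {
			set[t] = true
		}
	}
	return set
}

// router is a hypothetical stand-in for a StateStore implementation.
type router struct {
	esTypes map[string]bool
}

// Access returns the backend for the given input type. Types that are not
// listed, including the empty string, fall back to file storage.
func (r router) Access(typ string) backend {
	if r.esTypes[typ] {
		return esBackend
	}
	return fileBackend
}

func main() {
	r := router{esTypes: parseInputTypes(os.Getenv("AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES"))}
	for _, typ := range []string{"httpjson", "cel", "awss3", ""} {
		fmt.Printf("%q -> %s\n", typ, r.Access(typ))
	}
}
```

In this sketch an empty `typ` simply selects the default file backend, which is one way to read calls such as `stateStore.Access("")`.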
LGTM is limited to simple changes at `x-pack/filebeat/input/awss3/`.
Nit: calling a method with an empty string (`stateStore.Access("")`) feels slightly weird.
Approving from the obs-infraobs side, considering the change made to the salesforce input.
Reviewing (minimal) changes in x-pack/filebeat on behalf of obs-infraobs-integrations.
Looks fine to me.
run docs-build
/test
…41446)

This enables Elasticsearch as State Store Backend for Security Integrations for the Agentless solution.

The scope of this change was narrowed down to supporting only `httpjson` inputs in order to support Okta integration for the initial release. All the other integrations inputs still use the file storage as before.

This is a short term solution for the state storage for k8s.

The feature currently can only be enabled with the `AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES` env var.

The existing code relied on the inputs state storage to be fully configurable before the main beat managers runs. The change delays the configuration of `httpjson` input to the time when the actual configuration is received from the Agent.

Example of the state storage index content for Okta integration:

```
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "agentless-state-httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959",
        "_id": "httpjson::httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959::https://dev-36006609.okta.com/api/v1/logs",
        "_seq_no": 39,
        "_primary_term": 1,
        "_score": 1,
        "_source": {
          "v": {
            "ttl": 1800000000000,
            "updated": "2024-10-24T20:21:22.032Z",
            "cursor": {
              "published": "2024-10-24T20:19:53.542Z"
            }
          }
        }
      }
    ]
  }
}
```

The naming convention for all state store is `agentless-state-<input id>`, since the expectation for agentless we would have only one agent per policy and the agents are ephemeral.

Closes https://github.com/elastic/security-team/issues/11101

Co-authored-by: Orestis Floros <[email protected]>
(cherry picked from commit 8180f23)

# Conflicts:
#	filebeat/beater/filebeat.go
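The `_source` of the example document can be decoded with a few lines of Go. The sketch below is hedged: the `stateDoc` type and the `expired` helper are hypothetical, and it assumes that `ttl` is a Go `time.Duration` serialized as nanoseconds (which would make `1800000000000` equal to 30 minutes) and that an entry counts as expired once `updated + ttl` has passed. Neither assumption is confirmed by this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// stateDoc mirrors the "_source" object from the example above.
// Assumption: "ttl" is a time.Duration encoded as nanoseconds.
type stateDoc struct {
	V struct {
		TTL     time.Duration   `json:"ttl"`
		Updated time.Time       `json:"updated"`
		Cursor  json.RawMessage `json:"cursor"` // input-specific, kept opaque here
	} `json:"v"`
}

// parseState decodes the raw "_source" JSON of a state document.
func parseState(raw []byte) (stateDoc, error) {
	var d stateDoc
	err := json.Unmarshal(raw, &d)
	return d, err
}

// expired reports whether updated+ttl lies before now.
// Assumption: this is how the TTL is meant to be interpreted.
func (d stateDoc) expired(now time.Time) bool {
	return now.After(d.V.Updated.Add(d.V.TTL))
}

func main() {
	raw := []byte(`{"v":{"ttl":1800000000000,"updated":"2024-10-24T20:21:22.032Z","cursor":{"published":"2024-10-24T20:19:53.542Z"}}}`)
	d, err := parseState(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("ttl:", d.V.TTL)
	fmt.Println("cursor:", string(d.V.Cursor))
	fmt.Println("expired now:", d.expired(time.Now()))
}
```

The cursor is left as raw JSON because its shape depends on the input; for the Okta example it carries only a `published` timestamp.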
…pjson and cel inputs (#42451): same commit message as above.
(cherry picked from commit 8180f23)
Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Orestis Floros <[email protected]>
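The naming convention from the commit message and the request path built in the `get` function quoted in the review can be combined into a small sketch. The helper names are hypothetical, the `_doc` value for `docType` is an assumption (the diff does not show its definition), and the `::`-joined key layout is inferred only from the example `_id`.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// docType is assumed to be "_doc"; the reviewed diff does not show its value.
const docType = "_doc"

// indexName applies the agentless-state-<input id> naming convention.
func indexName(inputID string) string {
	return "agentless-state-" + inputID
}

// docKey joins the parts the way the example _id appears to be built.
// Assumption: the layout is <input type>::<input id>::<source>.
func docKey(inputType, inputID, source string) string {
	return strings.Join([]string{inputType, inputID, source}, "::")
}

// docPath builds the request path using the same format string as the
// get function in the diff, escaping the key so ':' and '/' are safe.
func docPath(index, key string) string {
	return fmt.Sprintf("/%s/%s/%s", index, docType, url.QueryEscape(key))
}

func main() {
	id := "httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959"
	key := docKey("httpjson", id, "https://dev-36006609.okta.com/api/v1/logs")
	fmt.Println(indexName(id))
	fmt.Println(key)
	fmt.Println(docPath(indexName(id), key))
}
```

Escaping matters here because the key embeds a full URL; without it the slashes in the key would be read as extra path segments.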
Proposed commit message
[filebeat] Elasticsearch state storage for httpjson input
This is a POC for Elasticsearch as a State Store Backend for Security Integrations for the Agentless solution.
The scope of this change was narrowed down to supporting only `httpjson` inputs in order to support the Okta integration for the initial release. All other integration inputs still use the file storage as before.
This is a short-term solution for state storage in the k8s environment.
This is the first cut and the details can change depending on the feedback.
The feature can currently be enabled with the `AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED` env var; how this will be configured in k8s is still to be decided.
This change currently contains a hacky approach to overwriting the API key via `AGENTLESS_ELASTICSEARCH_APIKEY`. This allows the user to provide an ApiKey with the elevated permissions required to create, write, and read the state index per input. THIS IS FOR DEVELOPMENT/TESTING ONLY. REMOVE BEFORE THE MERGE.
The existing code relied on the inputs' state storage being fully configurable before the main beat manager runs. The change delays the configuration of the `httpjson` input until the actual configuration is received from the Agent.
There is an assumption that the index template for the state storage indices is already in place before the storage is used.
Example of the state storage index content for Okta integration:
The naming convention for all state stores is `agentless-state-<input id>`, since the expectation for agentless is that there is only one agent per policy and the agents are ephemeral.
Currently, in order to run the agent with Elasticsearch state storage, a couple of environment variables would be required:
where the ApiKey in the
DEPENDENCIES / TODOS:
Checklist
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.
Disruptive User Impact
The change should have no impact; without the feature enabled, Filebeat should work as before, using file system storage for the state.