[filebeat] Elasticsearch state storage for httpjson and cel inputs #41446

Merged
merged 77 commits into from
Jan 28, 2025

Conversation

aleksmaus
Contributor

@aleksmaus aleksmaus commented Oct 24, 2024

Proposed commit message

[filebeat] Elasticsearch state storage for httpjson input

This is a POC of Elasticsearch as the state store backend for Security integrations in the Agentless solution.

The scope of this change was narrowed down to supporting only httpjson inputs in order to support the Okta integration for the initial release. All other integration inputs still use the file storage as before.
This is a short-term solution for state storage in k8s environments.

This is the first cut, and the details may change depending on feedback.

The feature can currently be enabled with the AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED environment variable; how this will be configured in k8s is still to be decided.

This change currently contains a hacky approach: the AGENTLESS_ELASTICSEARCH_APIKEY override. It allows the user to provide an ApiKey with the elevated permissions required to create, write, and read the state index per input. THIS IS FOR DEVELOPMENT/TESTING ONLY. REMOVE BEFORE THE MERGE.

The existing code relied on the inputs' state storage being fully configurable before the main beat manager runs. The change delays the configuration of the httpjson input until the actual configuration is received from the Agent.
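
As a rough illustration of the delayed configuration described above (not the code in this PR), a store can be created early and reject reads until the configuration arrives. The names `lazyStore`, `Configure`, `Get`, and `errNotConfigured` are hypothetical, and a plain map stands in for the real backend.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errNotConfigured is returned while the store has no backend yet.
var errNotConfigured = errors.New("state store is not configured yet")

// lazyStore is a hypothetical store whose backend is only known once the
// configuration has been received (for example, from the Agent).
type lazyStore struct {
	mu      sync.RWMutex
	backend map[string]string // stands in for a real storage backend
}

// Configure installs the backend once the configuration is available.
func (s *lazyStore) Configure(backend map[string]string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.backend = backend
}

// Get fails with errNotConfigured until Configure has been called.
func (s *lazyStore) Get(key string) (string, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if s.backend == nil {
		return "", errNotConfigured
	}
	v, ok := s.backend[key]
	if !ok {
		return "", fmt.Errorf("key %q not found", key)
	}
	return v, nil
}

func main() {
	s := &lazyStore{}
	if _, err := s.Get("cursor"); err != nil {
		fmt.Println("before configuration:", err)
	}
	s.Configure(map[string]string{"cursor": "2024-10-24T20:19:53.542Z"})
	v, _ := s.Get("cursor")
	fmt.Println("after configuration:", v)
}
```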

There is an assumption that the index template for the state storage indices is already in place before the storage is used:

PUT _index_template/agentless_state_template
{
  "index_patterns": [
    "agentless-state-*"
  ],
  "priority": 300,
  "template": {
    "mappings": {
      "properties": {
        "v": {
          "type": "object",
          "enabled": false
        },
        "updated_at": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        }
      }
    },
    "settings": {
      "number_of_shards": 1
    }
  }
}

Example of the state storage index content for Okta integration:

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "agentless-state-httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959",
        "_id": "httpjson::httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959::https://dev-36006609.okta.com/api/v1/logs",
        "_seq_no": 39,
        "_primary_term": 1,
        "_score": 1,
        "_source": {
          "v": {
            "ttl": 1800000000000,
            "updated": "2024-10-24T20:21:22.032Z",
            "cursor": {
              "published": "2024-10-24T20:19:53.542Z"
            }
          }
        }
      }
    ]
  }
}
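
A minimal sketch of decoding the `_source` of such a hit in Go. The field names are taken from the example above; the Go types are assumptions, in particular that `ttl` is a duration in nanoseconds (which would make the value shown 30 minutes). `stateDoc` and `decodeState` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// stateDoc mirrors the "_source" object of the example hit.
type stateDoc struct {
	V struct {
		// TTL is assumed to be expressed in nanoseconds.
		TTL time.Duration `json:"ttl"`
		// Updated is the time the state was last written.
		Updated time.Time `json:"updated"`
		// Cursor is input-specific, so it is kept as a generic map.
		Cursor map[string]interface{} `json:"cursor"`
	} `json:"v"`
}

// decodeState parses the raw "_source" JSON of a state document.
func decodeState(raw []byte) (stateDoc, error) {
	var d stateDoc
	err := json.Unmarshal(raw, &d)
	return d, err
}

func main() {
	raw := []byte(`{"v":{"ttl":1800000000000,"updated":"2024-10-24T20:21:22.032Z","cursor":{"published":"2024-10-24T20:19:53.542Z"}}}`)
	d, err := decodeState(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(d.V.TTL) // 30m0s
	fmt.Println(d.V.Updated.UTC())
	fmt.Println(d.V.Cursor["published"])
}
```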

The naming convention for all state store indices is agentless-state-<input id>, since for agentless we expect to have only one agent per policy and the agents are ephemeral.
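
The convention can be sketched in Go as follows. The helper names are hypothetical. The `::`-joined document ID is inferred from the `_id` in the example hit rather than from the implementation, and the `_doc` path segment is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// indexPrefix is the prefix named in the convention above.
const indexPrefix = "agentless-state-"

// stateIndexName builds "agentless-state-<input id>".
func stateIndexName(inputID string) string {
	return indexPrefix + inputID
}

// docID joins input type, input ID and resource with "::", which matches
// the shape of "_id" in the example hit (an inference, not a confirmed rule).
func docID(inputType, inputID, resource string) string {
	return strings.Join([]string{inputType, inputID, resource}, "::")
}

// docPath escapes the ID so that characters such as ':' and '/' do not
// break the request path. The "_doc" segment is an assumption.
func docPath(index, id string) string {
	return fmt.Sprintf("/%s/_doc/%s", index, url.QueryEscape(id))
}

func main() {
	inputID := "httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959"
	index := stateIndexName(inputID)
	id := docID("httpjson", inputID, "https://dev-36006609.okta.com/api/v1/logs")
	fmt.Println(index)
	fmt.Println(id)
	fmt.Println(docPath(index, id))
}
```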

Currently, running the agent with Elasticsearch state storage requires a couple of environment variables:

sudo AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED=1 AGENTLESS_ELASTICSEARCH_APIKEY=xxxxxxxx-xvpDXfB:jVMRsW7SRIxxxxxxxxx ./elastic-agent -e

where the ApiKey in the

DEPENDENCIES / TODOS:

  • Approval of teams for this approach
  • A Kibana (?) side change is required for bootstrapping the agentless-state index template
  • A change in Kibana or the integration package (or both) is required in order to include the permissions for agentless-state- in the Elasticsearch ApiKey (removing the hack). I suspect that the Kibana Fleet code could be modified to recognize integrations that support agentless and include the proper agentless-state index name in the ApiKey permissions.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

The change should have no impact: without the feature enabled, Filebeat should work as before, using file system storage for the state.

@aleksmaus aleksmaus self-assigned this Oct 24, 2024
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Oct 24, 2024
@aleksmaus aleksmaus added the Team:Security-Deployment and Devices Deployment and Devices Team in Security Solution label Oct 24, 2024
Contributor

mergify bot commented Oct 24, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @aleksmaus? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit

Contributor

mergify bot commented Oct 24, 2024

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Oct 24, 2024
@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Oct 24, 2024
@aleksmaus aleksmaus changed the title [filebeat] Elasticsearch state storage for httpjson input [filebeat] Elasticsearch state storage for httpjson and cel inputs Oct 30, 2024
@aleksmaus
Contributor Author

@belimawr @cmacknz (or whoever wants to or has time to be involved)
I need your feedback on this draft: is this approach something that we could eventually merge (the ApiKey workaround will be removed once we adjust the Kibana Fleet code)?
I think this is an OK solution given the circumstances:

  1. This is fully backwards compatible. If the feature is not enabled, everything works as before.
  2. The only inputs that are enabled for Elasticsearch-backed state storage are httpjson and cel. We are only enabling a limited number of integrations relying on httpjson or cel inputs for the first release.
  3. The state initialization for the inputs is delayed until we get the configuration, and only if the feature is enabled for the input.
  4. The agent logs monitoring will still use the local storage, since we agreed that losing the agent logs when the pod is relocated is acceptable.
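
Points 1 and 2 can be sketched as a small gate, assuming the AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED variable from the PR description. Treating any non-empty value as "enabled" is an assumption, and `featureEnabled`, `backendFor`, and `esBackedInputs` are hypothetical names rather than the PR's code.

```go
package main

import (
	"fmt"
	"os"
)

// esBackedInputs lists the input types named in point 2.
var esBackedInputs = map[string]bool{"httpjson": true, "cel": true}

// featureEnabled reports whether the env var from the PR description is set.
// Treating any non-empty value as "enabled" is an assumption.
func featureEnabled() bool {
	return os.Getenv("AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED") != ""
}

// backendFor picks the storage backend for an input type. With the feature
// disabled, every input keeps using file storage (point 1).
func backendFor(inputType string, enabled bool) string {
	if enabled && esBackedInputs[inputType] {
		return "elasticsearch"
	}
	return "file"
}

func main() {
	enabled := featureEnabled()
	for _, typ := range []string{"httpjson", "cel", "filestream"} {
		fmt.Printf("%s -> %s\n", typ, backendFor(typ, enabled))
	}
}
```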

@cmacknz
Member

cmacknz commented Nov 1, 2024

@leehinman I'd appreciate a review here to make sure this can co-exist with Beats receivers in agent since that would be the long term way we plan to run agentless inputs.

@cmacknz cmacknz requested a review from leehinman November 1, 2024 20:23
filebeat/beater/filebeat.go (outdated)
}

func (s *store) get(key string, to interface{}) error {
	status, data, err := s.cli.Request("GET", fmt.Sprintf("/%s/%s/%s", s.index, docType, url.QueryEscape(key)), "", nil, nil)
Member

We probably want to instrument this function with its own metrics, so we can observe the frequency of 429s, or just instrument it with tracing since that'll always be available where we plan to use it. There needs to be a way to tell if this is working properly.

This can be done as a follow up to this PR however.
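
One way such instrumentation could look, as a sketch only: wrap the request call and count responses per status code, so that the frequency of 429s becomes observable. `requestFunc` is a simplified, hypothetical stand-in for the client call in the snippet above, and the in-memory counter stands in for whatever metrics or tracing facility would actually be used.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// statusCounter counts responses per HTTP status code.
type statusCounter struct {
	mu     sync.Mutex
	counts map[int]int
}

func newStatusCounter() *statusCounter {
	return &statusCounter{counts: make(map[int]int)}
}

func (c *statusCounter) observe(status int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[status]++
}

func (c *statusCounter) count(status int) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[status]
}

// requestFunc is a simplified, hypothetical stand-in for the client request.
type requestFunc func(method, path string) (status int, body []byte, err error)

// instrument wraps next so that every non-zero status is counted, even when
// the call also returns an error.
func instrument(next requestFunc, c *statusCounter) requestFunc {
	return func(method, path string) (int, []byte, error) {
		status, body, err := next(method, path)
		if status != 0 {
			c.observe(status)
		}
		return status, body, err
	}
}

func main() {
	c := newStatusCounter()
	// A fake backend that is throttled twice before succeeding.
	responses := []int{http.StatusTooManyRequests, http.StatusTooManyRequests, http.StatusOK}
	i := 0
	fake := func(method, path string) (int, []byte, error) {
		status := responses[i%len(responses)]
		i++
		return status, nil, nil
	}
	get := instrument(fake, c)
	for range responses {
		get("GET", "/agentless-state-example/_doc/key")
	}
	fmt.Println("429 responses:", c.count(http.StatusTooManyRequests))
}
```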

Member

@cmacknz cmacknz left a comment

LGTM, the new test looks good.

Biggest outstanding thing I see is some ability to tell how well this is working and if it is overloading ES. We can address this as a follow up.

@@ -81,7 +82,7 @@ type Filebeat struct {
 type PluginFactory func(beat.Info, *logp.Logger, StateStore) []v2.Plugin

 type StateStore interface {
-	Access() (*statestore.Store, error)
+	Access(typ string) (*statestore.Store, error)
Contributor

Godoc for this parameter?

Contributor

Suggest something like

	// Access returns the storage registry depending on the type.
	// The value of typ is expected to have been obtained from
	// [cursor.InputManager.Type] and represents the input type.
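
To make the role of the `typ` parameter concrete, here is a self-contained sketch of an implementation that dispatches on the input type. `Store` is a stand-in for `*statestore.Store`, and `typedStateStore` and its fields are hypothetical; the real implementation in the PR may be organised differently.

```go
package main

import "fmt"

// Store is a stand-in for *statestore.Store; it only carries a name here.
type Store struct {
	Backend string
}

// StateStore mirrors the interface in the diff, using the stand-in Store.
type StateStore interface {
	// Access returns the storage registry depending on the input type.
	Access(typ string) (*Store, error)
}

// typedStateStore is a hypothetical implementation that serves selected
// input types from Elasticsearch and everything else from file storage.
type typedStateStore struct {
	file    *Store
	es      *Store
	esTypes map[string]bool
}

func (s *typedStateStore) Access(typ string) (*Store, error) {
	if s.esTypes[typ] {
		if s.es == nil {
			return nil, fmt.Errorf("elasticsearch store for %q is not configured", typ)
		}
		return s.es, nil
	}
	return s.file, nil
}

func main() {
	var ss StateStore = &typedStateStore{
		file:    &Store{Backend: "file"},
		es:      &Store{Backend: "elasticsearch"},
		esTypes: map[string]bool{"httpjson": true, "cel": true},
	}
	for _, typ := range []string{"httpjson", "cel", "aws-s3", ""} {
		st, err := ss.Access(typ)
		if err != nil {
			fmt.Println(typ, "error:", err)
			continue
		}
		fmt.Printf("%q -> %s\n", typ, st.Backend)
	}
}
```

In this sketch an unknown or empty type falls through to the file-backed store, which keeps callers that do not care about the type working unchanged.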

Contributor

@zmoog zmoog left a comment

LGTM is limited to simple changes at x-pack/filebeat/input/awss3/.

Nit: calling a method with an empty string (stateStore.Access("")) feels slightly weird.

Contributor

@ishleenk17 ishleenk17 left a comment

Approving it from obs-infraobs side while considering the change done to the salesforce input.

Contributor

@tommyers-elastic tommyers-elastic left a comment

reviewing (minimal) changes in x-pack/filebeat on behalf of obs-infraobs-integrations.

looks fine to me.

@orestisfl
Contributor

run docs-build

@orestisfl
Contributor

/test

@orestisfl orestisfl enabled auto-merge (squash) January 28, 2025 14:00
@orestisfl orestisfl merged commit 8180f23 into elastic:main Jan 28, 2025
141 checks passed
mergify bot pushed a commit that referenced this pull request Jan 28, 2025
…41446)

This enables Elasticsearch as State Store Backend for Security Integrations for
the Agentless solution.

The scope of this change was narrowed down to supporting only `httpjson` inputs
in order to support the Okta integration for the initial release. All other
integration inputs still use the file storage as before.
This is a short-term solution for state storage in k8s.

The feature currently can only be enabled with the
`AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES` env var.

The existing code relied on the inputs' state storage being fully configurable
before the main beat manager runs. The change delays the configuration of the
`httpjson` input until the actual configuration is received from the
Agent.

Example of the state storage index content for Okta integration:
```
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "agentless-state-httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959",
        "_id": "httpjson::httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959::https://dev-36006609.okta.com/api/v1/logs",
        "_seq_no": 39,
        "_primary_term": 1,
        "_score": 1,
        "_source": {
          "v": {
            "ttl": 1800000000000,
            "updated": "2024-10-24T20:21:22.032Z",
            "cursor": {
              "published": "2024-10-24T20:19:53.542Z"
            }
          }
        }
      }
    ]
  }
}
```

The naming convention for all state store indices is `agentless-state-<input id>`,
since for agentless we expect to have only one agent per policy and
the agents are ephemeral.

Closes https://github.com/elastic/security-team/issues/11101

Co-authored-by: Orestis Floros <[email protected]>
(cherry picked from commit 8180f23)

# Conflicts:
#	filebeat/beater/filebeat.go
orestisfl pushed a commit to orestisfl/beats that referenced this pull request Jan 29, 2025
…lastic#41446)

Closes https://github.com/elastic/security-team/issues/11101

Co-authored-by: Orestis Floros <[email protected]>
(cherry picked from commit 8180f23)

# Conflicts:
#	filebeat/beater/filebeat.go
orestisfl added a commit that referenced this pull request Jan 29, 2025
…41446)

Closes elastic/security-team#11101

Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Orestis Floros <[email protected]>
(cherry picked from commit 8180f23)
orestisfl added a commit that referenced this pull request Jan 29, 2025
…pjson and cel inputs (#42451)

Closes https://github.com/elastic/security-team/issues/11101

(cherry picked from commit 8180f23)

Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Orestis Floros <[email protected]>
Labels
backport-8.x Automated backport to the 8.x branch with mergify enhancement Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team Team:Security-Deployment and Devices Deployment and Devices Team in Security Solution
Projects
None yet