[RFC] Asset Integrations & Entity Store RFC - Stage 2 #2233

Open
wants to merge 34 commits into base: main
Changes from 9 commits

Commits
5aabacb
Create asset.yml
SourinPaul Jul 12, 2023
5274c02
Create host.yml
SourinPaul Jul 12, 2023
fca40be
Update 0041-asset-integration.md
SourinPaul Jul 12, 2023
6d65d59
Update 0041-asset-integration.md
SourinPaul Jul 13, 2023
b0a7d06
Update and rename host.yml to user.yml
SourinPaul Jul 13, 2023
a7581ff
Update user.yml
SourinPaul Jul 13, 2023
fe2fd75
Update 0041-asset-integration.md
SourinPaul Jul 13, 2023
6a068c1
Update asset.yml
SourinPaul Jul 13, 2023
034c97c
Create os.yml
SourinPaul Jul 13, 2023
2c90207
Update rfcs/text/0041-asset-integration.md
SourinPaul Jul 19, 2023
51a966e
Update rfcs/text/0041/asset.yml
SourinPaul Jul 19, 2023
6c9609e
Update rfcs/text/0041/user.yml
SourinPaul Jul 19, 2023
086b7f4
Update rfcs/text/0041/user.yml
SourinPaul Jul 19, 2023
117d0ed
Update rfcs/text/0041/user.yml
SourinPaul Jul 19, 2023
3a8abf4
Update rfcs/text/0041/user.yml
SourinPaul Jul 19, 2023
a765da3
Update rfcs/text/0041/user.yml
SourinPaul Jul 19, 2023
e593af2
Update rfcs/text/0041/user.yml
SourinPaul Jul 19, 2023
ccf7b23
Update rfcs/text/0041/user.yml
SourinPaul Jul 19, 2023
525c108
Update rfcs/text/0041/asset.yml
SourinPaul Jul 19, 2023
4331778
Update rfcs/text/0041/asset.yml
SourinPaul Jul 19, 2023
5c8f251
Update asset.yml
SourinPaul Jul 19, 2023
64827ec
Update 0041-asset-integration.md
SourinPaul Jul 25, 2023
1a2e6dc
Update 0041-asset-integration.md
SourinPaul Aug 9, 2023
6603b1e
Update 0041-asset-integration.md
SourinPaul Aug 9, 2023
69a82c4
Merge branch 'main' into Stage1_2215_assetintegrationRFC
ebeahan Aug 28, 2023
6fd05a0
Update os.yml
SourinPaul Jan 5, 2024
73b53e0
Update asset.yml
SourinPaul Jan 5, 2024
150292b
Update user.yml
SourinPaul Jan 5, 2024
ef6f6d8
Update asset.yml
SourinPaul Jan 6, 2024
5e61c8e
Update os.yml
SourinPaul Jan 6, 2024
a570e8b
Update user.yml
SourinPaul Jan 6, 2024
f74c82f
Update 0041-asset-integration.md
SourinPaul Jan 30, 2024
d9c9875
Merge branch 'main' into Stage1_2215_assetintegrationRFC
ebeahan Feb 21, 2024
dc62101
revisted to stage 2
ebeahan Feb 22, 2024
151 changes: 137 additions & 14 deletions rfcs/text/0041-asset-integration.md
@@ -1,7 +1,7 @@
# 0041: Asset Integration
<!-- Leave this ID at 0000. The ECS team will assign a unique, contiguous RFC number upon merging the initial stage of this RFC. -->

- Stage: **0 (strawperson)** <!-- Update to reflect target stage. See https://elastic.github.io/ecs/stages.html -->
- Stage: **1 (Draft)** <!-- Update to reflect target stage. See https://elastic.github.io/ecs/stages.html -->
Member

Are we still targeting stage 1 here @SourinPaul? You mentioned stage 2 in a different conversation.

Contributor Author

@ebeahan, thanks for the ping. We are currently in stage 1 and targeting stage 2.

Do I need to update this section? Please advise, or feel free to update before merging.

- Date: **2023-07-07** <!-- The ECS team sets this date at merge time. This is the date of the latest stage advancement. -->

<!--
@@ -13,6 +13,10 @@ Feel free to remove these comments as you go along.
Stage 0: Provide a high level summary of the premise of these changes. Briefly describe the nature, purpose, and impact of the changes. ~2-5 sentences.
-->

<!--
Stage 1: If the changes include field additions or modifications, please create a folder titled as the RFC number under rfcs/text/. This will be where proposed schema changes as standalone YAML files or extended example mappings and larger source documents will go as the RFC is iterated upon.
-->

This proposal extends the existing ECS field set to store inventory metadata for hosts and users from external application repositories. Using ECS to store such fields will improve metadata querying and retrieval across various use cases.

This description primarily focuses on hosts and users. Have we considered the benefits of expanding the asset schema to encompass a broader range of assets from the start?

An asset schema that enables us to capture a broader range of assets is essential for developing new asset-centric features and experiences, such as the proposed asset inventory experience/workflow.

An extensible schema will also be crucial for specific use cases like our proposed enhancements to SIEM to enable better Cloud Detection and Response (CDR), where accurately modeling and representing a diverse array of cloud assets beyond hosts and users is a key requirement for the experiences we want to deliver. For example, our SIEM currently has threat detection rules for AWS and other CSPs that detect malicious activity related to entities other than hosts and users. When these detection rules trigger, the current alert flyout highlights only host and user assets as being present, even though the detection rule explicitly mentions other assets too (e.g., RDS database, SecurityGroup). Having a standardized schema is one of the first steps toward addressing enhancements like this.

Terminologies:
@@ -23,12 +27,7 @@ This proposal includes the following:
* Introduces a new field set called `assets`.
<!-- * Additional fields in the `host` object --->

This proposal will also facilitate storing host and user inventory within the security solution (the entity store).


<!--
Stage 1: If the changes include field additions or modifications, please create a folder titled as the RFC number under rfcs/text/. This will be where proposed schema changes as standalone YAML files or extended example mappings and larger source documents will go as the RFC is iterated upon.
-->
This proposal will also facilitate storing host and user inventory within the security solution (the entity store). The schema and field sets defined here focus on asset inventory data sources. Additional fields may need to be appended (ideally within this RFC lifecycle) to support the entity store's needs.

<!--
Stage X: Provide a brief explanation of why the proposal is being marked as abandoned. This is useful context for anyone revisiting this proposal or considering similar changes later on.
@@ -59,8 +58,6 @@ user.profile.job_title | keyword | Field Sales | Job title assigned to the user
user.profile.department | keyword | x256 | Department name associated with the user account.
user.profile.organization | keyword | Elasticsearch Inc. | Organization name associated with the account.
user.profile.location | keyword | US - Washington - Distributed | Assigned location for the user account.
user.profile.mobile_phone | keyword | 222-222-2222
user.profile.primaryPhone | keyword | 222-222-2222
user.profile.secondEmail | keyword | [email protected] | Additional email addresses associated with the user account.
user.profile.sup_org_id | keyword | SUP-ORG-75 | Primary organization ID for the user account.
user.profile.supervisory_Org | keyword | Field Sales | Primary organization name for the user account.
@@ -135,13 +132,17 @@ Stage 2: Add or update all remaining field definitions. The list should now be e
Stage 1: Describe at a high-level how these field changes will be used in practice. Real world examples are encouraged. The goal here is to understand how people would leverage these fields to gain insights or solve problems. ~1-3 paragraphs.
-->

* As part of Entity Analytics, we are ingesting metadata about Users and Hosts from various external vendor applications. We are storing all ingested metadata in Elasticsearch. After we map these fields to ECS, we will enrich these ingested events for risk-scoring scenarios (e.g., context enrichments) and for advanced analytics (UBA) detection use cases.
* As part of Entity Analytics, we are ingesting metadata about Users and Hosts from various external vendor applications. We are storing all ingested metadata in Elasticsearch. After we map these fields to ECS, we will enrich these ingested events for risk-scoring scenarios (e.g., context enrichments) and for advanced analytics (UEBA) detection use cases.

### Example of Hosts and Users stored in ES

* This schema will persist `Observed` (queried) entities from the ingested security log dataset in an Entity store. This entity store can be further extended to meet broader Asset Management needs.

* Additional enrichment use cases for existing prebuilt detection rules will leverage these ECS fields.
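
As a rough illustration of the context-enrichment use case described in this list, the sketch below looks up an entity by `user.id` in an in-memory stand-in for the entity store and copies selected `asset.*` values onto an event. The `ENTITY_STORE` dictionary, the `enrich_event` helper, the `user-login` action, and the sample values are all hypothetical; the RFC does not prescribe how enrichment is implemented.

```python
from copy import deepcopy

# Hypothetical in-memory stand-in for the entity store, keyed by user.id.
ENTITY_STORE = {
    "userid": {
        "asset": {"criticality": "Critical", "status": "ACTIVE"},
    },
}

# Proposed asset.* fields that this sketch copies onto events.
ENRICH_FIELDS = ("criticality", "status")


def enrich_event(event, store=ENTITY_STORE):
    """Return a copy of the event with asset context added when the user is known."""
    enriched = deepcopy(event)
    user_id = event.get("user", {}).get("id")
    entity = store.get(user_id)
    if entity is None:
        return enriched  # unknown entity: leave the event unchanged
    asset = enriched.setdefault("asset", {})
    for name in ENRICH_FIELDS:
        value = entity["asset"].get(name)
        if value is not None:
            asset.setdefault(name, value)  # keep any value already on the event
    return enriched


event = {"event": {"action": "user-login"}, "user": {"id": "userid"}}
print(enrich_event(event)["asset"])
```

The helper returns a copy rather than mutating its input so the original event stays available for comparison; a real implementation would more likely rely on an Elasticsearch-side enrichment mechanism than on application code.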




## Source data

<!--
@@ -165,14 +166,139 @@ There are many sources of asset inventory repositories. In the mid-term, we are
* MS Intune
* ServiceNow Asset CMDB

### Examples of source data:

#### Subset of User fields from Okta:

```json
{
  "@timestamp": "2023-07-04T09:57:19.786056-05:00",
  "event": {
    "action": "user-discovered"
  },
  "okta": {
    "id": "userid",
    "status": "RECOVERY",
    "created": "2023-06-02T09:33:00.189752+09:30",
    "activated": "0001-01-01T00:00:00Z",
    "statusChanged": "2023-06-02T09:33:00.189752+09:30",
    "lastLogin": "2023-06-02T09:33:00.189752+09:30",
    "lastUpdated": "2023-06-02T09:33:00.189753+09:30",
    "passwordChanged": "2023-06-02T09:33:00.189753+09:30",
    "type": {
      "id": "typeid"
    },
    "profile": {
      "login": "[email protected]",
      "email": "[email protected]",
      "firstName": "name",
      "lastName": "surname"
    },
    "credentials": {
      "password": {},
      "provider": {
        "type": "OKTA",
        "name": "OKTA"
      }
    },
    "_links": {
      "self": {
        "href": "https://localhost/api/v1/users/userid"
      }
    }
  },
  "user": {
    "id": "userid"
  },
  "labels": {
    "identity_source": "okta-1"
  }
}
```

<!--
Stage 2: Included a real world example source document. Ideally this example comes from the source(s) identified in stage 1. If not, it should replace them. The goal here is to validate the utility of these field changes in the context of a real world example. Format with the source name as a ### header and the example document in a GitHub code block with json formatting, or if on the larger side, add them to the corresponding RFC folder.
-->


<!--
Stage 3: Add more real world example source documents so we have at least 2 total, but ideally 3. Format as described in stage 2.
-->

### Examples of Real-world mapping:

#### Mapping User object from Okta (partial):
```yml
- set:
    field: asset.status
    copy_from: entityanalytics_okta.user.status
    tag: set_asset_status
    ignore_empty_value: true
- set:
    field: user.profile.status
    copy_from: entityanalytics_okta.user.status
    tag: set_user_profile_status
    ignore_empty_value: true
- date:
    field: okta.created
    target_field: entityanalytics_okta.user.created
    tag: date_user_created
    formats:
      - ISO8601
    if: ctx.okta?.created != null && ctx.okta.created != ''
    on_failure:
      - remove:
          field: okta.created
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
- set:
    field: user.account.create_date
    copy_from: entityanalytics_okta.user.created
    tag: set_user_account_create_date
    ignore_empty_value: true
- set:
    field: asset.create_date
    copy_from: entityanalytics_okta.user.created
    tag: set_asset_create_date
    ignore_empty_value: true
- date:
    field: okta.activated
    target_field: entityanalytics_okta.user.activated
    tag: date_user_activated
    formats:
      - ISO8601
    if: ctx.okta?.activated != null && ctx.okta.activated != ''
    on_failure:
      - remove:
          field: okta.activated
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
- set:
    field: user.account.activated_date
    copy_from: entityanalytics_okta.user.activated
    tag: set_user_account_activated_date
    ignore_empty_value: true
- date:
    field: okta.statusChanged
    target_field: entityanalytics_okta.user.status_changed
    tag: date_user_status_changed
    formats:
      - ISO8601
    if: ctx.okta?.statusChanged != null && ctx.okta.statusChanged != ''
    on_failure:
      - remove:
          field: okta.statusChanged
      - append:
          field: error.message
          value: 'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.pipeline}}} failed with message: {{{_ingest.on_failure_message}}}'
```
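
To make the data flow of the pipeline excerpt easier to follow, here is a minimal Python sketch that approximates the `date` and `set` steps for `okta.created` only. It is an illustration under stated assumptions, not the integration's implementation: the helper names `parse_iso8601` and `map_created` are hypothetical, and the sketch does not reproduce the ingest processor's exact output formatting or time zone handling.

```python
from datetime import datetime


def parse_iso8601(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def map_created(doc):
    """Approximate the date and set steps for okta.created (hypothetical helper)."""
    raw = doc.get("okta", {}).get("created")
    if not raw:
        # Mirrors the `if` guard: null or empty values are skipped, so the
        # later copies have nothing to copy (ignore_empty_value).
        return doc
    try:
        created = parse_iso8601(raw).isoformat()
    except ValueError as exc:
        # Mirrors on_failure: drop the source field and record the error.
        doc["okta"].pop("created", None)
        doc.setdefault("error", {}).setdefault("message", []).append(str(exc))
        return doc
    # date processor target, then the two set processors that copy from it.
    doc.setdefault("entityanalytics_okta", {}).setdefault("user", {})["created"] = created
    doc.setdefault("user", {}).setdefault("account", {})["create_date"] = created
    doc.setdefault("asset", {})["create_date"] = created
    return doc


sample = {"okta": {"created": "2023-06-02T09:33:00.189752+09:30"}}
print(map_created(sample)["asset"]["create_date"])
```

The same parsed value lands in both `user.account.create_date` and `asset.create_date`, which is the duplication the excerpt introduces for user entities.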


#### AzureAD Hosts
Member

This section is empty?

Contributor Author

@andrewkroh I do not have the AzureAD samples handy yet. I will update this section in Stage 2. Thanks.

@oatkiller @jaredburgettelastic do you have any sample AzureAD data on hand? If so, can you help out with this section or should we drop it?

@jaredburgettelastic May 7, 2024

I would prefer to drop this. I'd even like to simply link to the source for these "real world examples", because they could change at any time. In fact...the Okta one has changed since this RFC was written!

In the case of Okta, the link to that source code would be here: https://github.com/elastic/integrations/blob/main/packages/entityanalytics_okta/data_stream/user/elasticsearch/ingest_pipeline/default.yml



## Scope of impact

<!--
@@ -195,9 +321,6 @@ The goal here is to research and understand the impact of these changes on users
Stage 1: Identify potential concerns, implementation challenges, or complexity. Spend some time on this. Play devil's advocate. Try to identify the sort of non-obvious challenges that tend to surface later. The goal here is to surface risks early, allow everyone the time to work through them, and ultimately document resolution for posterity's sake.
-->

* We have a couple of Fleet integrations under development. We want them to use these proposed ECS fields before they are released.
* The schema and field sets defined here focus on asset inventory data sources. Additional fields may need to be appended (ideally within this RFC lifecycle) to support the entity store's needs.
* Due diligence is needed to avoid the proliferation of field sets and validate business requirements.
* In stage 1, @jasonrhodes identified fields from o11y use cases and a potential conflict: https://github.com/elastic/ecs/pull/2215#pullrequestreview-1498781860

<!--
@@ -218,7 +341,6 @@ The following are the people that consulted on the contents of this RFC.
* @lauravoicu | subject matter expert
* @MikePaquette | subject matter expert
* @sourinpaul | sponsor
* ?

<!--
Who will be or has been consulted on the contents of this RFC? Identify authorship and sponsorship, and optionally identify the nature of involvement of others. Link to GitHub aliases where possible. This list will likely change or grow stage after stage.
@@ -242,6 +364,7 @@ e.g.:
<!-- An RFC should link to the PRs for each of its stage advancements. -->

* Stage 0: https://github.com/elastic/ecs/pull/2215
* Stage 1: https://github.com/elastic/ecs/pull/NNN

<!--
* Stage 1: https://github.com/elastic/ecs/pull/NNN
147 changes: 147 additions & 0 deletions rfcs/text/0041/asset.yml
@@ -0,0 +1,147 @@
- name: asset.category
  level: extend
  type: keyword
  example: hardware
  description: A further classification of the asset type beyond event.category. Example for host assets {hardware, virtual, container, node}.
- name: asset.type
  level: extend
  type: keyword
  example: workstation
  description: A sub-classification of assets. For host assets {workstation, S3, Compute}. For user assets {NULL?}
- name: asset.id
  level: extend
  type: keyword
  example: 2950
  description: A unique ID for the asset. For inventory integrations, it's the id generated from the inventory data source
- name: asset.name
  level: extend
  type: keyword
  example: Sourin Paul Macbook Pro
  description: A common name for the asset
- name: asset.vendor
  level: extend
  type: keyword
  example: Apple
  description: Used primarily for 'Host' entities, the vendor name or brand associated with the asset.
- name: asset.product
  level: extend
  type: keyword
  example: MacBook Pro
  description: Used primarily for 'Host' entities, the product name associated with the asset.
- name: asset.model
  level: extend
  type: keyword
  example: TBD
  description: Used primarily for 'Host' entities, the model name or number associated with this asset.
- name: asset.version
  level: extend
  type: keyword
  example: TBD
  description: Used primarily for 'Host' entities, the version or year associated with the asset.
- name: asset.owner
  level: extend
  type: keyword
  example: [email protected]
  description: The primary user entity who owns the 'Host' asset
- name: asset.priority
  level: extend
  type: keyword
  example: Priority 1
  description: A priority classification for the asset obtained from outside this system, such as from some external CMDB or Directory service.
- name: asset.criticality
  level: extend
  type: keyword
  example: Critical
  description: A criticality classification obtained from outside this system, such as from some external CMDB or Directory service.
- name: asset.business_unit
  level: extend
  type: keyword
  example: Analyst Experience
  description: Business Unit associated with the asset (user or host).
- name: asset.costCenter
  level: extend
  type: keyword
  example: Security - Protections
  description: Cost Center associated with the asset (user or host).
- name: asset.cost_center_hierarchy
  level: extend
  type: keyword
  example: Engineering
  description: Additional cost center information associated with the asset (user or host).
- name:
  level: extend
  type:
  example:
  description:
- name: asset.status
  level: extend
  type: keyword
  example: ACTIVE
  description: Current status of the asset in the inventory data source.
- name: asset.last_status_change_date
  level: extend
  type: date
  example: June 5, 2023 @ 18:25:57.000
  description: The most recent date/time when the asset.status was updated.
- name: asset.create_date
  level: extend
  type: date
  example: June 5, 2023 @ 18:25:57.001
  description: For users, it's the hire date. For other assets, it's the in-service date.
- name: asset.end_date
  level: extend
  type: date
  example: June 5, 2023 @ 18:25:57.002
  description: For users, it's the termination date. For other assets, it's the out-of-service date.
- name: asset.first_seen
  level: extend
  type: date
  example: June 5, 2023 @ 18:25:57.003
  description: The earliest date/time at which this asset was observed.
- name: asset.last_seen
  level: extend
  type: date
  example: June 5, 2023 @ 18:25:57.004
  description: The most recent date/time this asset was observed. It would remain empty until the asset was observed.
- name: asset.last_updated
  level: extend
  type: date
  example: June 5, 2023 @ 18:25:57.005
  description: The most recent date/time this asset was updated in the inventory data source
- name: asset.serial_number
  level: extend
  type: keyword
  example: C02FG1G1MD6T
  description: Serial number of the asset
- name: asset.tags
  level: extend
  type: keyword
  example: watch, mdmaccess
  description: Tags assigned at the MDM
- name: asset.assigned_users
  level: extend
  type: keyword
  example: [email protected], [email protected]
  description: List of users assigned to the asset
- name: asset.assigned_users_are_admin
  level: extend
  type: boolean
  example: TRUE
  description: Flag to identify if the assigned users have admin privileges
- name: asset.is_managed
  level: extend
  type: boolean
  example: TRUE
  description: Flag to identify if the organization manages the asset
- name: asset.last_enrolled_date
  level: extend
  type: date
  example: June 5, 2023 @ 18:25:57.005
  description: The most recent date/time the asset checked in with MDM
- name: asset.data_classification
  level: extend
  type: keyword
  example: restricted
  description: Data classification tier for the asset
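
The `date` examples in these definitions use a display-style format (`June 5, 2023 @ 18:25:57.000`) rather than ISO 8601. If such values ever had to be normalized, a sketch along the following lines could do it. The helper name `display_to_iso` is hypothetical, and the sketch assumes the values carry no time zone and use English month names; neither assumption comes from the RFC.

```python
from datetime import datetime

# Format of the example values, e.g. "June 5, 2023 @ 18:25:57.000".
# %B expects a full English month name under the default C locale.
DISPLAY_FORMAT = "%B %d, %Y @ %H:%M:%S.%f"


def display_to_iso(value):
    """Convert a display-style timestamp to a naive ISO 8601 string (milliseconds)."""
    parsed = datetime.strptime(value, DISPLAY_FORMAT)
    return parsed.isoformat(timespec="milliseconds")


print(display_to_iso("June 5, 2023 @ 18:25:57.005"))
```

Because the example values carry no offset, the result is a naive timestamp; a consumer would still need to decide which time zone it represents.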
5 changes: 5 additions & 0 deletions rfcs/text/0041/os.yml
@@ -0,0 +1,5 @@
- name: build
  level: extend
  type: keyword
  example: "FIELD4: 22F66"
  description: Host OS Build information