Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for data stream overrides for specific integrations based on a defined list #1913

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 5 additions & 2 deletions internal/testrunner/runners/system/tester.go
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,9 @@ var (
},
},
}
dataStreamOverrides = map[string]bool{
"amazon_security_lake": true,
Copy link
Member

@jsoriano jsoriano Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't add exceptions to specific packages in elastic-package. We should have generic mechanisms so this is available for any package that supports it.

On this specific case, we have code to support routing rules, as well as custom datasets. If this is not enough for this package we should check why.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support for reroute was added in #1372 and #1391.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsoriano the reroute was added for input packages, not integration packages if I recall.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The examples and the test package in #1372 are integration packages.

}
enableIndependentAgents = environment.WithElasticPackagePrefix("TEST_ENABLE_INDEPENDENT_AGENT")
)

Expand Down Expand Up @@ -937,7 +940,7 @@ func (r *tester) prepareScenario(ctx context.Context, config *testConfig, svcInf

// Input packages can set `data_stream.dataset` by convention to customize the dataset.
dataStreamDataset := ds.Inputs[0].Streams[0].DataStream.Dataset
if scenario.pkgManifest.Type == "input" {
if scenario.pkgManifest.Type == "input" || dataStreamOverrides[scenario.pkgManifest.Name] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should do this in all cases and not only for input packages 🤔 Though integration packages should not need to configure datasets.

Or maybe this amazon_security_lake package should be an input package if it is expected to be used with custom datasets?

Copy link
Author

@ShourieG ShourieG Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this for all integration packages seems unnecessary and might cause unexpected results in tests. The thing is with this specific integration, there's a central data stream that runs the tests, but then reroutes the results to a separate data stream. This reroute mechanism causes the system test to fail, because elastic-package always checks for hits in the data stream the test is running. In future other integration packages might also do this, hence why I created this override list.

This also cannot be an input package cause it has multiple data streams and ingest pipelines have a lot of mapping logic. Generally input packages just consume raw inputs in a single generic data stream and dump them into elastic search.

Copy link
Author

@ShourieG ShourieG Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can add support for a new package spec variable, whose presence could trigger this ? Then we can remove the list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is with this specific integration, there's a central data stream that runs the tests, but then reroutes the results to a separate data stream.

I see we have some packages with routing rules, but don't have system tests, it would be good to identify what is missing and fix it so all these packages can have system tests.

On the other hand I see that the entityanalytics_entra_id package has routing rules and system tests. Is it doing some trick to workaround the current limitations? Or is this package doing something we could apply in other cases?

Maybe we can add support for a new package spec variable, whose presence could trigger this ? Then we can remove the list.

Yes, I would prefer this before adding a list with exceptions, but lets try to identify first what is missing, or what other packages are doing to have system tests with routing rules.

I think it would be good to open an issue identifying the problem, and then we could discuss about possible solutions. @ShourieG would this work for you?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsoriano I saw the entityanalytics_entra_id integration, and I don't think the systems tests over there are working properly, if you see the sample_event.json, it only contains agent data and does not contain any fields under the entityanalytics_entra_id namespace as per the field mappings. It also seems that the entityanalytics input has some internal mechanism to publish logs to all datasets which is not supported by the aws input, hence why a sample_event.json is even being generated in the first place even though the contents are incorrect.

v, _ := config.Vars.GetValue("data_stream.dataset")
if dataset, ok := v.(string); ok && dataset != "" {
dataStreamDataset = dataset
Expand Down Expand Up @@ -1290,7 +1293,7 @@ func (r *tester) validateTestScenario(ctx context.Context, result *testrunner.Re
}
expectedDatasets = []string{expectedDataset}
}
if scenario.pkgManifest.Type == "input" {
if scenario.pkgManifest.Type == "input" || dataStreamOverrides[scenario.pkgManifest.Name] {
v, _ := config.Vars.GetValue("data_stream.dataset")
if dataset, ok := v.(string); ok && dataset != "" {
expectedDatasets = append(expectedDatasets, dataset)
Expand Down