loki.secretfilter: utilize fuzz testing for complex input #2630

kelnage · 2025-02-06T09:25:49Z

PR Description

Using Go's built-in fuzzing framework, test the three major parts of the component that accept complex data (the log line, the component config, and the Gitleaks config) to provide evidence that they operate correctly across many different configurations.

Which issue(s) this PR fixes

Notes to the Reviewer

The fuzzing identified three issues with the component, which have been fixed as part of these commits. I have included the testdata the fuzzing framework generated, but I am unsure whether it is useful to keep, or whether I should instead write unit tests that validate the findings.

PR Checklist

CHANGELOG.md updated
Tests updated

Using three different configurations, fuzz test the processEntry method, seeded with varied log formats.

Rather than fuzzing select configs individually, combine into a single fuzz test that generates inputs for most of the configurations and tests them all together.
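As a rough illustration of the approach described above, here is a minimal sketch of a Go fuzz target that seeds the corpus with varied log formats and checks one invariant after redaction. The names `redact`, `secretRe`, `placeholder`, and `FuzzRedact` are hypothetical stand-ins, not the component's real `processEntry` or its rules, and the single hard-coded regex is a simplification of the multi-configuration test in this PR.

```go
package main

import (
	"fmt"
	"regexp"
	"testing"
)

// secretRe and placeholder are hypothetical stand-ins for a single rule and
// the redaction string; they are not taken from the component.
var secretRe = regexp.MustCompile(`fakeSecret\d{5}`)

const placeholder = "<REDACTED>"

// redact replaces every match of secretRe with the literal placeholder.
func redact(line string) string {
	return secretRe.ReplaceAllLiteralString(line, placeholder)
}

// FuzzRedact would normally live in a _test.go file and be run with
// `go test -fuzz=FuzzRedact`. It is shown here next to the code it exercises.
func FuzzRedact(f *testing.F) {
	// Seed the corpus with a few differently shaped log lines.
	for _, seed := range []string{
		"plain text line with fakeSecret12345 inside",
		`{"level":"info","token":"fakeSecret99999"}`,
		"level=info msg=\"no secret here\"",
	} {
		f.Add(seed)
	}
	f.Fuzz(func(t *testing.T, line string) {
		out := redact(line)
		// Invariant: nothing matching the rule may survive redaction.
		if secretRe.MatchString(out) {
			t.Errorf("secret survived redaction: %q -> %q", line, out)
		}
	})
}

func main() {
	fmt.Println(redact("token=fakeSecret12345 user=alice"))
}
```

The fuzz target asserts a property (no match remains) rather than an exact output, which is what lets it judge inputs the test author never wrote down.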

Fuzz test the component config by executing it against some sample log lines. The goal here is not to fuzz test Alloy's config parsing or Go's regex parsing, but to ensure that a valid config doesn't cause the component to crash when it is in use.

Test how the component handles the variety of Gitleaks configs that can be provided to the component. Note this does not test the toml parser itself, which is out of scope for this testing.

When a Gitleaks TOML file contains either an empty regex or a regex that matches the empty string, the component would run out of memory when attempting to redact. This commit excludes both of these cases.

If a rule regex could match an empty string, replacing its matches could quickly lead to significant memory usage. Since an empty string cannot meaningfully be a secret, add a check that detects such a regex and skips the rule. Found via Go fuzzing of the Gitleaks config, sample included.
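A check of this kind could look roughly like the sketch below. It is an assumption about the shape of the fix, not the code in these commits: `usableRules` is a hypothetical helper that drops empty patterns, patterns that fail to compile, and patterns that match the empty string when tested against empty input.

```go
package main

import (
	"fmt"
	"regexp"
)

// usableRules is a hypothetical helper: it compiles each pattern and keeps
// only those that cannot match the empty string.
func usableRules(patterns []string) []*regexp.Regexp {
	var rules []*regexp.Regexp
	for _, p := range patterns {
		if p == "" {
			continue // an empty pattern matches everywhere
		}
		re, err := regexp.Compile(p)
		if err != nil {
			continue // invalid patterns are skipped in this sketch
		}
		if re.MatchString("") {
			continue // e.g. `a*` or `(x|)`: an empty match cannot be a secret
		}
		rules = append(rules, re)
	}
	return rules
}

func main() {
	rules := usableRules([]string{"", `a*`, `(x|)`, `fakeSecret\d+`})
	for _, re := range rules {
		fmt.Println(re.String())
	}
}
```

Note that `MatchString("")` only tests the pattern against empty input; a pattern made purely of assertions such as `\b` would pass this check yet still produce zero-width matches inside non-empty text, so a real implementation may need a stricter test.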


mostafa

LGTM! Well done! 👏

mostafa · 2025-02-06T09:38:38Z

internal/component/loki/secretfilter/testdata/fuzz/FuzzGitleaksConfig/21d8d601da55143a

@@ -0,0 +1,4 @@
+go test fuzz v1


I am not sure if these are needed, and would be happy to hear other reviewers' opinions. I guess including them as test cases is enough.


ptodev · 2025-02-07T16:39:47Z

internal/component/loki/secretfilter/secretfilter_test.go

+			// ignore parsing errors, as we aren't fuzz testing the Alloy config parser
+			return


Even if we're not fuzz testing the config parser, wouldn't we still want to know if this fails? Presumably it should succeed so that we can test other code? And if it fails silently, it could mean the fuzz tests didn't really run?

I guess that testConfigs contains some invalid configurations and we want to ignore them, but wouldn't it be cleaner to only have valid configs as input and to not have config as an input to f.Fuzz?

ptodev · 2025-02-07T17:06:08Z

internal/component/loki/secretfilter/secretfilter_test.go

+		}
+
+		entry := loki.Entry{Labels: model.LabelSet{}, Entry: logproto.Entry{Timestamp: time.Now(), Line: log}}
+		c.processEntry(entry)


Should we not also check the output on ch1?
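One possible way to act on this suggestion, sketched with hypothetical types rather than the real `loki.Entry` and component: have the test read the forwarded entry from the receiving channel with a timeout and assert on it, instead of only checking that `processEntry` does not panic. The `entry`, `processor`, `receive`, and `waitFor` names below are all invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// waitFor bounds how long the check waits for a forwarded entry.
const waitFor = time.Second

// entry and processor are simplified, hypothetical stand-ins for the
// component's log entry and for the component itself.
type entry struct{ Line string }

type processor struct {
	secret string
	out    chan entry
}

// processEntry redacts the configured secret and forwards the entry.
func (p *processor) processEntry(e entry) {
	e.Line = strings.ReplaceAll(e.Line, p.secret, "<REDACTED>")
	p.out <- e
}

// receive waits up to timeout for an entry and reports whether one arrived.
func receive(ch <-chan entry, timeout time.Duration) (entry, bool) {
	select {
	case e := <-ch:
		return e, true
	case <-time.After(timeout):
		return entry{}, false
	}
}

func main() {
	// A buffered channel keeps processEntry from blocking in this sketch.
	p := &processor{secret: "hunter2", out: make(chan entry, 1)}
	p.processEntry(entry{Line: "password=hunter2"})

	got, ok := receive(p.out, waitFor)
	fmt.Println(got.Line, ok)
}
```

Inside a fuzz target, the same `receive` call would let the test fail when nothing is forwarded, or when the forwarded line still contains the secret.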

ptodev · 2025-02-07T17:09:49Z

internal/component/loki/secretfilter/testdata/fuzz/FuzzGitleaksConfig/d7a66f98cb620b33

+string("\n\t\tforward_to = []\n\t\tredact_with = \"<ALLOY-REDACTED-SECR<ALLOY-ET:$SECRET_NAME>\"\n\t")
+string("\n\t\ttitle = \"gitleaks custom config\"\n\n\t\t[[rules]]\n\t\tid = \"my-fake-secret\"\n\t\tdescription = \"Identified a fake secret\"\n\t\tregex = '''(?i)\\b(fakeSecret\\d\", \"faiption ='|\\\"|\\n|\\r||;]|$)'''\t\n\t\t[allowlist]\n\t\tregexes = [\"abc\\\\d{3}\", \"fakeSecret[9]{5}\"]\n\t")


I'm new to fuzz testing - what sorts of variations of these configs is Go actually going to come up with? E.g. if it just puts in some random malformed string as a config, how do we know we should reject it? And even for secrets, how can we make sure in our test that we filter the secrets we want to filter?

kelnage added 10 commits January 31, 2025 10:25

Fuzz test the processEntry method

4bc2cdf

Using three different configurations, fuzz test the processEntry method, seeded with varied log formats.

Simplify fuzzing of ProcessEntry

b30d3b7

Rather than fuzzing select configs individually, combine into a single fuzz test that generates inputs for most of the configurations and tests them all together.

Add fuzz testing for Gitleaks config

9226aa1

Test how the component handles the variety of Gitleaks configs that can be provided to the component. Note this does not test the toml parser itself, which is out of scope for this testing.

Fix bugs uncovered by fuzz testing

4deb959

When a Gitleaks TOML file contains either an empty regex or a regex that matches the empty string, the component would run out of memory when attempting to redact. This commit excludes both of these cases.

Add testdata to validate fixed bugs

3c35f7f

Update changelog to reflect fixes

af7887f

Merge branch 'main' of github.com:grafana/alloy into secret-filter-fuzz-testing

1040fd1

Reformat changelog

c6bc434

kelnage added the enhancement New feature or request label Feb 6, 2025

kelnage requested a review from a team February 6, 2025 09:25

kelnage self-assigned this Feb 6, 2025

kelnage requested a review from a team as a code owner February 6, 2025 09:25

mostafa approved these changes Feb 6, 2025


kelnage added 2 commits February 7, 2025 14:04

Fix fuzz-go to fuzz multiple funcs in single file

e4c0f52

Merge branch 'main' of github.com:grafana/alloy into secret-filter-fuzz-testing

68f6057

ptodev reviewed Feb 7, 2025

