-
Notifications
You must be signed in to change notification settings - Fork 257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
loki.secretfilter: utilize fuzz testing for complex input #2630
base: main
Are you sure you want to change the base?
Conversation
Using three different configurations, fuzz test the processEntry method, seeded with varied log formats.
Rather than fuzzing select configs individually, combine into a single fuzz test that generates inputs for most of the configurations and tests them all together.
Fuzz test the component config and executing it against some sample log lines. The goal here is not to fuzz test Alloy's config parsing, nor go's regex parsing, but focusing on ensuring a valid config doesn't cause the component to crash when it is being used.
Test how the component handles the variety of Gitleaks configs that can be provided to the component. Note this does not test the toml parser itself, which is out of scope for this testing.
When a Gitleaks toml file contains either an empty regex or a regex that matches the empty string, the component would out-of-memory when attempting to redact. This commit excludes both of these cases.
If a rule regex could match an empty string, replacing it could quickly lead to significant memory usage. Since an empty string cannot meaningfully be a secret, add a check to validate this and skip it. Found via go fuzzing of the Gitleaks config, sample included.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Well done! 👏
@@ -0,0 +1,4 @@ | |||
go test fuzz v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if these are needed, and would be happy to hear other reviewers' opinions. I guess including them as test cases are enough.
// ignore parsing errors, as we aren't fuzz testing the Alloy config parser | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we're not fuzz testing the config parser, wouldn't we still want to know if this fails? Presumably it should succeed so that we can test other code? And if it fails silently, it could mean the fuzz tests didn't really run?
I guess that testConfigs
contains some invalid configurations and we want to ignore them, but wouldn't it be cleaner to only have valid configs as input and to not have config
as an input to f.Fuzz
?
} | ||
|
||
entry := loki.Entry{Labels: model.LabelSet{}, Entry: logproto.Entry{Timestamp: time.Now(), Line: log}} | ||
c.processEntry(entry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we not also check the output on ch1
?
string("\n\t\tforward_to = []\n\t\tredact_with = \"<ALLOY-REDACTED-SECR<ALLOY-ET:$SECRET_NAME>\"\n\t") | ||
string("\n\t\ttitle = \"gitleaks custom config\"\n\n\t\t[[rules]]\n\t\tid = \"my-fake-secret\"\n\t\tdescription = \"Identified a fake secret\"\n\t\tregex = '''(?i)\\b(fakeSecret\\d\", \"faiption ='|\\\"|\\n|\\r||;]|$)'''\t\n\t\t[allowlist]\n\t\tregexes = [\"abc\\\\d{3}\", \"fakeSecret[9]{5}\"]\n\t") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm new to fuzz testing - what sorts of variations of these configs is Go actually going to come up with? E.g. if it just puts in some random malformed string as a config, how do we know we should reject it? And even for secrets, how can we make sure in our test that we filter the secrets we want to filter?
PR Description
Utilizing go's built-in fuzzing framework, test three major parts of the component that accept complex data (the log line, the component config, and the Gitleaks config), to help provide evidence that they operate correctly in many different configurations.
Which issue(s) this PR fixes
Notes to the Reviewer
The fuzzing identified three issues with the component that have been fixed as part of these commits. I have included the testdata the fuzzing framework generated, but I was wondering whether or not you felt they were useful to include or if I should simply build unit tests to validate their findings instead.
PR Checklist