-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop using zerolog/log.Logger and use logger from context instead #3078
Conversation
Stop using the global zerolog logger and use one that is passed through the context instead. When fleet-server is running normally context-logger defaults to the same as the global logger, however it is much easier to change for test cases so we can have log output that's tied to the specific test making it easier to track any errors in tests.
Change logger reload to clone the parent logger with a new output instead of creating a new logger. Fix integration tests. Use context.TODO instead of background where contexts are not passed.
Across various previous PRs we disabled setting the logger across a lot of integration tests due to race conditions. Re-enable across all integration tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very tedious pr where I mostly just replaced log.
with zerolog.Ctx(ctx).
@@ -84,7 +84,7 @@ func (d *Dispatcher) Subscribe(agentID string, seqNo sqn.SeqNo) *Sub { | |||
sz := len(d.subs) | |||
d.mx.Unlock() | |||
|
|||
log.Trace().Str(logger.AgentID, agentID).Int("sz", sz).Msg("Subscribed to action dispatcher") | |||
zerolog.Ctx(context.TODO()).Trace().Str(logger.AgentID, agentID).Int("sz", sz).Msg("Subscribed to action dispatcher") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO
is used in functions that don't take a context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we create a follow up issue for adding a context to these functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
srv, err := startTestServer(t, ctx) | ||
require.NoError(t, err) | ||
ctx = testlog.SetLogger(t).WithContext(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm setting the test logger after startTestServer
. If we set it before it can lead to some race conditions in the test code.
l.log = l.log.Output(out) | ||
l.sync = wr | ||
|
||
log.Logger = l.log | ||
zerolog.DefaultContextLogger = &l.log // introduces race conditions in integration test? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "biggest" change i made in this PR is here with how the logger is replaced on a reload operation.
This was one in an attempt to fix race conditions in the integration tests, it didn't resolve it, but it made it very clear what this part of the reload operation actually does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me, and thanks for dealing with the tedium of replacing or updating all function signatures to accept the new logger context. We do this (passing a logger
around) all over Kibana too, so I'm pretty comfortable with this pattern - especially since it makes writing tests/assertions for specific logging cases much easier. LGTM 🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀
This pr was to replace all global refs. We still pass |
SonarQube Quality Gate |
What is the problem this PR solves?
I tried to change the json schema generation tool to a version that supports serializing to
time.Time
, however there were a lot of test-failures.It was very difficult to track why tests were failing because a lot of fleet-server's codebase (like the bulker) outputs to zerolog's global logger, while other sections use embedded (or per-request) loggers and the output of tests is muddled and inconsistent to read.
Trying to change the global logger for a test case introduces a race-condition failure within the test.
If we use an alternate method of getting a "global" logger that we can adjust for tests it would make reading test output much easier.
How does this PR solve the problem?
Stop using the global zerolog logger and use one that is passed through the context instead.
Set context default to the same as the global-logger (the one that is controlled through the fleet-server's
logger
package) so that when fleet-server is running normally there is no change, however it is much easier to change for test cases so we can have log output that's tied to the specific test making it easier to track any errors in tests.Change how the fleet-server logger package reloads config when adjusting for new output
How to test this PR locally
make test