Freshness tracker verbosity could be improved for errors and debugging #59

Open
jyates opened this issue Jun 29, 2022 · 0 comments

For example, when a consumer is not available from Burrow, we dump a large error message with a full stack trace into the logs:

2022-06-29 09:03:52 ERROR [main] c.t.d.c.f.ConsumerFreshness:312 - Failed to read Burrow status for consumer example.missing.consumer. Skipping
java.io.IOException: Response was not successful: Response{protocol=http/1.1, code=404, message=Not Found, url=http://my.burrow/v3/kafka/my-cluster/consumer/example.missing.consumer/lag}
        at com.tesla.data.consumer.freshness.Burrow.request(Burrow.java:95)
        at com.tesla.data.consumer.freshness.Burrow.getConsumerGroupStatus(Burrow.java:111)
        at com.tesla.data.consumer.freshness.Burrow$ClusterClient.getConsumerGroupStatus(Burrow.java:144)
        at com.tesla.data.consumer.freshness.ConsumerFreshness.measureConsumer(ConsumerFreshness.java:307)
        at com.tesla.data.consumer.freshness.ConsumerFreshness.measureCluster(ConsumerFreshness.java:271)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.stream.ReferencePipeline$11$1.accept(ReferencePipeline.java:440)
        at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
        at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
        at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
        at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)

However, a consumer can legitimately be missing lag information when Burrow is configured with include/exclude filters, so these error messages just clog the logs.
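
One possible direction, as a minimal sketch only: treat a 404 from Burrow as an expected condition and log it as a single line at a lower level, while keeping the full stack trace for every other failure. Everything named here (`BurrowLogSketch`, `BurrowHttpException`, `levelFor`, `logFailure`) is hypothetical and not taken from the freshness tracker; `java.util.logging` is used only to keep the sketch self-contained, and the project's actual logging framework may differ.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class BurrowLogSketch {
    private static final Logger LOG = Logger.getLogger(BurrowLogSketch.class.getName());

    /** Hypothetical exception that keeps the HTTP status code so callers can branch on it. */
    static class BurrowHttpException extends IOException {
        final int code;

        BurrowHttpException(int code, String message) {
            super(message);
            this.code = code;
        }
    }

    /** A 404 is expected when Burrow filters out a consumer, so it gets a quiet level. */
    static Level levelFor(IOException e) {
        if (e instanceof BurrowHttpException && ((BurrowHttpException) e).code == 404) {
            return Level.FINE;
        }
        return Level.SEVERE;
    }

    /** Logs one line for the expected case and the full stack trace otherwise. */
    static void logFailure(String consumer, IOException e) {
        Level level = levelFor(e);
        if (level == Level.SEVERE) {
            LOG.log(level, "Failed to read Burrow status for consumer " + consumer + ". Skipping", e);
        } else {
            LOG.log(level, "No Burrow status for consumer " + consumer + " (HTTP 404). Skipping");
        }
    }
}
```

With something like this, the missing-consumer case would only show up when fine-grained logging is enabled, and unexpected I/O failures would still be reported loudly.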

Conversely, it's hard to diagnose a bug for a consumer if you don't know what the freshness tracker is seeing. For example, a consumer shows increasing lag, but Burrow and Kafka both say that it is up to date on the latest commit (this occurred recently). If this persists past a freshness-tracker restart, something is wonky in the tracker, and you would want to turn on some debug logging (even if it is verbose) to see what is going on.
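
As a rough illustration of the kind of debug output that could help here, the sketch below logs the raw values the tracker would be working from for each partition. The class, method names, and fields (`FreshnessDebugSketch`, `describe`, `logObservation`, `enableDebug`, and the offset parameters) are assumptions for illustration, not the tracker's actual data model, and `java.util.logging` again stands in for whatever logging framework the project uses.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class FreshnessDebugSketch {
    private static final Logger LOG = Logger.getLogger(FreshnessDebugSketch.class.getName());

    /** Formats the raw values for one partition so they can be compared with Burrow and Kafka. */
    static String describe(String consumer, String topic, int partition, long committed, long latest) {
        return String.format(
            "consumer=%s topic=%s partition=%d committedOffset=%d latestOffset=%d lag=%d",
            consumer, topic, partition, committed, latest, latest - committed);
    }

    /** Uses the Supplier overload so the message is only built when FINE is enabled. */
    static void logObservation(String consumer, String topic, int partition, long committed, long latest) {
        LOG.fine(() -> describe(consumer, topic, partition, committed, latest));
    }

    /**
     * Toggles verbose output for this logger. With java.util.logging, the attached
     * handlers must also allow FINE for the messages to be printed.
     */
    static void enableDebug(boolean verbose) {
        LOG.setLevel(verbose ? Level.FINE : Level.INFO);
    }
}
```

A switch like `enableDebug` could be driven by a command-line flag or configuration entry, so the verbose output stays off by default and is only turned on while investigating a specific consumer.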
