Skip to content
This repository has been archived by the owner on Jun 26, 2024. It is now read-only.

Handling Clock-Skews in Distributed Traces #90

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

suddendust
Copy link

@suddendust suddendust commented Jun 30, 2021

Description

This PR is to handle clock-skews as described here. It is a work-in-progress as the proposal is yet to be approved.

Spans are changed in gateway-service as a part of the span transformation pipeline. A span transformation pipeline registers a list of handlers that transform a supplied list of spans. One such handler is ClockskewAdjuster.java. I chose this design as this takes care of any new transformations we might have to apply to spans in the query layer.

The design assumes that there can be a bunch of strategies to adjust clock-skew (new heuristics can be added in future). Right now we have two strategies - A NoOpClockskewAdjuster adjuster (doesn't do any adjustment) and a JaegerBasedClockskewAdjuster (copies what Jaegar does). What adjuster to use can be controlled from the app config. Note that this PR does not handle controlling this feature from the UI. That is, once the config has been specified, all spans will then be adjusted.

This PR is still in early stages. JaegerBasedClockskewAdjuster.java's implementation is a bit terse and needs to be improved. I raised this so that everyone is updated on how we are implementing this feature.

Testing

ToDo

Checklist:

  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • Any dependent changes have been merged and published in downstream modules

Documentation

ToDo

1. Added ClockskewAdjuster.java, NoOpClockskewAdjuster.java and JaegarBasedClockskewAdjuster.java.
2. Added lombok dependency to gateway-service.
2. Added SpanTransformationPipeline.java.
3. Added factory for ClockskewAdjuster.java.
@suddendust suddendust requested a review from a team June 30, 2021 10:51
@jcchavezs
Copy link

I think in this kind of fixes it is crucial to have the test cases to understand what use cases we are addressing, otherwise it is hard to follow (yeah, time is evil).

@codecov
Copy link

codecov bot commented Jun 30, 2021

Codecov Report

Merging #90 (9ef3143) into main (5f4c8f9) will decrease coverage by 1.03%.
The diff coverage is 25.96%.

❗ Current head 9ef3143 differs from pull request most recent head 5fd806e. Consider uploading reports for the commit 5fd806e to get more accurate results
Impacted file tree graph

@@             Coverage Diff              @@
##               main      #90      +/-   ##
============================================
- Coverage     77.07%   76.03%   -1.04%     
- Complexity     1096     1106      +10     
============================================
  Files           106      111       +5     
  Lines          4981     5083     +102     
  Branches        425      430       +5     
============================================
+ Hits           3839     3865      +26     
- Misses          954     1029      +75     
- Partials        188      189       +1     
Flag Coverage Δ
unit 76.03% <25.96%> (-1.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...way/service/span/JaegarBasedClockskewAdjuster.java 0.00% <0.00%> (ø)
...rtrace/gateway/service/span/ClockskewAdjuster.java 35.71% <35.71%> (ø)
...ce/gateway/service/span/NoOpClockskewAdjuster.java 50.00% <50.00%> (ø)
...g/hypertrace/gateway/service/span/SpanService.java 11.32% <60.00%> (+1.61%) ⬆️
...trace/gateway/service/span/ClockskewAdjusters.java 63.63% <63.63%> (ø)
...teway/service/span/SpanTransformationPipeline.java 66.66% <66.66%> (ø)
...hypertrace/gateway/service/GatewayServiceImpl.java 18.93% <100.00%> (+0.48%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5f4c8f9...5fd806e. Read the comment docs.

@suddendust
Copy link
Author

suddendust commented Jun 30, 2021

I think in this kind of fixes it is crucial to have the test cases to understand what use cases we are addressing, otherwise it is hard to follow (yeah, time is evil).

Will be adding the tests now. Raised this a bit earlier to keep everyone updated.

@suddendust suddendust closed this Jun 30, 2021
@suddendust suddendust reopened this Jun 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants