Support zipkin format instead (or along with) jaeger #95

Open
jcchavezs opened this issue Dec 31, 2020 · 4 comments

@jcchavezs
Contributor

jcchavezs commented Dec 31, 2020

Currently, debugging and reproducing issues in the pipeline is very hard. If we need to understand why span data turned into certain values, the most basic thing to do is try to reproduce the error. At the moment we don't have a way to do that: there is no way to obtain the original span that was ingested, either at the end of the pipeline or in the storage. This makes debugging almost impossible and brings far more complex proposals to the table, like enabling debug mode in prod.

The current approach to overcome this issue is to deploy a jaeger instance along with hypertrace and download the data using the jaeger download button. While this feels like an (already cumbersome) solution, it isn't, because what you download isn't exactly what you ingested. The formats are different: jaeger ingests zipkin format (like other zipkin forks) or thrift, and neither of them is what you download from the jaeger UI (a JSON payload). To work around this temporarily I created this tool, https://github.com/jcchavezs/jaeger2zipkin, which converts a trace downloaded from jaeger into a zipkin trace so that it can be re-ingested, but that isn't 100% reliable because there are two transformations in the middle.

What would truly enable full debuggability is being able to download the same format we ingest. For that we would need to support ingesting zipkin data (which is what most of our agents use for reporting) and also keep the raw payload in the messages along the kafka pipeline. We don't even need to understand the zipkin format (e.g. we don't need to serialize/deserialize the raw payload on every step in the pipeline); we just need to store it and serve it in the API so it can be downloaded and re-ingested.
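To illustrate the "carry the raw payload without parsing it" idea, here is a minimal sketch. `SpanEnvelope`, `enrich` and `export_raw` are hypothetical names made up for this example; they are not hypertrace types or APIs, and the real pipeline messages are not Python objects.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class SpanEnvelope:
    """Hypothetical pipeline message: parsed fields plus the untouched ingest payload."""
    span_id: str
    attributes: Dict[str, str]
    raw_payload: bytes      # exact bytes received at ingestion, never parsed downstream
    raw_content_type: str   # e.g. "application/json" for a zipkin v2 JSON body


def enrich(envelope: SpanEnvelope, key: str, value: str) -> SpanEnvelope:
    """Example pipeline stage: updates parsed attributes, passes the raw bytes through."""
    new_attrs = {**envelope.attributes, key: value}
    return replace(envelope, attributes=new_attrs)


def export_raw(envelope: SpanEnvelope) -> bytes:
    """What a download endpoint could serve: the original bytes, ready to re-ingest."""
    return envelope.raw_payload


# Assumed example payload in zipkin v2 JSON shape.
ingested = b'[{"traceId":"463ac35c9f6413ad","id":"a2fb4a1d1a96d312","name":"get /"}]'
env = SpanEnvelope("a2fb4a1d1a96d312", {"http.method": "GET"}, ingested, "application/json")
env = enrich(env, "service.name", "frontend")
```

No stage deserializes `raw_payload`, so whatever the stages do to the parsed attributes, the exported bytes stay identical to what was ingested.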

This, along with a download button in the UI, would allow us to ask users to send us the original data so we can debug the issues locally, which is what other DT solutions do.

Ping @kotharironak @tim-mwangi @JBAhire @rish691 @buchi-busireddy @sanjay-nagaraj

@jcchavezs jcchavezs changed the title Support zipkin format instead/along jaeger Support zipkin format instead (along) jaeger Dec 31, 2020
@jcchavezs jcchavezs changed the title Support zipkin format instead (along) jaeger Support zipkin format instead (or along with) jaeger Dec 31, 2020
@buchi-busireddy
Contributor

@jcchavezs Doesn't including the original span as a blob in every span increase the Kafka message size drastically? Are you suggesting enabling that conditionally or always?
I thought that if we finish the export feature you logged earlier, we would be able to get the original span back when the trace is exported. Wouldn't we?

@jcchavezs
Contributor Author

jcchavezs commented Jan 3, 2021 via email

@buchi-busireddy
Contributor

I definitely like the idea and see the necessity. I'm just thinking of different ways to implement it.
Thinking out loud, another major option to consider is storing the raw spans in a KV store and accessing them for the export feature. By retaining only the most recent spans and capping the KV store, we can make sure we don't replicate the full data and, at the same time, don't increase the Kafka message size drastically.
There could be other ideas too. I'd recommend a feature/design doc for this so that we can finalize the approach and implement it.
What do you think?
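
A rough in-memory sketch of the capped KV store idea above, assuming eviction by insertion order (oldest first). `CappedRawSpanStore` is a hypothetical name; a real deployment would presumably use an external store with its own size or TTL limits.

```python
from collections import OrderedDict
from typing import Optional


class CappedRawSpanStore:
    """Hypothetical bounded store: keeps only the most recently written raw spans."""

    def __init__(self, max_entries: int) -> None:
        if max_entries <= 0:
            raise ValueError("max_entries must be positive")
        self._max_entries = max_entries
        self._data: "OrderedDict[str, bytes]" = OrderedDict()

    def put(self, span_id: str, raw_payload: bytes) -> None:
        # Re-writing an existing span counts as "most recent" again.
        if span_id in self._data:
            self._data.move_to_end(span_id)
        self._data[span_id] = raw_payload
        # Evict the oldest entries once the cap is exceeded.
        while len(self._data) > self._max_entries:
            self._data.popitem(last=False)

    def get(self, span_id: str) -> Optional[bytes]:
        # Returns None when the span was never stored or has been evicted.
        return self._data.get(span_id)


store = CappedRawSpanStore(max_entries=2)
store.put("span-a", b"raw-a")
store.put("span-b", b"raw-b")
store.put("span-c", b"raw-c")  # cap exceeded, "span-a" is evicted
```

The trade-off is visible here: the export feature can only return the original payload for spans that are still inside the retention window.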

@findingrish
Contributor

So we ingest data in Zipkin format in the platform and convert it into RawSpan (our internal format) at entry.

> To overcome this issue temporary I created this tool https://github.com/jcchavezs/jaeger2zipkin which allows you to convert jaeger downloaded trace into zipkin trace and then be able to re-ingest it but that isn't 100% reliable because there are two transformations in the middle.

What do we miss in this conversion?
Exploring along these lines, can we add some metadata to the RawSpan message and see if the Zipkin <-> RawSpan conversion can be made lossless? Then the download button in the UI just needs to invoke this converter.
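
One way to picture the "metadata makes the conversion lossless" idea is the sketch below. It uses plain dicts as a simplified stand-in for RawSpan; the field mapping (`MAPPED`) and the `zipkin_extra` metadata key are assumptions for illustration, not the actual RawSpan schema.

```python
import json
from typing import Any, Dict

# Hypothetical mapping from zipkin field names to internal field names.
MAPPED = {"traceId": "trace_id", "id": "span_id", "name": "operation"}


def zipkin_to_internal(zipkin_span: Dict[str, Any]) -> Dict[str, Any]:
    """Convert mapped fields and stash every unmapped field in a metadata entry."""
    internal = {dst: zipkin_span[src] for src, dst in MAPPED.items() if src in zipkin_span}
    leftover = {k: v for k, v in zipkin_span.items() if k not in MAPPED}
    internal["zipkin_extra"] = json.dumps(leftover, sort_keys=True)
    return internal


def internal_to_zipkin(internal: Dict[str, Any]) -> Dict[str, Any]:
    """Rebuild the zipkin span from the metadata entry plus the mapped fields."""
    zipkin_span = json.loads(internal.get("zipkin_extra", "{}"))
    for src, dst in MAPPED.items():
        if dst in internal:
            zipkin_span[src] = internal[dst]
    return zipkin_span


original = {
    "traceId": "463ac35c9f6413ad",
    "id": "a2fb4a1d1a96d312",
    "name": "get /",
    "timestamp": 1609430400000000,
    "tags": {"http.method": "GET"},
}
restored = internal_to_zipkin(zipkin_to_internal(original))
```

Note that this round trip is lossless at the level of JSON values, not byte for byte: key order and whitespace of the ingested payload are not preserved, which may or may not matter for reproducing an issue.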

I can create a doc to further this discussion.

4 participants