-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Instrumentation hooks for trace context propagation #48
Comments
In my mind, I think it's great that OpenTelemetry is trying to standardize more in this space but I also think the reality of the situation is that those standards are not stabilized, and who knows what it'll look like in 5 years. Given that, I think it's even more powerful for This gives us the best chance that the instrumentation itself is stable and can live on (since it's the thing that will be most interwoven in numerous codebases across the ecosystem), while the handlers can evolve to match the latest standards and best practices for observability. For example - imagine the |
This opentelemetry RFC might be useful when designing this open-telemetry/oteps#42 I'm not sure it belongs in |
Why wouldn't it belong in I don't think there's much to this API really - it's not trying to solve all the problems regarding context propagation, it's simply providing instrumentation hooks that can be widely adopted such that it's easy for a people to build & evolve solutions to context propagation w/o requiring code changes throughout the ecosystem. |
So from more technical point of view; |
Exactly |
Assuming that such
I think option c) fits the best the tracing use case because it does not break the critical path of the call and still applies other handlers.
|
|
I misunderstood this at first. Based on the Observability WG meeting discussion I know understand this to be like if an HTTP library had a callback |
My reservation on this proposal is that it would not be a good idea for libraries, such as an HTTP client, to assume that we want to pass context forward. So I think that it has be explicitly passed in. There may still be a need for standardization but I think it would be a bit harmful if libraries start assuming all context is shared. |
I'm inclined to agree with José on this. How would the client be able to discern at the library level when you intend to propagate context within your systems versus external calls where you may not want to leak that? It would seem that explicitly declaring behavior at the client implementation level gives the most control. I used the separate clients strategy in our systems and it worked well. |
I agree that I would not want context injected in all HTTP calls. I would want to have to do it explicitly, or perhaps by using a special client module where I clearly know that it has this behavior. One pretty common pattern is to build a module in your app code that’s specifically for talking to another external service. That gives a pretty easy point at which you can decide whether you want to inject context for that particular call. For example: https://github.com/spandex-project/spandex_example/blob/master/plug_gateway/lib/plug_gateway/backend_client.ex#L8 |
The approach the last few of you point to is definitely workable, but it foregoes any solution that works "out of the box" New Relic agents try to wire these types of things up for the user to make it such that achieving observability is as easy as possible. We take care of any security concerns by:
Given the push-back, I'm assuming that with other systems, people are passing sensitive information in the trace context itself? That itself feels dangerous to me but perhaps it's a pattern out there that people use? Leaving it all up to the user presents challenges as soon as we move outside of HTTP. Take for example database calls. There's no concept of "headers" so what some tracing packages do is add a comment to the SQL query itself that serializes the context (ex: https://docs.vividcortex.com/general-reference/query-tags/). I'm not clear how that would be achieved by the end user... |
I think most of the time, there's nothing particularly sensitive in the trace context (just trace/span IDs). It's possible to include arbitrary "baggage" data which would probably be more sensitive, so there would need to be a way to only return the non-sensitive information at least, which feels like it adds some complexity. I think the main issue I have is that I typically don't want magic things to happen that I didn't explicitly ask for and can't easily tell where they're coming from (this is one of the main reasons I like Elixir), so I prefer to have to manually inject trace context where I want it. That said, I think I'm getting off-topic on this thread because that's only one possible use for this proposed API. My main concern about the proposed API itself (as opposed to the intended use-case), is that it feels weird and potentially hard to troubleshoot what's going on when you're making a blind call to some unknown number of functions that have registered themselves to listen to that hook and have an opportunity to transform the data somehow and in an undefined order. By comparison, it would be a lot more clear what's going on if the HTTP client did something like how Tesla supports middleware for injecting trace context in request headers, or parsing some kind of context or metadata from the response. |
That's true but it doesn't seem within the scope of telemetry to try and enforce a middleware pattern on any library that wants to add this kind of instrumentation... I see that there's a variety of concerns with this, I'm going to ponder this for a while and see if I can come up with another approach |
Yeah the thing I'm struggling with is that I don't want to build a simpler-but-worse version of OpenTelemetry, because maybe then the answer is that library authors should just use the OpenTelemetry API dependency, since it's only a lightweight API and doesn't actually do anything unless the user chooses to include the SDK/implementation. The downside with that is that then there's some confusion about how to handle some libraries directly using OTel and others using Telemetry - would we need an OTel exporter that injects things back into Telemetry? 😬 At least for the cases of New Relic / AppSignal / Scout / etc. who want to have an easy integration for their customers, it seems like the right path is a deeper OTel integration, and it's "just" a matter of getting library authors to call It seems like what you're proposing here is a more-generic way for libraries to register that they would like to be involved whenever another library has an opportunity to inject HTTP headers (you're specifically suggesting If we're concerned about being HTTP-specific, then maybe it could be something like |
Yeah, this is exactly what I'm trying to accomplish. |
I'm thinking a bit about this again... It seemed like one major concern was that some systems might not want to "assume" that every outgoing call should pass the context around. I'd argue that this kind of decision should be the responsibility of the telemetry handler (ie: APM / observability library) that is attached to the I think it's important to make a clear distinction between the instrumentation hooks that will be built into libraries & the functionality of observability & monitoring... Instrumentation should be simple and stable - agnostic to most concerns so that it can survive in-tact as things around it change. That gives the observability solution the most flexibility & leaves it responsible for the job it's good at. Other concerns relating to the structure of data that would be returned / handling error cases / etc all seem solvable once we decide this is a path we want to take. We can structure it such that we provide enough safety for this to be relied upon, similarly to how existing handlers are managed & detached upon failure... |
I'm interested in seeing
telemetry
extended with an API to enable an additional type of instrumentation - the mutation of an outbound request / query with Distributed Trace context information.The general API could be implemented as an instrumentation hook that a library could build into their code that enables
telemetry
handlers to mutate specific data structures.Taking HTTP requests as an example, we could imagine an HTTP client library adding
telemetry
hooks:Then a handler could
attach
to[:httpoison, :request, :headers]
, and do some work in the handler...This would provide a general mechanism for outbound instrumentation, without hard-wiring to any vendor-specific instrumentation details.
Very open to discussions about strategy and naming!
Issues to sort out:
For context, this issue has been discussed before in #24 and beam-telemetry/telemetry_metrics#46
The text was updated successfully, but these errors were encountered: