End of http.client.duration span: headers received vs. body received #3519

antonfirsov · 2023-05-24T20:03:37Z


Another question regarding the ongoing implementation of http.client.duration in .NET (dotnet/runtime#84978, dotnet/runtime#85447):

.NET's HttpClient.SendAsync can operate in 2 modes:

- With HttpCompletionOption.ResponseContentRead, the call buffers the entire response body into the response object.
- With HttpCompletionOption.ResponseHeadersRead, the call returns as soon as the response headers are read. It is then the user's responsibility to read and parse the response body from the response object's stream. This enables (potentially very long-running) streaming use cases.
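
The difference between the two modes can be sketched as follows. This is a minimal illustration, not part of the .NET implementation under discussion: `StubHandler`, `CompletionOptionDemo`, and the `example.invalid` URL are hypothetical names introduced so the example runs without a network. It reports how much of the response body has been consumed at the moment `SendAsync` completes.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stub transport: answers every request with 200 OK and a
// caller-supplied body stream, so no real network traffic is needed.
public sealed class StubHandler : HttpMessageHandler
{
    private readonly Stream _body;

    public StubHandler(Stream body) => _body = body;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StreamContent(_body)
        };
        return Task.FromResult(response);
    }
}

public static class CompletionOptionDemo
{
    // Returns how many body bytes were already consumed when SendAsync completed.
    public static async Task<long> BytesConsumedAtReturnAsync(HttpCompletionOption option)
    {
        var body = new MemoryStream(new byte[1024]);
        using var client = new HttpClient(new StubHandler(body));
        using var request = new HttpRequestMessage(HttpMethod.Get, "http://example.invalid/");
        using var response = await client.SendAsync(request, option);
        return body.Position; // evaluated before the response (and the stream) is disposed
    }
}
```

With this stub, `ResponseContentRead` should report all 1024 bytes as consumed when the call returns, while `ResponseHeadersRead` should report 0, leaving the body for the caller to read.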

This leads to the question: when exactly should we report the end of an HTTP span when collecting http.client.duration?

1. When the response headers are read?
2. When the response content is read, including the HttpCompletionOption.ResponseHeadersRead case, where it's the caller's responsibility to read the response body stream? (Potentially a very long-running operation.)
3. At different times in each of the two modes: when HttpCompletionOption.ResponseHeadersRead is used, report the end of the span when the headers are read; otherwise, report it when the whole body is read.

We would like to receive guidance on this question!

/cc @JamesNK @lmolkova @noahfalk

lmolkova (Maintainer) · 2023-05-24T20:39:43Z

Great questions!

@trask @mateuszrzeszutek I wonder what we're doing on Java clients and if we found some good patterns there?

My understanding is that with HttpCompletionOption.ResponseHeadersRead and similar cases outside of .NET, it's prohibitively hard and expensive to take into account stream reading. We'd need to wrap the network stream and depend on the user actually reading it or disposing of it at the right time to end the span or measure duration.

The measured duration would be a function of how the user reads the stream rather than of network characteristics, and in the case of WebSockets or gRPC streaming it would become useless. Activity.Current behavior would also be questionable with code like the following:

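// Return as soon as the headers arrive; the body is streamed by the code below.
using var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
// do something long and potentially failing here
using var reader = new StreamReader(response.Content.ReadAsStream());
while (await reader.ReadLineAsync() is string line) {
  // do something long and potentially failing here
}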

My proposal is to end the span (measure duration) when the inner client handler returns the response.
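
A minimal sketch of what "measure when the inner handler returns" could look like in .NET, assuming a custom DelegatingHandler rather than the built-in instrumentation. `InnerHandlerTimingHandler`, `CannedResponseHandler`, and the `recordDuration` callback are hypothetical names used only for illustration; a real instrumentation would record to a span or histogram instead.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handler: times only the inner handler call, so user code that
// later reads the response stream is not part of the recorded duration.
public sealed class InnerHandlerTimingHandler : DelegatingHandler
{
    private readonly Action<TimeSpan> _recordDuration; // assumed span/metric sink

    public InnerHandlerTimingHandler(HttpMessageHandler inner, Action<TimeSpan> recordDuration)
        : base(inner)
    {
        _recordDuration = recordDuration;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            return await base.SendAsync(request, cancellationToken);
        }
        finally
        {
            // Runs on success and on failure, as soon as the inner handler completes.
            _recordDuration(stopwatch.Elapsed);
        }
    }
}

// Stand-in for the real transport so the sketch runs without a network.
public sealed class CannedResponseHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
        => Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
}
```

In an application the handler would wrap a real transport, for example `new HttpClient(new InnerHandlerTimingHandler(new HttpClientHandler(), d => Console.WriteLine(d)))`.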

Users or other libraries that use HTTP clients can add a logical encompassing operation and corresponding metrics that would track HTTP requests with all the retries and stream reading.

That is, we would get something like:

Storage.downloadBlob - logical operation - ok
- HTTP GET - ok
- Storage.readChunk - failure at offset 12345 (note that HTTP request was successful)
- HTTP GET - ok - retrying logical operation
- Storage.readChunk - ok

Then the only problem is the inconsistency between the ResponseHeadersRead and ResponseContentRead modes, where the duration might be quite different depending on the configuration.

Assuming we can always measure duration when headers are read (regardless of the mode), users would see gaps in traces (e.g. here).

They would see that the span ended before the call to the HTTP client returned and would not know what happened in between.

So, even though it's not perfect, I think ending the span and recording the duration when the inner handler returns, regardless of the mode, is the best we can do. We should also document it and explain that the duration depends on the mode and measures the inner handler's response time.
Additionally, we can entertain the idea of instrumenting network streams, but there is nothing in OTel for that for now.

1 reply

trask May 24, 2023
Maintainer

Java HTTP client instrumentation ends the HTTP client span before any user callbacks are executed (so that user code execution won't be included in the HTTP client span duration).

I believe this matches @lmolkova's proposal:

> My proposal is to end the span (measure duration) when the inner client handler returns the response.

JamesNK · 2023-05-25T00:38:59Z


> The measured duration would be a function of how the user reads the stream rather than of network characteristics, and in the case of WebSockets or gRPC streaming it would become useless.

Why would it be useless? It's an accurate measurement of the duration of the HTTP request. The request is still happening - a TCP/UDP socket is open in the app and sends and receives data. In HTTP/2, the active stream is still counted against the number of active streams for that H2 connection. And so on. Also, the server still has the request open on its side. It would be strange if, for example, a server reports it has 10 active requests while the client has 0.

In my opinion, duration should last until the client reads the response stream to completion (graceful completion) or the request is aborted by the client/server for some reason (explicit client abort, timeout, server abort, etc). This makes sense from the perspective that it's the real duration of the request. That behavior is consistent with other tooling, such as the network view in a browser, which measures a request from when it starts to when it's finished downloading.

If you consider a request ended when the response headers are received on the client, does the request end on the server when it sends the response headers? ASP.NET Core has added counters like http.server.active_requests and http.server.request_duration. They continue until the server has finished with the request and released all its resources, not when the server starts sending the response.


lmolkova (Maintainer) · 2023-05-25T04:16:20Z

> Why would it be useless? It's an accurate measurement of the duration of the HTTP request. The request is still happening - a TCP/UDP socket is open in the app and sends and receives data. In HTTP/2, the active stream is still counted against the number of active streams for that H2 connection.

OK, let me be more precise. Its usefulness heavily depends on the user application. Its duration and failure rate factor in how the application reads the stream, and in the general case we can't say what exactly it represents.

In a perfect world, we should be able to measure both things: time to first byte and time to last byte. That raises the following questions:

1. Can we reliably measure time to first byte?
   - Can we do it in .NET in ResponseContentRead mode?
2. Can we reliably measure time to last byte?
   - I believe the answer to q2 is no. Maybe we can do it in .NET while keeping the perf overhead reasonable (can we?), but it'd be much harder with instrumentations that monkey-patch or have very limited control over things.
3. Which of the two does HTTP request duration represent?
   - It seems neither (because today p1 and p2 are not possible). So it represents the duration of a client call only (outside of user application code). It also feels consistent with the general tracing approach, where we measure user interaction with a library/framework rather than the underlying stack.

Assuming we can find a reasonable approach to measure time to the first and last byte, I'm happy to entertain options and brainstorm how to represent them both.

1 reply

lmolkova May 25, 2023
Maintainer

> If you consider a request ended when the response headers are received on the client, does the request end on the server when it sends the response headers?

Same for the server: if the server only measures time to last byte, these times heavily depend on the client's network and how the client reads the stream. It's not bad, it's just not enough to provide observability into where the problem is (slow server or slow network).

JamesNK · 2023-05-25T06:02:05Z


> can we reliably measure time to first byte

In .NET, yes. The first bytes returned from the response are always response HTTP headers. Getting that information is easy.

> can we reliably measure time to last byte

In .NET, we believe so. We're investigating wrapping the response content and stream and observing when it is disposed of or read to the end. If we can't get that information today, .NET could investigate adding a new API to HttpClient that makes it possible to observe the end of an HTTP request.
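
The wrapping idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the approach .NET ended up using: `CompletionObservingStream` and its `onCompleted` callback are hypothetical, and a production version would also need to cover the span-based read overloads, `CopyToAsync`, and abort/error paths.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical read-only wrapper: invokes onCompleted exactly once, when the
// inner stream is read to the end or when the wrapper is disposed.
public sealed class CompletionObservingStream : Stream
{
    private readonly Stream _inner;
    private readonly Action _onCompleted;
    private int _completed; // 0 = pending, 1 = callback already invoked

    public CompletionObservingStream(Stream inner, Action onCompleted)
    {
        _inner = inner;
        _onCompleted = onCompleted;
    }

    public override bool CanRead => _inner.CanRead;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = _inner.Read(buffer, offset, count);
        if (read == 0 && count > 0) Complete(); // 0 bytes means end of stream
        return read;
    }

    public override async Task<int> ReadAsync(
        byte[] buffer, int offset, int count, CancellationToken cancellationToken)
    {
        int read = await _inner.ReadAsync(buffer, offset, count, cancellationToken);
        if (read == 0 && count > 0) Complete();
        return read;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            Complete(); // disposed before the end was reached (or after; no-op then)
            _inner.Dispose();
        }
        base.Dispose(disposing);
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    private void Complete()
    {
        if (Interlocked.Exchange(ref _completed, 1) == 0) _onCompleted();
    }
}
```

The callback is where a "time to last byte" measurement would stop its timer; the `Interlocked.Exchange` guard keeps it from firing twice when a fully read stream is later disposed.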

> It also feels consistent with the general tracing approach, where we measure user interaction with a library/framework rather than the underlying stack.

But the library/framework almost always needs the response body to use the HTTP request. Assuming we're making a RESTful API call to a product API: stopping the duration timer at the point when HTTP headers are returned isn't the point when the request can be used by the client. You then need to download the JSON response body that contains the product information. That might be a kilobyte of JSON and happen quickly. Or it might be megabytes of JSON and take some time.

If the request duration timer stops when response headers are received, and someone is downloading large JSON responses, or their network is slow, then they're getting bad data. Metrics would tell them that HTTP request durations are short. They will conclude that HTTP requests cannot be why their app is slow, while in reality the HTTP request to download a large file is taking 20 ms to get the first bytes and then 30 seconds to download the remaining 100 MB.

> Same for the server: if the server only measures time to last byte, these times heavily depend on the client's network and how the client reads the stream. It's not bad, it's just not enough to provide observability into where the problem is (slow server or slow network).

The duration still includes the time of sending the request, e.g. a POST with a 200KB JSON payload, and whatever latencies are involved in the network. The overhead of the network isn't being eliminated, just the response download overhead.

Whether the server is slow or the network is slow still adds up to the same thing: the HTTP client in the app is making HTTP requests, and they're taking a long time.

1 reply

lmolkova May 25, 2023
Maintainer

On .NET we can measure both - cool!

It's not only a question of knowing whether something is slow; we also need to know why. What I'm saying is that both times are important and there is no need to pick one or the other. Let's define which of the two HTTP duration means, and record both.

If we only record time to the first byte, users/frameworks will need to write custom code to measure time to last byte, which might be challenging on the server side.

If we only record time to the last byte:

1. On the client side, users won't be able to measure time to first byte at all (unless they switch to headers mode). So we have to measure it.
2. On the server side, users will need to write custom middleware/filters/etc. to measure their code's performance.

Because of p1, and the general complications of measuring time to last byte outside of .NET, I argue that HTTP request duration should represent time to first byte. And we should keep it consistent on the server side.

If it makes sense, we can add time-to-last-byte as a new layer but need to define the semantics.

antonfirsov (Author) · 2023-05-25T16:31:27Z

Reading through the discussions above, IMHO this is a serious ambiguity in the specs, so I decided to open an issue: #3520.

2 replies

lmolkova May 25, 2023
Maintainer

Agreed, and thanks for creating the issue! I added this topic to the agenda for the Semconv WG on Monday at 10am; feel free to join if you can.

antonfirsov May 25, 2023
Author

Unfortunately I won't be able to make it to the meeting, but I really appreciate the invitation :) I think my position is mostly clear from the issue description: if http.client.duration represents "time to first byte", it would be nice to have another standard metric for the whole HTTP communication span. I tend to agree with @JamesNK that it feels weird if there's only one duration metric and its timer is expected to stop while the actual HTTP request is still open.

noahfalk · 2023-05-26T22:17:52Z


Just wanted to chime in to make sure we don't consider the metric in isolation. Right now in HttpClient we have metrics, logs, and distributed tracing, all of which include some measurement of time. Presumably we want all the signals to agree on what abstraction they are measuring, or we should use distinct naming to make it clear that we are measuring different abstractions.
