Possible changes in web observations #32897
I also have a question about "active" metrics (for example, http.client.requests.active or http.server.requests.active). Should I create a separate issue for it? The observation is started at the beginning of the request (server or client), and that is when the LongTaskTimer for the "xxx.active" metric is created, together with its tags. At that point only default tag values are available (for example, for the HTTP status code) from org.springframework.http.server.reactive.observation.DefaultServerRequestObservationConvention or org.springframework.web.reactive.function.client.DefaultClientRequestObservationConvention, and the LongTaskTimer tags are not updated in onStop. Is this a problem, or is it the intended behavior? Related code in Micrometer: |
Thanks for getting in touch, but it feels like this is a question that would be better suited to Stack Overflow. As mentioned in the guidelines for contributing, we prefer to use the issue tracker only for bugs and enhancements. Feel free to update this issue with a link to the re-posted question (so that other people can find it) or add some more details if you feel this is a genuine bug. If you intended to report an issue, then this should be one issue with instructions to reproduce. As it stands, this issue is not actionable. |
Also see micrometer-metrics/micrometer#5147. As Stéphane pointed out already, you can provide a sample application that demonstrates the problem, if there is one. As for |
This example shows that the xxx_active metrics are created only with the default values from DefaultServerRequestObservationConvention and DefaultClientRequestObservationConvention (because they are created when the observation starts). (Use App.kt to start the application, TestCheck.kt to query test endpoint, It also shows that fields in AbstractServerHttpResponse and ServerRequestObservationContext can be accessed from different threads. This is reproduced by using publishOn in the custom filter (TestFilter), but even without it some fields are read and written from different threads; you can see this in debug mode. There was already a concurrency issue in AbstractServerHttpResponse, #27587, where commitActions was changed to a CopyOnWriteArrayList. If you call TestCheck with, for example, numThreads=100 and numRequests=10000 (the values can vary), metrics with outcome="UNKNOWN" are created. This is the same issue as #31388 for the status code, but the changes were made only for status. That's why the outcome becomes UNKNOWN if the connection is aborted. All of these points reproduce with Spring Boot 3.1.10 and 3.2.4. I don't know how to reproduce null response in Lines 416 to 426 in 489d18a
Line 160 in 489d18a
But it looks like the observation should be stopped here, because it is started in doFirst. Can you please have a look? Or should I post it on Stack Overflow? |
This is effectively conflating 3 different problems in a single issue; we'll try to work from here. |
It's not ideal, but this is the expected behavior. Long task timers collect tags as soon as the observation is started. In the case of HTTP server observations, nothing much is available at that point besides the HTTP request basics. The only way to collect more data would be to start the observation only when the request is mapped to a controller method - this means that we would not measure HTTP routing for regular timers - that would be incorrect in my opinion. Micrometer made it possible to disable such timers and Spring Boot has a dedicated option for this. "Active" timers make sense in many cases, but I agree that here this has little value. |
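To illustrate why the "active" meter never sees the final tags, here is a minimal plain-Java sketch. It is a hypothetical model, not Micrometer or Spring code: ActiveTimerSketch, Context, ActiveSample and tagsAtStop are names invented for this illustration, and the "UNKNOWN" value is only a placeholder for whatever default the convention reports before a response exists. The one assumption taken from the comment above is that tags are copied once, when the observation starts.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model (not Micrometer code) of tags being captured at observation start. */
public class ActiveTimerSketch {

    /** Mutable per-request state, standing in for an observation context. */
    public static final class Context {
        public final Map<String, String> tags = new LinkedHashMap<>();
    }

    /** Stand-in for a started long task timer sample: it snapshots the tags it sees at start. */
    public static final class ActiveSample {
        private final Map<String, String> tagsAtStart;

        public ActiveSample(Context context) {
            // Copy the tags now; later changes to the context are not reflected here.
            this.tagsAtStart = Collections.unmodifiableMap(new LinkedHashMap<>(context.tags));
        }

        public Map<String, String> tags() {
            return tagsAtStart;
        }
    }

    /** Stand-in for a regular timer: it reads the tags only when the observation stops. */
    public static Map<String, String> tagsAtStop(Context context) {
        return new LinkedHashMap<>(context.tags);
    }

    public static void main(String[] args) {
        Context context = new Context();
        context.tags.put("method", "GET");
        context.tags.put("status", "UNKNOWN"); // placeholder: no response yet

        ActiveSample active = new ActiveSample(context); // observation starts

        context.tags.put("status", "200"); // the response is produced later

        System.out.println("active:  " + active.tags());
        System.out.println("stopped: " + tagsAtStop(context));
    }
}
```

In this model the active sample keeps status=UNKNOWN while the tags read at stop time show status=200, which mirrors the difference between the "xxx.active" meter and the regular timer described above.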
It's expected to have objects read/written from different threads as this is the concurrency model chosen by our Reactive stack. State should not be read/written concurrently by different threads though. #27587 did not solve a concurrency problem, see #27587 (comment). #31388 did not solve a concurrency issue either; we have just refined the "UNKNOWN" outcome to be only triggered if the response was not fully written to the network. This reduces the amount of "false positives" in metrics. As you can see, the fix did not change concurrency visibility for state nor switch to a concurrent collection implementation.
Indeed, it seems that using a
I don't believe this is due to concurrent modifications though. In debug mode, I can see that Scheduling the handling of the HTTP exchange on an executor (and not on the Netty workers) is quite surprising to me and this could be the cause of the problem here, especially on a bounded executor. Maybe @simonbasle or @rstoyanchev have an opinion about this? |
Now about the last point in this issue
This is a nullability refinement introduced in c531a8a#diff-564f3e4af414eeaaef0b11ac1d0394443b248683e5000bde0b225576ab991c8c. Effectively, at this point, if the observation has been started, the response cannot be null. So unless you can demonstrate a concrete use case where a problem happens with this, there is nothing to be fixed in my opinion. |
Yes, it did not solve a concurrency issue. This issue fixed UNKNOWN status. But the changes were made only for status (not outcome). That's why if the connection is aborted, outcome becomes UNKNOWN. Is it ok? |
No, it's the opposite. If the server receives a CANCEL signal, this means the connection has been closed/aborted by the client. As a result, we mark the observation context as "connection aborted" and this is later reported as "outcome unknown" in the observation. #31388 merely reduced the number of those by being more lenient: we consider that "outcome: UNKNOWN" should happen when the connection is closed by the client and the response has not been flushed already. In your case, your |
This logic was applied only for status, not outcome. Outcome is always unknown for aborted context. Lines 110 to 117 in e9fcb21
Lines 153 to 160 in e9fcb21
|
Sorry, in my previous comments I should have mentioned "status: UNKNOWN" instead of "outcome: UNKNOWN". This is what #31388 was about. If the response is committed, it means that we're confident that the response status would be what it is (so, not UNKNOWN) even if the client does not receive the response. Now, you're asking about making the "outcome" not "UNKNOWN" in those cases. If the connection is aborted, this can mean that:
From the server's point of view, we cannot know which case it is. What "outcome" value would you use for this case? 1) would be "SUCCESSFUL", but 2) would definitely be a "FAILURE". Please advise. |
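The difference between the two tags discussed in this thread can be sketched in plain Java. This is a simplified, hypothetical model of the behavior as described in the comments, not the framework source: AbortedOutcomeSketch and its method signatures are invented for the illustration, and the outcome names for each status series are an assumption. The only rules taken from the discussion are that status is UNKNOWN when the connection is aborted before the response is committed, while outcome is UNKNOWN whenever the connection is aborted.

```java
/** Hypothetical model of how the "status" and "outcome" tags differ for aborted connections. */
public class AbortedOutcomeSketch {

    /** Status tag: UNKNOWN only if the client went away before the response was committed. */
    public static String status(boolean connectionAborted, boolean responseCommitted, Integer statusCode) {
        if (connectionAborted && !responseCommitted) {
            return "UNKNOWN";
        }
        return (statusCode != null) ? Integer.toString(statusCode) : "UNKNOWN";
    }

    /** Outcome tag: UNKNOWN for every aborted connection, committed or not. */
    public static String outcome(boolean connectionAborted, Integer statusCode) {
        if (connectionAborted || statusCode == null) {
            return "UNKNOWN";
        }
        // Assumed mapping from status series to an outcome name.
        switch (statusCode / 100) {
            case 1: return "INFORMATIONAL";
            case 2: return "SUCCESS";
            case 3: return "REDIRECTION";
            case 4: return "CLIENT_ERROR";
            case 5: return "SERVER_ERROR";
            default: return "UNKNOWN";
        }
    }

    public static void main(String[] args) {
        // The client closes the connection after the 200 response was committed.
        System.out.println("status=" + status(true, true, 200)
                + " outcome=" + outcome(true, 200));
    }
}
```

Under these assumptions, a connection aborted after a committed 200 response yields status=200 together with outcome=UNKNOWN, because the server cannot tell whether the client received the full response.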
On the one hand, it seems that "outcome" should correspond to "status", because we sometimes receive metrics with status=200 and outcome=UNKNOWN (in plain code, without filters like the one in the example). On the other hand, it really is unknown in that case (and it may be useful to have this value). Thank you very much for investigating my issue and explaining the details. |
If I'm not mistaken, we have discussed and solved all questions here. |
Hello.
I have some questions about web observation issues.
We updated from Spring Boot 2.x to Spring Boot 3.1.4 and ran into problems with metrics. After investigating, we found related issues in micrometer-metrics and spring-web (for example, #31417, #31388, #27587, micrometer-metrics/micrometer#3874). Updating versions solved our problems.
However, we found several places where problems could still arise (we may be wrong).
Can you please have a look at them?
Thanks!
spring-framework/spring-web/src/main/java/org/springframework/web/server/adapter/HttpWebHandlerAdapter.java
Lines 416 to 426 in 489d18a
spring-framework/spring-web/src/main/java/org/springframework/web/filter/reactive/ServerHttpObservationFilter.java
Line 160 in 489d18a
spring-framework/spring-web/src/main/java/org/springframework/http/server/reactive/AbstractServerHttpResponse.java
Line 64 in 489d18a
spring-framework/spring-web/src/main/java/org/springframework/http/server/reactive/AbstractServerHttpResponse.java
Line 75 in 489d18a
spring-framework/spring-web/src/main/java/org/springframework/http/server/reactive/observation/ServerRequestObservationContext.java
Lines 53 to 55 in 489d18a
spring-framework/spring-web/src/main/java/org/springframework/http/server/reactive/observation/DefaultServerRequestObservationConvention.java
Lines 162 to 163 in 489d18a
spring-framework/spring-web/src/main/java/org/springframework/http/server/reactive/observation/DefaultServerRequestObservationConvention.java
Lines 119 to 120 in 489d18a