Extends GraphQL Spans #562

PascalSenn · 2023-11-27T16:14:41Z

Fixes #

Changes

This pull request extends the GraphQL OpenTelemetry semantic conventions.

It adds a span for the execution of a GraphQL operation and a span for the resolvers.

I left a few comments in the definition to open a discussion about the changes.

Merge requirement checklist

CONTRIBUTING.md guidelines followed.
CHANGELOG.md updated for non-trivial changes.
schema-next.yaml updated with changes to existing conventions.

linux-foundation-easycla · 2023-11-27T16:14:45Z

The committers listed above are authorized under a signed CLA.

✅ login: PascalSenn (481e414, 2d61ae4, 9a72c98, 19d80b7, eade875, 43518fc, 186ea98)

PascalSenn · 2023-11-27T16:23:07Z

model/trace/instrumentation/graphql.yml

+        brief: "The number of errors that occurred during the operation."
+        type: int
+        examples: 3
+    # TODO should we have something like outcome (success, failure)


I was not sure if we should have a property that signals whether the execution was a success or a failure.
Then we would also have to specify what a 'success' and what a 'failure' is,
as an operation can fail completely ('data' is null) or partially (there are errors).
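
A minimal sketch of that distinction, assuming the outcome is derived only from the response body. The function name `classify_outcome` and the three labels are hypothetical and are not part of the proposed conventions.

```python
from typing import Any, Mapping


def classify_outcome(response: Mapping[str, Any]) -> str:
    """Classify a GraphQL response body (labels are illustrative only)."""
    errors = response.get("errors") or []
    if not errors:
        return "success"
    # 'data' missing or null: nothing could be returned at all.
    if response.get("data") is None:
        return "failure"
    # Some data came back alongside one or more errors.
    return "partial"
```

Whether a partial result should count as a failure for the span status is exactly the open question here; the sketch only shows that the three cases can be told apart mechanically.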

PascalSenn · 2023-11-27T16:25:35Z

model/trace/instrumentation/graphql.yml

+        type: int
+        examples: 3
+    # TODO should we have something like outcome (success, failure)
+    # TODO how do we specify errors?


A GraphQL Operation can have multiple errors in the response:

{
  "data": null,
  "errors": [
    {
      "message": "Cannot return null for non-nullable field User",
      "locations": [{ "line": 3, "column": 5 }],
      "path": ["createUser", "user"],
      "extensions": {
        "code": "USERNAME_INVALID",
        "message": "username is invalid"
      }
    }
  ]
}

What is the best way to represent this in OpenTelemetry?

PascalSenn · 2023-11-27T16:27:02Z

model/trace/instrumentation/graphql.yml

+        examples: 3
+    # TODO should we have something like outcome (success, failure)
+    # TODO how do we specify errors?
+    # TODO Should we ref network.transport, network.type, server.address etc?


This would have to be done with caution, though, as GraphQL is protocol-agnostic.

PascalSenn · 2023-11-27T16:28:26Z

model/trace/instrumentation/graphql.yml

+    # TODO should we have something like outcome (success, failure)
+    # TODO how do we specify errors?
+    # TODO Should we ref network.transport, network.type, server.address etc?
+    # TODO There are more spans like validation, parsing, variable coercion and response formatting. Should we add them as separate span types?


It provides a lot of value to know how long a specific operation is spending validating, parsing or formatting.
Should we add specific spans for this?

PascalSenn · 2023-11-27T16:29:56Z

model/trace/instrumentation/graphql.yml

@@ -1,6 +1,6 @@
 groups:
-  - id: graphql
-    prefix: graphql
+  - id: graphql.server.request


These are just the graphql.server spans. Should we also specify the graphql.client spans?

PascalSenn · 2023-11-27T16:38:45Z

model/trace/instrumentation/graphql.yml

+      - id: selection.field.isDeprecated
+        brief: "Whether the field that is being resolved is deprecated."
+        type: bool
+        examples: true


What do we do with validation errors? If someone sends an invalid GraphQL request that, e.g., could not be parsed, which span should be emitted?

PascalSenn · 2023-11-27T16:42:01Z

model/trace/instrumentation/graphql.yml

+      - id: selection.field.isDeprecated
+        brief: "Whether the field that is being resolved is deprecated."
+        type: bool
+        examples: true


Another thing that is not yet specified is subscriptions.

A subscription is a long-running operation that returns an event stream, similar to WebSockets, SignalR, etc.

Is there prior art in OpenTelemetry for how this could be handled?

A subscription starts with a "Subscribe" call that returns an event stream. Each event in this stream is mapped to a GraphQL result. This means that the subscribe call and the execution of each event give different insights. These insights are more important than an arbitrarily long root span that covers the whole GraphQL execution.
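
To illustrate the shape described above, here is a rough sketch that emits one short span for the subscribe call and one span per delivered event, instead of a single span covering the whole stream. It deliberately uses a tiny in-memory stand-in rather than a real tracer; the helper `span`, the list `recorded`, and both span names are hypothetical.

```python
from contextlib import contextmanager
from typing import Iterable, Iterator, List

recorded: List[str] = []  # stand-in for spans exported by a real tracer


@contextmanager
def span(name: str) -> Iterator[None]:
    # Hypothetical helper: records the span name when the span ends.
    try:
        yield
    finally:
        recorded.append(name)


def run_subscription(events: Iterable[dict]) -> List[dict]:
    # One short span for the subscribe call that returns the event stream.
    with span("graphql.subscribe"):
        stream = iter(events)
    results = []
    # One span per event, each mapped to its own GraphQL result.
    for event in stream:
        with span("graphql.subscription.event"):
            results.append({"data": event})
    return results
```

How the per-event spans would relate to the subscribe span (parent, link, or separate trace) is left open in this sketch.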

tobias-tengler

Just a small fix to keep the example of Query.findBookById consistent.

model/trace/instrumentation/graphql.yml

Co-authored-by: Tobias Tengler <[email protected]>

github-actions · 2024-01-25T03:19:58Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions · 2024-02-02T03:19:08Z

Closed as inactive. Feel free to reopen if this PR is still being worked on.

PascalSenn · 2024-07-01T19:25:36Z

@jsuereth Can this pull request be reopened? How can I get feedback for this?

github-actions · 2024-10-04T03:21:19Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

PascalSenn · 2024-10-04T17:32:01Z

bump

lmolkova · 2024-10-11T17:02:42Z

Hi @PascalSenn,

Sorry for not reviewing the PR for a long time - unfortunately we don't have any GraphQL experts among semantic conventions approvers.

We have some folks who contributed #1389 recently and I wonder if they could give this one a review as well.

@kaylareopelle @robertlaurin would you mind taking a look and sharing any feedback?

Thanks!

PS: happy to review from general semconv perspective sometime next week.

joaopgrassi · 2024-10-28T06:25:42Z

model/trace/instrumentation/graphql.yml

@@ -1,6 +1,6 @@
 groups:
-  - id: graphql
-    prefix: graphql
+  - id: graphql.server.request


If it is desired, yes we should. But most likely not as part of this PR. Also, maybe it's good to create an issue so maybe discussions can happen there.

joaopgrassi · 2024-10-28T06:28:42Z

model/trace/instrumentation/graphql.yml

+          persisted queries.
+        type: string
+        examples: "aa3e37c1bf54708e93f12c137afba004"
+      - id: error.count


Shouldn't this be a generic "request count/duration" metric? With the error type attribute then one can find out how many errors occurred. https://github.com/open-telemetry/semantic-conventions/blob/v1.23.0/docs/http/http-metrics.md#metric-httpserverrequestduration
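
A rough sketch of what that could look like, using an in-memory stand-in for a histogram keyed by attributes. The metric name in the comment and the helper names are hypothetical; only the `error.type` attribute key is taken from the existing conventions.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional, Tuple

AttrKey = Tuple[Tuple[str, str], ...]

# Stand-in for a duration histogram (hypothetical name:
# "graphql.server.request.duration"), keyed by its attribute set.
durations: DefaultDict[AttrKey, List[float]] = defaultdict(list)


def record_request(duration_s: float, error_type: Optional[str] = None) -> None:
    attrs = {}
    if error_type is not None:
        # Set only when the request failed.
        attrs["error.type"] = error_type
    durations[tuple(sorted(attrs.items()))].append(duration_s)


def error_count() -> int:
    # Failed requests are the data points that carry an error.type attribute.
    return sum(len(v) for k, v in durations.items() if "error.type" in dict(k))
```

Note that this counts failed requests, whereas the `error.count` attribute in the hunk counts the errors within a single operation, so the two are not interchangeable.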

joaopgrassi · 2024-10-28T06:33:44Z

model/trace/instrumentation/graphql.yml

+        brief: "The number of errors that occurred during the operation."
+        type: int
+        examples: 3
+    # TODO should we have something like outcome (success, failure)


If it's an exception we are talking about here, there's already https://github.com/open-telemetry/semantic-conventions/blob/v1.23.0/docs/exceptions/exceptions-spans.md

In general if the request fails on the client side, the span status then is marked as error. See here for some example (for HTTP) https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#status

joaopgrassi · 2024-10-28T06:34:45Z

model/trace/instrumentation/graphql.yml

+        type: int
+        examples: 3
+    # TODO should we have something like outcome (success, failure)
+    # TODO how do we specify errors?


I'd say maybe use the attributes under the exception namespace: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/attributes-registry/exception.md
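
As a sketch of that suggestion, each entry of the response's `errors` array could be turned into one set of `exception.*` attributes, for example for a span event. Taking `exception.type` from `extensions.code` and falling back to the placeholder `"GraphQLError"` are assumptions made for illustration, as is the function name.

```python
from typing import Any, Dict, List, Mapping


def errors_to_event_attributes(response: Mapping[str, Any]) -> List[Dict[str, str]]:
    """Build one attribute dict per GraphQL error (mapping is illustrative)."""
    events = []
    for error in response.get("errors") or []:
        extensions = error.get("extensions") or {}
        events.append({
            # Assumption: use the extension code as the type when present.
            "exception.type": str(extensions.get("code", "GraphQLError")),
            "exception.message": str(error.get("message", "")),
        })
    return events
```

Fields such as `locations` and `path` have no counterpart in the exception namespace and are dropped here; whether they need their own attributes is not settled in this thread.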

joaopgrassi · 2024-10-28T06:35:40Z

model/trace/instrumentation/graphql.yml

+        examples: 3
+    # TODO should we have something like outcome (success, failure)
+    # TODO how do we specify errors?
+    # TODO Should we ref network.transport, network.type, server.address etc?


But those are still very protocol agnostic, no?

joaopgrassi · 2024-10-28T06:37:19Z

model/trace/instrumentation/graphql.yml

+    # TODO should we have something like outcome (success, failure)
+    # TODO how do we specify errors?
+    # TODO Should we ref network.transport, network.type, server.address etc?
+    # TODO There are more spans like validation, parsing, variable coercion and response formatting. Should we add them as separate span types?


We can model the trace however we want. One thing to always keep in mind is verbosity: spans cost money in the end. Creating too many may be a problem in some cases, so if they are not essential, maybe they can be turned off or made opt-in.

If so, please open a separate issue explaining the value and what it should do, so we can tackle it in a separate PR.
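
A minimal sketch of the opt-in idea, assuming the instrumentation lets users name the phases they want recorded. The class `PhaseRecorder` and the phase names are hypothetical, and timings are kept in a plain dict instead of being exported as spans.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterable, Iterator


class PhaseRecorder:
    """Records timings only for phases that were explicitly enabled."""

    def __init__(self, enabled: Iterable[str] = ()) -> None:
        self.enabled = frozenset(enabled)
        self.durations: Dict[str, float] = {}

    @contextmanager
    def phase(self, name: str) -> Iterator[None]:
        if name not in self.enabled:
            # Disabled phases run without any recording overhead.
            yield
            return
        start = time.perf_counter()
        try:
            yield
        finally:
            self.durations[name] = time.perf_counter() - start
```

With `PhaseRecorder(enabled={"parse"})`, a block wrapped in `phase("parse")` is timed while one wrapped in `phase("validate")` is not, so the default (nothing enabled) adds no extra telemetry.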

joaopgrassi · 2024-10-28T06:41:21Z

model/trace/instrumentation/graphql.yml

+  - id: graphql.server.resolver
+    prefix: graphql
+    type: span
+    brief: >
+      This document defines semantic conventions to apply when instrumenting the GraphQL implementation. 
+      They map GraphQL resolvers to attributes on a Span.
+    attributes:
+      - id: selection.name
+        brief: "The name of the selection that is being resolved. Either the field name or an alias."
+        type: string
+        examples: "findBookById"
+      - id: selection.type # selection.field.type ?
+        brief: "The type of the field that is being resolved"
+        type: string
+        examples: "Book"
+      - id: selection.path
+        brief: "The path of the selection that is being resolved."
+        type: string
+        examples: "/foo/bar/0/baz"
+      - id: selection.field.name
+        brief: "The name of the field that is being resolved."
+        type: string
+        examples: "findBookById"
+      - id: selection.field.declaringType
+        brief: "The type that declares the field that is being resolved."
+        type: string
+        examples: "Query"
+      - id: selection.field.coordinate
+        brief: "The coordinate of the field that is being resolved."


All attributes should be defined in the attribute registry, not directly where they are used. https://github.com/open-telemetry/semantic-conventions/blob/main/CONTRIBUTING.md#1-modify-the-yaml-model

Then use ref here to reference them.
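
For readers unfamiliar with the values in the hunk above, this sketch shows one way the `selection.path` and `selection.field.coordinate` examples could be produced. It assumes the path is the slash-joined list of response keys and list indices, and that a coordinate has the form `Type.field` (as in `Query.findBookById`); the helper names are hypothetical.

```python
from typing import List, Union


def selection_path(segments: List[Union[str, int]]) -> str:
    # Response keys and list indices joined with "/", with a leading slash.
    return "/" + "/".join(str(s) for s in segments)


def field_coordinate(declaring_type: str, field_name: str) -> str:
    # Declaring type and field name joined with a dot.
    return f"{declaring_type}.{field_name}"
```

For instance, `selection_path(["foo", "bar", 0, "baz"])` gives the example value from the hunk, and `field_coordinate("Query", "findBookById")` combines the `selection.field.declaringType` and `selection.field.name` examples.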

joaopgrassi · 2024-10-28T06:42:29Z

model/trace/instrumentation/graphql.yml

+      - id: selection.field.isDeprecated
+        brief: "Whether the field that is being resolved is deprecated."
+        type: bool
+        examples: true


Same span as you would do, but in the case of an error, you set the exception attributes where the error can be recorded (as well as the span status). See HTTP for example https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md

PascalSenn added 2 commits October 2, 2023 23:16

Adds more graphql traces

eade875

Adds more comments

186ea98

PascalSenn requested review from a team November 27, 2023 16:14

github-actions bot assigned jsuereth Nov 27, 2023

PascalSenn marked this pull request as draft November 27, 2023 16:15

PascalSenn added 2 commits November 27, 2023 17:17

Align formatting

19d80b7

Align formatting

43518fc

PascalSenn commented Nov 27, 2023

Merge branch 'main' into pse/add-more-graphql-traces

2d61ae4

PascalSenn commented Nov 27, 2023

tobias-tengler reviewed Nov 27, 2023

PascalSenn and others added 2 commits November 27, 2023 21:02

Apply fix

481e414

Co-authored-by: Tobias Tengler <[email protected]>

Apply Fix

9a72c98

Co-authored-by: Tobias Tengler <[email protected]>

dotansimha approved these changes Dec 18, 2023

github-actions bot added the Stale label Jan 25, 2024

github-actions bot closed this Feb 2, 2024

lmolkova reopened this Jul 1, 2024

PascalSenn marked this pull request as ready for review September 12, 2024 18:33

github-actions bot assigned reyang Sep 12, 2024

PascalSenn mentioned this pull request Sep 12, 2024

Improve GraphQL semantic conventions #182

Open

github-actions bot removed the Stale label Sep 13, 2024

github-actions bot added the Stale label Oct 4, 2024

github-actions bot removed the Stale label Oct 5, 2024

lmolkova unassigned jsuereth and reyang Oct 11, 2024

lmolkova added the experts needed This issue or pull request is outside an area where general approvers feel they can approve label Oct 11, 2024

lmolkova added the area:graphql label Oct 11, 2024

joaopgrassi requested changes Oct 28, 2024
