Change 'metadata.cloud.project.*' for GCP, clarify other fields #822
This PR changes the spec for two `cloud.project.*` metadata fields for GCP to be consistent with how Beats collect the same metadata. It was motivated by a user who is unable to correlate documents from APM agents and Beats on `cloud.project.id`.
It also clarifies a couple of things about GCP metadata collection. Finally, there is an open question about adding a new field.
details
A Google Cloud project has three identifiers. The CLI lists them as:
and the GCP docs describe them here: https://cloud.google.com/resource-manager/docs/creating-managing-projects#before_you_begin
proposed changes
- Change `cloud.project.id` to use GCE metadata `project.projectId`, instead of `project.numericProjectId` (aka "project number"). This will match what Beats do. Also, the `projectId` field is a more visible identifier for the user: it is shown more prominently in the Google Cloud console.
- `cloud.project.name`. What Google describes as the "Project name" is not available in the GCE metadata response. The APM agents should not attempt to make GCP API requests to get the project name.
- `instance.id` from the GCE metadata is a big int. Your JSON parser could be surprised by it: the JavaScript one gets it wrong, the Python one is fine. Please check.
- `cloud.region` was wrong. (The Python implementation linked to earlier in the metadata.md doc is correct.)

open question
(This is for discussion. I have not proposed changing this in this PR.)
Should we also set `cloud.account.id` to the `<GCE metadata>.project.projectId`? Beats do.
ECS says:
The "Examples: Google Cloud ORG Id" seems to indicate that using `projectId` would be inappropriate. However, I think GCP uses the project (not some concept of the account) to "identify" resources.
Some examples from the GCE metadata (the globally unique identifier there is the `numericProjectId`):
For comparison, the OTel cloud semconv do not have `cloud.project.*` (perhaps because they are concepts particular to GCP and Azure). They do have `cloud.account.id`, and at least the OTel JS SDK sets that to the GCE metadata `project-id` value.
I am curious if/where the ECS + OTel semconv merge lands on this.
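To make the proposed mapping and the open question concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the spec or any agent's actual implementation: the sample response values are made up, the function names `gcp_cloud_metadata` and `survives_float64` and the `set_account_id` flag are hypothetical, and deriving the region by trimming the zone suffix is my assumption about the intended behaviour.

```python
import json

# Hypothetical sample of a GCE metadata response; every value is made up.
SAMPLE_RESPONSE = """
{
  "instance": {
    "id": 4306570268266786072,
    "name": "example-instance",
    "zone": "projects/513326162531/zones/us-west1-b"
  },
  "project": {
    "numericProjectId": 513326162531,
    "projectId": "example-project"
  }
}
"""


def gcp_cloud_metadata(raw: str, set_account_id: bool = False) -> dict:
    """Map a GCE metadata response to cloud.* fields (illustrative only)."""
    # Python's json module parses integers with arbitrary precision,
    # so the big instance id is not rounded here.
    data = json.loads(raw)
    instance = data["instance"]
    project = data["project"]

    # The zone comes back as a resource path; keep the last segment.
    zone = instance["zone"].rsplit("/", 1)[-1]

    cloud = {
        "provider": "gcp",
        # Send the id as a string so downstream parsers cannot round it.
        "instance": {"id": str(instance["id"]), "name": instance["name"]},
        # projectId, not numericProjectId; no project name is set.
        "project": {"id": project["projectId"]},
        "availability_zone": zone,
        # Assumption: the region is the zone without its trailing "-<letter>".
        "region": zone.rsplit("-", 1)[0],
    }
    if set_account_id:
        # The open question: also report projectId as cloud.account.id.
        cloud["account"] = {"id": project["projectId"]}
    return cloud


def survives_float64(n: int) -> bool:
    """True if n round-trips through a 64-bit float, as a double-based parser would store it."""
    return int(float(n)) == n


if __name__ == "__main__":
    print(json.dumps(gcp_cloud_metadata(SAMPLE_RESPONSE), indent=2))
    print(survives_float64(4306570268266786072))
```

The `survives_float64` helper mimics what happens in a parser that stores every number as a double, which is one way to check whether a given agent's JSON parser needs special handling for `instance.id`.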
checklist
May the instrumentation collect sensitive information, such as secrets or PII (ex. in headers)?
CODEOWNERS)
To auto-merge the PR, add `/schedule YYYY-MM-DD` to the PR description.
/schedule 2023-09-05