langchain-core 0.2.2 released a standard field for storing usage metadata returned from chat model responses, such as input / output token counts. `AIMessage` objects now have a `.usage_metadata` attribute which can hold a `UsageMetadata` dict; for now it holds only token counts. Standardizing this information makes it simpler to track in monitoring / observability platforms and similar applications.

Here we unpack usage metadata returned by the Bedrock API onto the `AIMessage` objects generated by chat models.
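As a hedged sketch of what this enables (the model ID is illustrative, and token counts will vary by prompt and model):

```python
from langchain_aws import ChatBedrock

# Any supported Bedrock chat model should work; this model ID is just an example.
llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0")

message = llm.invoke("Hello!")

# Token counts land on the standard langchain-core field:
print(message.usage_metadata)
# e.g. {'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```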
There are at least two options for implementing this in a streaming context. The final chunk in a stream carries usage data under the key `"amazon-bedrock-invocationMetrics"`, which we currently ignore. These data appear standardized, at least for Anthropic and Mistral (I also checked Cohere and Llama3, but streaming for chat models does not work for either currently). One option is to emit an additional chunk containing these data. The advantage of this is that we may not have to implement any provider-specific processing. The disadvantage is that the final chunk currently contains a `"stop_reason"`, and if users assume that chunk is the last one in the stream, an extra chunk could break their workflows.
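To see why the extra chunk should be mostly transparent to consumers, here is a hedged sketch of stream aggregation. It assumes `AIMessageChunk` addition merges `usage_metadata` (as in recent langchain-core), so counts emitted on the extra chunk survive summation; the model ID is again illustrative:

```python
from langchain_aws import ChatBedrock

llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0")

aggregate = None
for chunk in llm.stream("Hello!"):
    # Summing AIMessageChunks merges content and metadata fields.
    aggregate = chunk if aggregate is None else aggregate + chunk

print(aggregate.usage_metadata)
```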
Before:
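An illustrative repr (message content and token counts are made up; the raw `response_metadata` payload is elided):

```python
AIMessage(
    content="Hello! How can I help you today?",
    response_metadata={...},  # raw provider payload, unchanged by this PR
)
# message.usage_metadata is None
```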
After:
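The same message with token counts unpacked onto the standard field (again illustrative):

```python
AIMessage(
    content="Hello! How can I help you today?",
    response_metadata={...},
    usage_metadata={"input_tokens": 8, "output_tokens": 12, "total_tokens": 20},
)
```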