
ChatBedrock: add usage metadata #85

Merged
merged 6 commits into main from cc/usage_metadata on Jun 25, 2024
Conversation

@ccurme (Contributor) commented Jun 25, 2024

langchain-core 0.2.2 released a standard field to store usage metadata returned from chat model responses, such as input/output token counts. AIMessage objects have a .usage_metadata attribute which can hold a UsageMetadata dict; for now it holds only token counts. Standardizing this information makes it simpler to track in monitoring/observability platforms and similar applications.
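For illustration, a minimal sketch of the standardized field on an AIMessage (assuming langchain-core >= 0.2.2; the token counts here are made up):

```python
from langchain_core.messages import AIMessage

# The usage_metadata field takes a UsageMetadata dict with standardized keys.
msg = AIMessage(
    content="Hello!",
    usage_metadata={
        "input_tokens": 8,
        "output_tokens": 2,
        "total_tokens": 10,
    },
)
print(msg.usage_metadata["total_tokens"])  # 10
```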

Here we unpack usage metadata returned by the Bedrock API onto AIMessages generated by chat models.
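As a rough illustration of the unpacking (a sketch, not the actual implementation in this PR; the list-wrapped counts mirror the raw Bedrock chunks shown below):

```python
from langchain_core.messages.ai import UsageMetadata

def usage_from_bedrock(usage: dict) -> UsageMetadata:
    # Hypothetical helper: the raw Bedrock usage entries below report counts
    # as one-element lists (e.g. {'input_tokens': [8]}), so sum them defensively.
    input_tokens = sum(usage.get("input_tokens") or [0])
    output_tokens = sum(usage.get("output_tokens") or [0])
    return UsageMetadata(
        input_tokens=input_tokens,
        output_tokens=output_tokens,
        total_tokens=input_tokens + output_tokens,
    )
```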

There are at least two options for implementing this in a streaming context:

  1. (Implemented here) Bedrock currently streams a final chunk containing usage data under "amazon-bedrock-invocationMetrics", which we ignore. These data appear standardized, at least for Anthropic and Mistral (I also checked Cohere and Llama3, but streaming for chat models does not currently work for either). We can emit an additional chunk containing these data; a sketch of consuming that chunk follows this list. The advantage is that we may not have to implement any provider-specific processing. The disadvantage is that the final chunk currently contains a "stop_reason", and if users assume that chunk is the last one, the extra chunk could break workflows.

Before:

```
content='' response_metadata={'usage': {'input_tokens': [8], 'output_tokens': [1]}}
content='Hello' response_metadata={'stop_reason': None}
content='!' response_metadata={'stop_reason': None}
content='' response_metadata={'stop_reason': 'max_tokens', 'usage': {'output_tokens': [2]}}
```

After:

```
content='' response_metadata={'usage': {'input_tokens': [8], 'output_tokens': [1]}}
content='Hello' response_metadata={'stop_reason': None}
content='!' response_metadata={'stop_reason': None}
content='' response_metadata={'stop_reason': 'max_tokens', 'usage': {'output_tokens': [2]}}
content='' usage_metadata={'input_tokens': 8, 'output_tokens': 2, 'total_tokens': 10}
```
  2. (Implemented in commit history) Implement provider-specific processing, specifically for Anthropic. This is what I did first; commit 2b9e400 switches to option 1, and we can revert that commit if we prefer this option.
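For reference, a sketch of how a consumer might pick up the extra usage chunk under option 1 (the model id is a placeholder; any Bedrock chat model that streams usage metrics would do):

```python
from langchain_aws import ChatBedrock

llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0")  # placeholder model id

usage = None
for chunk in llm.stream("Hello"):
    # Under option 1, the stream ends with an extra chunk carrying usage_metadata.
    if chunk.usage_metadata is not None:
        usage = chunk.usage_metadata

print(usage)  # e.g. {'input_tokens': 8, 'output_tokens': 2, 'total_tokens': 10}
```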

@ccurme ccurme requested review from 3coins and baskaryan June 25, 2024 15:00
@ccurme ccurme merged commit df76647 into main Jun 25, 2024
12 checks passed
@ccurme ccurme deleted the cc/usage_metadata branch June 25, 2024 16:32