
Typing: when stream is completed, delta in ChatCompletionChunk from Azure OpenAI is None; should be ChoiceDelta #1677

JensMadsen opened this issue Aug 26, 2024 · 5 comments
Labels: bug (Something isn't working)

Comments

@JensMadsen

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

When streaming from the Azure OpenAI API, the delta of the choice is None. In the Python OpenAI client v1.42.0, delta is typed as ChoiceDelta, i.e. not Optional.

To Reproduce

Run code along these lines:

    completion = await self._client.chat.completions.create(
        model=self.deployment.name,
        messages=cast(list[ChatCompletionMessageParam], messages),
        stream=True,
        temperature=temperature,
    )

    async for response_chunk in completion:
        ...

The types are:
response_chunk: ChatCompletionChunk
response_chunk.choices: list[Choice]
response_chunk.choices[0].delta: ChoiceDelta

The Azure OpenAI API returns delta=None when the stream ends.

Response example:

Choice(delta=None, finish_reason=None, ...)

Code snippets

No response

OS

Linux, Ubuntu 20.04

Python version

3.12.1

Library version

openai v1.42.0

@JensMadsen JensMadsen added the bug Something isn't working label Aug 26, 2024
@kristapratico
Contributor

@JensMadsen could you share more information to help in reproducing this?

  • what is the model you are using?
  • which Azure OpenAI API version?
  • what kind of deployment - standard, global, provisioned-managed?

@JensMadsen
Author

JensMadsen commented Aug 27, 2024

@JensMadsen could you share more information to help in reproducing this?

  • what is the model you are using?
  • which Azure OpenAI API version?
  • what kind of deployment - standard, global, provisioned-managed?

@kristapratico I think I have identified what causes the incorrect types. I use the 2024-05-01-preview Azure API version (to use the Assistants API). When I switch back to 2023-05-15, it works as expected. I also see the type mismatch with, e.g., API version 2024-06-01. I have not thoroughly tested all versions; see: https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation.

  • I use gpt-4o
  • deployment is standard

@kristapratico
Contributor

kristapratico commented Aug 27, 2024

@JensMadsen thanks. Unfortunately, I'm still missing something to reproduce this. Could you share the region your resource resides in and/or the prompt that causes this?

edit: Do you by chance have a custom content filter applied to the deployment with asynchronous filtering enabled?

@JensMadsen
Author

@JensMadsen thanks. Unfortunately, I'm still missing something to reproduce this. Could you share the region your resource resides in and/or the prompt that causes this?

edit: Do you by chance have a custom content filter applied to the deployment with asynchronous filtering enabled?

Yes, of course.

Region: Sweden
We have a content filter that I think is "custom" (see screenshot):
[screenshot: content filter configuration]

I see this with all prompts so far.

Again, using the older API version 2023-05-15 results in responses aligned with the types in the OpenAI Python client.

@kristapratico
Contributor

@JensMadsen got it. In your screenshot, it does look like the asynchronous content filter is enabled. With the async filter turned on, the Azure response is slightly altered to return information like content_filter_results and content_filter_offsets in the first and final streamed chunk (and omit sending delta):

data: {"id":"","object":"","created":0,"model":"","prompt_annotations":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[],"usage":null} 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"role":"assistant"}}],"usage":null} 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":"Color"}}],"usage":null} 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" is"}}],"usage":null} 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":null,"delta":{"content":" a"}}],"usage":null} 

... 

data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"content_filter_offsets":{"check_offset":44,"start_offset":44,"end_offset":198}}],"usage":null} 

... 

data: {"id":"chatcmpl-7rCNsVeZy0PGnX3H6jK8STps5nZUY","object":"chat.completion.chunk","created":1692913344,"model":"gpt-35-turbo","choices":[{"index":0,"finish_reason":"stop","delta":{}}],"usage":null} 

data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"content_filter_offsets":{"check_offset":506,"start_offset":44,"end_offset":571}}],"usage":null} 

data: [DONE] 

Source: https://learn.microsoft.com/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cuser-prompt%2Cpython-new#sample-response-stream-passes-filters

I'm following up with the team to try to understand the reason for this difference. You won't see this with the older version (2023-05-15) since content filter annotations weren't added to the API until 2023-06-01-preview and later. It looks like the async filter is still in preview and could be subject to change, so at the moment I think it might be best to write code that is resilient to this API difference. You're absolutely right that the typing is wrong for Azure in this case, but I believe that this discrepancy lies more on the service than the SDK.
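A minimal sketch of the resilient handling suggested above, using simplified stand-in dataclasses (hypothetical, not the real SDK models) to show one way to treat annotation-only chunks, role-only deltas, and None deltas as empty:

```python
from dataclasses import dataclass, field
from typing import Optional

# Simplified stand-ins for ChatCompletionChunk / Choice / ChoiceDelta,
# just to illustrate the guard logic.
@dataclass
class Delta:
    content: Optional[str] = None

@dataclass
class Choice:
    delta: Optional[Delta] = None
    finish_reason: Optional[str] = None

@dataclass
class Chunk:
    choices: list = field(default_factory=list)

def extract_text(chunk: Chunk) -> str:
    """Return the text carried by a streamed chunk, tolerating the
    async-filter chunks that have no choices or a None delta."""
    if not chunk.choices:
        return ""  # annotation-only chunk (content_filter_results)
    delta = chunk.choices[0].delta
    if delta is None or delta.content is None:
        return ""  # None delta, role-only delta, or final chunk
    return delta.content

# Example stream: a content chunk, an annotation chunk, a delta=None chunk.
chunks = [
    Chunk(choices=[Choice(delta=Delta(content="Color"))]),
    Chunk(choices=[]),
    Chunk(choices=[Choice(delta=None, finish_reason="stop")]),
]
print("".join(extract_text(c) for c in chunks))  # prints "Color"
```

The same guard applies unchanged inside an `async for` loop over the real stream: check `chunk.choices` is non-empty and `delta` is not None before reading `delta.content`.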
