AzureMLOnlineEndpoint (Serverless deployment) request body format is totally wrong. #26680

Ko-Ko-Kirk opened this issue Sep 19, 2024 · 3 comments
Ko-Ko-Kirk commented Sep 19, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_community.llms.azureml_endpoint import (
    AzureMLOnlineEndpoint,
    CustomOpenAIContentFormatter,
)

llm = AzureMLOnlineEndpoint(
    endpoint_url="https://Meta-Llama-3-1-8B-Instruct-xx.westus3.models.ai.azure.com/v1/chat/completions/",
    endpoint_api_type='serverless',
    endpoint_api_key="xx",
    content_formatter=CustomOpenAIContentFormatter()
)

response = llm.invoke("Hello")

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/Users/koko/Desktop/programming/xx/aml/aml/day10.py", line 33, in <module>
    response = llm.invoke("Hello")
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 391, in invoke
    self.generate_prompt(
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 756, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 950, in generate
    output = self._generate_helper(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 793, in _generate_helper
    raise e
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 780, in _generate_helper
    self._generate(
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_community/llms/azureml_endpoint.py", line 544, in _generate
    response_payload = self.http_client.call(
                       ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_community/llms/azureml_endpoint.py", line 57, in call
    response = urllib.request.urlopen(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 525, in open
    response = meth(req, response)
               ^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 634, in http_response
    response = self.parent.error(
               ^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 563, in error
    return self._call_chain(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 496, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

Description

I am trying to use AzureMLOnlineEndpoint with a serverless deployment (Llama 3.1 8B Instruct). I expected it to run successfully, but I got a 400 Bad Request. When I send the same request with cURL, I get the answer successfully. Here is my cURL command:

curl -X POST https://meta-llama-3-1-8b-instruct-xx.westus3.models.ai.azure.com/v1/chat/completions \
-H "Authorization: Bearer xx" \
-H "Content-Type: application/json" \
-d '{
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ]
}'

I traced the code in azureml_endpoint.py and found that the request body is wrong. For the serverless API type, it sends a request body like {"prompt": "Hello"}, which does not match the format above. You can check the code in azureml_endpoint.py around lines 294-295.
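To illustrate what the request formatter would need to produce instead, the sketch below wraps the prompt in an OpenAI-style "messages" array, matching the working cURL body above. This is my own minimal sketch, not the library's code; format_serverless_request is a hypothetical name:

import json

def format_serverless_request(prompt: str, **model_kwargs) -> bytes:
    # Serverless chat endpoints (models.ai.azure.com /v1/chat/completions)
    # expect an OpenAI-style "messages" array, not {"prompt": ...}.
    body = {"messages": [{"role": "user", "content": prompt}], **model_kwargs}
    return json.dumps(body).encode("utf-8")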

By the way, format_response_payload is wrong as well. The response via cURL is:

{
    "choices":[
        {
            "finish_reason":"stop",
            "index":0,
            "message":{
                "content":"Hello! How can I assist you today?",
                "role":"assistant",
                "tool_calls":[
                    
                ]
            }
        }
    ],
    "created":1726774446,
    "id":"cmpl-e3c1ed3f284f4e988ee024b9ab73bf5d",
    "model":"Meta-Llama-3.1-8B-Instruct",
    "object":"chat.completion",
    "usage":{
        "completion_tokens":10,
        "prompt_tokens":11,
        "total_tokens":21
    }
}

There is no "text" field in that response JSON, but azureml_endpoint.py around lines 326-327 parses a "text" field.
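For comparison, a parser that matches the actual response above would read choices[0].message.content instead. Again, this is my own minimal sketch with a hypothetical name, not the library code:

import json

def parse_serverless_response(raw: bytes) -> str:
    # The serverless chat endpoint returns an OpenAI-style chat.completion
    # object, so the generated text lives at choices[0]["message"]["content"],
    # not at choices[0]["text"].
    payload = json.loads(raw)
    return payload["choices"][0]["message"]["content"]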

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.6.0: Mon Jul 29 21:13:00 PDT 2024; root:xnu-10063.141.2~1/RELEASE_X86_64
Python Version: 3.11.5 (v3.11.5:cce6ba91b3, Aug 24 2023, 10:50:31) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information

langchain_core: 0.3.1
langchain: 0.3.0
langchain_community: 0.3.0
langsmith: 0.1.123
langchain_text_splitters: 0.3.0

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.10.5
async-timeout: Installed. No version info available.
dataclasses-json: 0.6.7
httpx: 0.27.2
jsonpatch: 1.33
numpy: 1.26.4
orjson: 3.10.7
packaging: 24.1
pydantic: 2.9.2
pydantic-settings: 2.5.2
PyYAML: 6.0.2
requests: 2.32.3
SQLAlchemy: 2.0.35
tenacity: 8.5.0
typing-extensions: 4.12.2

Ko-Ko-Kirk (Author) commented:

I fixed it and created a PR here: #26683

kentmor commented Sep 24, 2024

Any updates on this? I'm having the same issue.

Ko-Ko-Kirk (Author) commented:

> Any updates on this? I'm having the same issue.

You can use my PR. For example, in pyproject.toml, point langchain-community at my branch:

langchain-community = { git = "https://github.com/Ko-Ko-Kirk/langchain.git", subdirectory = "libs/community", branch = "fix/serverless-api-400-bug" }
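If you install with pip rather than Poetry, the equivalent (my own suggestion, untested against that branch) would use pip's git URL syntax with a subdirectory fragment:

pip install "git+https://github.com/Ko-Ko-Kirk/langchain.git@fix/serverless-api-400-bug#subdirectory=libs/community"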
