AzureMLOnlineEndpoint (Serverless deployment) request body format is totally wrong. #26680

Ko-Ko-Kirk opened this issue Sep 19, 2024 · 3 comments
Ko-Ko-Kirk commented Sep 19, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_community.llms.azureml_endpoint import (
    AzureMLOnlineEndpoint,
    CustomOpenAIContentFormatter,
)

llm = AzureMLOnlineEndpoint(
    endpoint_url="https://Meta-Llama-3-1-8B-Instruct-xx.westus3.models.ai.azure.com/v1/chat/completions/",
    endpoint_api_type='serverless',
    endpoint_api_key="xx",
    content_formatter=CustomOpenAIContentFormatter()
)

response = llm.invoke("Hello")

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/Users/koko/Desktop/programming/xx/aml/aml/day10.py", line 33, in <module>
    response = llm.invoke("Hello")
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 391, in invoke
    self.generate_prompt(
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 756, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 950, in generate
    output = self._generate_helper(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 793, in _generate_helper
    raise e
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 780, in _generate_helper
    self._generate(
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_community/llms/azureml_endpoint.py", line 544, in _generate
    response_payload = self.http_client.call(
                       ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/koko/Desktop/programming/xx/aml/.venv/lib/python3.11/site-packages/langchain_community/llms/azureml_endpoint.py", line 57, in call
    response = urllib.request.urlopen(
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 525, in open
    response = meth(req, response)
               ^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 634, in http_response
    response = self.parent.error(
               ^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 563, in error
    return self._call_chain(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 496, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

Description

I am trying to use AzureMLOnlineEndpoint with a serverless deployment (Llama 3.1 8B Instruct). I expected it to run successfully, but I got a 400 Bad Request. When I send the same request with cURL, I get the answer successfully. Here is my cURL command:

curl -X POST https://meta-llama-3-1-8b-instruct-xx.westus3.models.ai.azure.com/v1/chat/completions \
-H "Authorization: Bearer xx" \
-H "Content-Type: application/json" \
-d '{
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ]
}'

I traced the code in azureml_endpoint.py and found that the request body is wrong. For the serverless API type, it sends a request body like {"prompt": "Hello"}, which does not match the format above. You can check the code in azureml_endpoint.py around lines 294-295.
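To illustrate what the request formatter would need to produce instead, the sketch below wraps the prompt in an OpenAI-style "messages" array, matching the working cURL body above. This is my own minimal sketch, not the library's code; format_serverless_request is a hypothetical name:

import json

def format_serverless_request(prompt: str, **model_kwargs) -> bytes:
    # Serverless chat endpoints (models.ai.azure.com /v1/chat/completions)
    # expect an OpenAI-style "messages" array, not {"prompt": ...}.
    body = {"messages": [{"role": "user", "content": prompt}], **model_kwargs}
    return json.dumps(body).encode("utf-8")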

By the way, format_response_payload is wrong as well. The response via cURL is:

{
    "choices":[
        {
            "finish_reason":"stop",
            "index":0,
            "message":{
                "content":"Hello! How can I assist you today?",
                "role":"assistant",
                "tool_calls":[
                    
                ]
            }
        }
    ],
    "created":1726774446,
    "id":"cmpl-e3c1ed3f284f4e988ee024b9ab73bf5d",
    "model":"Meta-Llama-3.1-8B-Instruct",
    "object":"chat.completion",
    "usage":{
        "completion_tokens":10,
        "prompt_tokens":11,
        "total_tokens":21
    }
}

There is no "text" field in that response JSON, but azureml_endpoint.py around lines 326-327 parses a "text" field.
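For comparison, a parser that matches the actual response above would read choices[0].message.content instead. Again, this is my own minimal sketch with a hypothetical name, not the library code:

import json

def parse_serverless_response(raw: bytes) -> str:
    # The serverless chat endpoint returns an OpenAI-style chat.completion
    # object, so the generated text lives at choices[0]["message"]["content"],
    # not at choices[0]["text"].
    payload = json.loads(raw)
    return payload["choices"][0]["message"]["content"]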

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.6.0: Mon Jul 29 21:13:00 PDT 2024; root:xnu-10063.141.2~1/RELEASE_X86_64
Python Version: 3.11.5 (v3.11.5:cce6ba91b3, Aug 24 2023, 10:50:31) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information

langchain_core: 0.3.1
langchain: 0.3.0
langchain_community: 0.3.0
langsmith: 0.1.123
langchain_text_splitters: 0.3.0

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.10.5
async-timeout: Installed. No version info available.
dataclasses-json: 0.6.7
httpx: 0.27.2
jsonpatch: 1.33
numpy: 1.26.4
orjson: 3.10.7
packaging: 24.1
pydantic: 2.9.2
pydantic-settings: 2.5.2
PyYAML: 6.0.2
requests: 2.32.3
SQLAlchemy: 2.0.35
tenacity: 8.5.0
typing-extensions: 4.12.2

Ko-Ko-Kirk (Author) commented:

I fixed it and created a PR here: #26683

kentmor commented Sep 24, 2024

Any updates on this? I'm having the same issue.

Ko-Ko-Kirk (Author) commented:

> Any updates on this? I'm having the same issue.

You can use my PR. For example, in pyproject.toml, point langchain-community at my branch:

langchain-community = { git = "https://github.com/Ko-Ko-Kirk/langchain.git", subdirectory = "libs/community", branch = "fix/serverless-api-400-bug" }
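If you install with pip rather than Poetry, the equivalent (my own suggestion, untested against that branch) would use pip's git URL syntax with a subdirectory fragment:

pip install "git+https://github.com/Ko-Ko-Kirk/langchain.git@fix/serverless-api-400-bug#subdirectory=libs/community"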
