Example with openAI client and locally deployed LLM returns "Failed to deserialize the JSON body into the target type: tool_choice: data did not match any variant of untagged enum ToolTypeDeserializer" #1142

Open
5 of 8 tasks
paguilomanas opened this issue Nov 4, 2024 · 2 comments

paguilomanas commented Nov 4, 2024

  • This is actually a bug report.
  • I am not getting good LLM Results
  • I have tried asking for help in the community on discord or discussions and have not received a response.
  • I have tried searching the documentation and have not found an answer.

What Model are you using?

  • gpt-3.5-turbo
  • gpt-4-turbo
  • gpt-4
  • Other -> "Llama-3.1-70B-Instruct" locally deployed with TGI

Describe the bug
I am running the very first example from the Instructor library, but against a locally deployed model endpoint, and I get the following error: InstructorRetryException: Failed to deserialize the JSON body into the target type: tool_choice: data did not match any variant of untagged enum ToolTypeDeserializer at line 1 column 161

To Reproduce

python 3.10
instructor 1.6.3

import openai
import instructor
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


instructor_client = instructor.from_openai(openai.OpenAI(
                                base_url="http://10.10.78.13:8080/v1",
                                api_key="unused",
))

user = instructor_client.chat.completions.create(
    model="Llama-3.1-70B-Instruct",
    messages=[
        {"role": "user", "content": "Create a user"},
    ],
    response_model=User,
)

print(user.name)
print(user.age)

Expected behavior
I would expect user to be a validated User instance (a pydantic BaseModel), but the instructor_client.chat.completions.create() call fails with the error above.

Other details

  • My local model endpoint works perfectly when I call it directly with the OpenAI client, like this:
import openai

client = openai.OpenAI(
    base_url="http://10.10.78.13:8080/v1",
    api_key="unused",
)
response = client.chat.completions.create(
    model="Llama-3.1-70B-Instruct",
    messages=[{"role": "user", "content": "Create a user"}],
)

print(response.choices[0].message.content)

  • If the problem is related to my locally deployed model, can someone show me how to use the instructor library with a locally deployed model? My end goal is to use it with the deepeval framework.
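
For context, one difference between the failing call and the working one is that the plain OpenAI call sends no tool fields at all, while the error message points at tool_choice. The sketch below compares the two request bodies. It assumes the instructor call uses OpenAI-style tool calling with the named-function form of tool_choice; the exact body instructor 1.6.3 sends is not shown in this report, so the tools and tool_choice values here are illustrative, and user_schema is written by hand rather than generated from the User model.

```python
import json

# Hand-written JSON schema for the User model (name: str, age: int).
user_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

messages = [{"role": "user", "content": "Create a user"}]

# Body of the plain call that works against the endpoint.
plain_body = {"model": "Llama-3.1-70B-Instruct", "messages": messages}

# Assumed shape of a tool-calling request: the same body plus a tool
# definition and a tool_choice naming the function the model must call.
tool_body = {
    **plain_body,
    "tools": [
        {
            "type": "function",
            "function": {"name": "User", "parameters": user_schema},
        }
    ],
    "tool_choice": {"type": "function", "function": {"name": "User"}},
}

# Fields that only the tool-calling request contains.
extra_fields = sorted(set(tool_body) - set(plain_body))
print(extra_fields)
print(json.dumps(tool_body["tool_choice"]))
```

If the server's deserializer only accepts other forms of tool_choice, a body like tool_body would be rejected before the model runs, while plain_body would go through. That would be consistent with the behaviour described above, but it is not confirmed here.
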
@geokanaan

Did you manage to solve this issue?

paguilomanas commented Jan 14, 2025

I didn't manage to get instructor's structured output working. However, if you are trying to force the output of your locally deployed LLM to be structured JSON, the following worked for me:

  • If the model is deployed with TGI:
import json

from huggingface_hub import InferenceClient
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


endpoint = "http://<host>:<port>"

client = InferenceClient(endpoint)

# JSON schema that the generated text must conform to
json_schema = User.model_json_schema()

response = client.text_generation(
    prompt="Your prompt",
    grammar={"type": "json", "value": json_schema},
)

# If you want a BaseModel instead of a string
response_json = json.loads(response)
parsed_response = User.model_validate(response_json)

  • If the model is deployed with vLLM:
import json

from openai import OpenAI
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


# For vLLM's OpenAI-compatible server the base URL usually ends in /v1
endpoint = "http://<host>:<port>"
model_name = "<model-name>"  # placeholder: the name the server exposes the model under
prompt = "Your prompt"

client = OpenAI(base_url=endpoint, api_key="unused")

# JSON schema that the generated text must conform to
json_schema = User.model_json_schema()

response = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": prompt}],
    temperature=0.1,
    # vLLM-specific parameters are passed through extra_body
    extra_body={"guided_json": json_schema, "min_tokens": 10},
)

response_text = response.choices[0].message.content

# If you want a BaseModel instead of a string
response_json = json.loads(response_text)
parsed_response = User.model_validate(response_json)
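
The two workarounds above pass the same JSON schema and differ only in how it is wrapped: TGI takes it in a grammar argument, vLLM in extra_body under guided_json. As a rough sketch of that difference, the helper below builds the keyword arguments for either backend. constraint_kwargs and USER_SCHEMA are hypothetical names introduced for illustration, and the schema is written by hand so the snippet runs without pydantic or a live endpoint.

```python
import json

# Hand-written JSON schema for the User model above (name: str, age: int).
# In the snippets above this comes from User.model_json_schema().
USER_SCHEMA = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}


def constraint_kwargs(backend: str, json_schema: dict) -> dict:
    """Hypothetical helper: wrap a JSON schema the way each backend above expects."""
    if backend == "tgi":
        # Extra keyword for InferenceClient.text_generation(...)
        return {"grammar": {"type": "json", "value": json_schema}}
    if backend == "vllm":
        # Extra keyword for client.chat.completions.create(...)
        return {"extra_body": {"guided_json": json_schema}}
    raise ValueError(f"unknown backend: {backend!r}")


tgi_kwargs = constraint_kwargs("tgi", USER_SCHEMA)
vllm_kwargs = constraint_kwargs("vllm", USER_SCHEMA)
print(json.dumps(tgi_kwargs, indent=2))
print(json.dumps(vllm_kwargs, indent=2))
```

With this, the TGI call would look like client.text_generation(prompt="Your prompt", **tgi_kwargs), and the vLLM call would add **vllm_kwargs to chat.completions.create. The min_tokens setting from the vLLM example is left out here; it can be merged into the extra_body dict if needed.
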
