Wrong concatenation of textwrap.dedent string when using custom system message #1128

JohanBekker opened this issue Oct 28, 2024 · 1 comment

JohanBekker (Contributor) commented Oct 28, 2024

  • This is actually a bug report.
  • I am not getting good LLM Results
  • I have tried asking for help in the community on discord or discussions and have not received a response.
  • I have tried searching the documentation and have not found an answer.

What Model are you using?

  • gpt-3.5-turbo
  • gpt-4-turbo
  • gpt-4
  • Other (please specify)

Describe the bug

When using a custom system message, the Instructor system message that gets concatenated after it, although wrapped in textwrap.dedent, ends up with inconsistent indentation:

You are a helpful assistant.


        As a genius expert, your task is to understand the content and provide
        the parsed objects in json that match the following json_schema:


        {
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "age": {
      "title": "Age",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "title": "UserInfo",
  "type": "object"
}

        Make sure to return an instance of the JSON, not the schema itself
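
This layout is exactly what textwrap.dedent produces when a multi-line string is interpolated into an indented template before dedenting: dedent only strips whitespace that is common to every non-blank line, and the schema's continuation lines start at column 0, so the common prefix is empty and nothing gets stripped. A minimal sketch of that failure mode (the template text here is illustrative, not necessarily instructor's exact source):

```python
import json
from textwrap import dedent

schema = json.dumps({"title": "UserInfo", "type": "object"}, indent=2)

# The f-string interpolates the schema BEFORE dedent() runs. The schema's
# continuation lines have no leading whitespace, so the longest common
# leading whitespace across all lines is "" and dedent() strips nothing:
# the template keeps its 8-space indent while the schema sits flush left.
broken = dedent(
    f"""
        As a genius expert, your task is to understand the content and provide
        the parsed objects in json that match the following json_schema:

        {schema}

        Make sure to return an instance of the JSON, not the schema itself
        """
)
print(broken)
```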

To Reproduce

import instructor
from dotenv import load_dotenv
from langfuse.openai import OpenAI  # Langfuse's drop-in replacement for the OpenAI client
from pydantic import BaseModel

load_dotenv()  # loads OPENAI_API_KEY (and Langfuse keys) from .env


class UserInfo(BaseModel):
    name: str
    age: int


llm = OpenAI()
client = instructor.from_openai(llm, mode=instructor.Mode.MD_JSON)

system_message = "You are a helpful assistant."
user_message = "John Doe is 30 years old."

user_info = client.chat.completions.create(
    model="gpt-4o-mini",
    max_tokens=4000,
    temperature=0,
    response_model=UserInfo,
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message},
    ],
)

Expected behavior
I'm not sure how much this matters for model performance, but it would be nice if the prompts were all neatly formatted.
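
One straightforward fix would be to dedent the template first and interpolate the schema afterwards (or, equivalently, to textwrap.indent the schema up to the template's level). A sketch of the first option, assuming nothing about instructor's internals beyond the prompt text shown above:

```python
import json
from textwrap import dedent

from pydantic import BaseModel


class UserInfo(BaseModel):
    name: str
    age: int


schema = json.dumps(UserInfo.model_json_schema(), indent=2)

# Dedent first, then substitute: the multi-line schema can no longer
# defeat dedent's common-prefix detection, so the whole prompt ends up
# flush left.
template = dedent(
    """\
    As a genius expert, your task is to understand the content and provide
    the parsed objects in json that match the following json_schema:

    {schema}

    Make sure to return an instance of the JSON, not the schema itself
    """
)
print(template.format(schema=schema))
```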

JohanBekker (Contributor, Author) commented Nov 1, 2024

It's actually not just with a custom system message; the indentation is wrong every time:

[
0: {
role: "system"
content: "
        As a genius expert, your task is to understand the content and provide
        the parsed objects in json that match the following json_schema:


        {
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "age": {
      "title": "Age",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "title": "UserInfo",
  "type": "object"
}

        Make sure to return an instance of the JSON, not the schema itself
"
}
1: {
role: "user"
content: "John Doe is 30 years old.

Return the correct JSON response within a ```json codeblock. not the JSON_SCHEMA"
}
]
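
For anyone who wants to inspect the exact outgoing payload without Langfuse, one quick (and admittedly unofficial) way is to monkeypatch the raw client's create() before handing the client to instructor. This is a debugging sketch, not a supported instructor or openai-python API:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel


class UserInfo(BaseModel):
    name: str
    age: int


llm = OpenAI()

# Wrap the raw create() BEFORE instructor patches the client, so every
# outgoing message list is printed exactly as it will be sent.
_original_create = llm.chat.completions.create

def _logging_create(*args, **kwargs):
    for message in kwargs.get("messages", []):
        print(f"--- {message['role']} ---")
        print(message["content"])
    return _original_create(*args, **kwargs)

llm.chat.completions.create = _logging_create

client = instructor.from_openai(llm, mode=instructor.Mode.MD_JSON)

client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)
```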
