Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add o1 support with structured output #1025

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

CTY-git
Copy link
Contributor

@CTY-git CTY-git commented Nov 15, 2024

PR Checklist

  • The commit message follows our guidelines: Code of conduct
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • Does this PR introduce a breaking change?
  • Include PR in release notes?

PR Type

  • Bugfix
  • Feature
  • Refactoring
  • Build /CI
  • Documentation
  • Others

What is the current behavior?

Currently o1 models are not supported due to outdated tiktoken version. Current tiktoken version is 0.7.0
In addition, o1 models do not support structured outputs.
Issue Number: N/A

What is the new behavior?

I have bumped our own fork of code2prompt to depend on version ^0.8.0 of tiktoken which adds encoding mapping of o1 models.

I have opted to always follow the recommendation here when structured output is required of o1 models.

Other information

@patched-codes patched-codes bot deleted a comment from patched-admin Nov 15, 2024
@CTY-git CTY-git requested a review from jonahdc November 15, 2024 07:35
Copy link

patched-codes bot commented Nov 15, 2024

File Changed: patchwork/common/client/llm/openai_.py

Details: Function name __o1_chat_completion uses double underscores, which is typically reserved for name mangling in Python. This violates the snake_case naming convention rule.

Affected Code Snippet:

def __o1_chat_completion(
    self,
    messages: Iterable[ChatCompletionMessageParam],
    model: str,
    frequency_penalty: Optional[float] | NotGiven = NOT_GIVEN,
    logit_bias: Optional[Dict[str, int]] | NotGiven = NOT_GIVEN,
    logprobs: Optional[bool] | NotGiven = NOT_GIVEN,
    max_tokens: Optional[int] | NotGiven = NOT_GIVEN,
    n: Optional[int] | NotGiven = NOT_GIVEN,
    presence_penalty: Optional[float] | NotGiven = NOT_GIVEN,
    response_format: dict | completion_create_params.ResponseFormat | NotGiven = NOT_GIVEN,
    stop: Union[Optional[str], List[str]] | NotGiven = NOT_GIVEN,
    temperature: Optional[float] | NotGiven = NOT_GIVEN,
    top_logprobs: Optional[int] | NotGiven = NOT_GIVEN,
    top_p: Optional[float] | NotGiven = NOT_GIVEN,
):

Start Line: 127
End Line: 141

Details: The function __o1_chat_completion lacks a proper docstring explaining its purpose, parameters, and return type.

Affected Code Snippet:

def __o1_chat_completion(
    self,
    messages: Iterable[ChatCompletionMessageParam],
    model: str,
    frequency_penalty: Optional[float] | NotGiven = NOT_GIVEN,
    logit_bias: Optional[Dict[str, int]] | NotGiven = NOT_GIVEN,
    logprobs: Optional[bool] | NotGiven = NOT_GIVEN,
    max_tokens: Optional[int] | NotGiven = NOT_GIVEN,
    n: Optional[int] | NotGiven = NOT_GIVEN,
    presence_penalty: Optional[float] | NotGiven = NOT_GIVEN,
    response_format: dict | completion_create_params.ResponseFormat | NotGiven = NOT_GIVEN,
    stop: Union[Optional[str], List[str]] | NotGiven = NOT_GIVEN,
    temperature: Optional[float] | NotGiven = NOT_GIVEN,
    top_logprobs: Optional[int] | NotGiven = NOT_GIVEN,
    top_p: Optional[float] | NotGiven = NOT_GIVEN,
):

Start Line: 127
End Line: 141

Details: The function __o1_chat_completion is missing a return type annotation, which violates the rule of including type hints for function return values.

Affected Code Snippet:

def __o1_chat_completion(
    self,
    messages: Iterable[ChatCompletionMessageParam],
    model: str,
    frequency_penalty: Optional[float] | NotGiven = NOT_GIVEN,
    logit_bias: Optional[Dict[str, int]] | NotGiven = NOT_GIVEN,
    logprobs: Optional[bool] | NotGiven = NOT_GIVEN,
    max_tokens: Optional[int] | NotGiven = NOT_GIVEN,
    n: Optional[int] | NotGiven = NOT_GIVEN,
    presence_penalty: Optional[float] | NotGiven = NOT_GIVEN,
    response_format: dict | completion_create_params.ResponseFormat | NotGiven = NOT_GIVEN,
    stop: Union[Optional[str], List[str]] | NotGiven = NOT_GIVEN,
    temperature: Optional[float] | NotGiven = NOT_GIVEN,
    top_logprobs: Optional[int] | NotGiven = NOT_GIVEN,
    top_p: Optional[float] | NotGiven = NOT_GIVEN,
):

Start Line: 127
End Line: 141

File Changed: patchwork/patchflows/GenerateUnitTests/GenerateUnitTests.py

  1. Violation of rule: Do not write functions without proper docstrings including parameter types, return types, and purpose

Details: The __init__ and run methods in the GenerateUnitTests class lack proper docstrings. Docstrings should include parameter types, return types, and a clear description of the method's purpose.

Affected Code Snippet:

class GenerateUnitTests(Step):
    def __init__(self, inputs):
        super().__init__(inputs)

        final_inputs = yaml.safe_load(_DEFAULT_INPUT_FILE.read_text())
        if final_inputs is None:
            final_inputs = {}

        final_inputs.update(inputs)

        # ... rest of the method ...

    def run(self):
        outputs = CallCode2Prompt(self.inputs).run()
        new_file_name = f"test_file.{self.inputs['test_file_extension']}"
        new_file_path = Path(outputs["uri"]).with_name(new_file_name)
        Path(outputs["uri"]).rename(new_file_path)
        outputs["uri"] = str(new_file_path)
        # ... rest of the method ...

Start Line: 13
End Line: 55

  1. Violation of rule: Do not omit type annotations and type hints for function parameters and return values

Details: The __init__ and run methods lack type annotations for their parameters and return values. Type hints should be added to improve code readability and maintainability.

Affected Code Snippet:

class GenerateUnitTests(Step):
    def __init__(self, inputs):
        super().__init__(inputs)

        # ... rest of the method ...

    def run(self):
        outputs = CallCode2Prompt(self.inputs).run()
        # ... rest of the method ...

Start Line: 13
End Line: 55

  1. Violation of rule: Do not use f-strings with untrusted user input

Details: The code uses f-strings to create pr_title and branch_prefix. While the current usage doesn't appear to directly include user input, it's a potential security risk if self.__class__.__name__ could be manipulated by user input in any way.

Affected Code Snippet:

final_inputs["pr_title"] = f"PatchWork Unit Tests generated"
final_inputs["branch_prefix"] = f"{self.__class__.__name__.lower()}-"

Start Line: 31
End Line: 32

  1. Violation of rule: Do not define input/output structures without using TypedDict

Details: The code uses dictionaries for input and output structures (final_inputs, outputs) without using TypedDict to define their structure. This can lead to potential runtime errors and makes the code harder to maintain.

Affected Code Snippet:

final_inputs = yaml.safe_load(_DEFAULT_INPUT_FILE.read_text())
if final_inputs is None:
    final_inputs = {}

final_inputs.update(inputs)

# ... and later in the run method ...

outputs = CallCode2Prompt(self.inputs).run()

Start Line: 16
End Line: 55

  1. Violation of rule: Do not skip using Annotated for StepTypeConfig

Details: The code doesn't use Annotated for StepTypeConfig, which is recommended for better type checking and documentation of step configurations.

Affected Code Snippet:

class GenerateUnitTests(Step):
    def __init__(self, inputs):
        super().__init__(inputs)
        # ... rest of the method ...

Start Line: 13
End Line: 55

File Changed: patchwork/patchflows/__init__.py

Details: The __all__ list in the __init__.py file is not using consistent quotation marks for the module names. All items should use double quotes for consistency.

Affected Code Snippet:

__all__ = [
    "AutoFix",
    "DependencyUpgrade",
    "GenerateREADME",
    "PRReview",
    "ResolveIssue",
    "GenerateDocstring",
    "GenerateUnitTests",
]

Start Line: 9
End Line: 17

Details: The imports in the __init__.py file are not sorted using isort. The GenerateUnitTests import is out of alphabetical order.

Affected Code Snippet:

from .AutoFix.AutoFix import AutoFix
from .DependencyUpgrade.DependencyUpgrade import DependencyUpgrade
from .GenerateDocstring.GenerateDocstring import GenerateDocstring
from .GenerateREADME.GenerateREADME import GenerateREADME
from .GenerateUnitTests.GenerateUnitTests import GenerateUnitTests
from .PRReview.PRReview import PRReview
from .ResolveIssue.ResolveIssue import ResolveIssue

Start Line: 1
End Line: 7

File Changed: patchwork/steps/ModifyCode/ModifyCode.py

Details: The code diff shows an inconsistent use of whitespace. The change removes a trailing whitespace, which is good practice, but it's not a major violation of the provided rules.

Affected Code Snippet:

             if new_code is None:
                 continue
-            
+
             replace_code_in_file(uri, start_line, end_line, new_code)
             modified_code_file = dict(path=uri, start_line=start_line, end_line=end_line, **extracted_response)
             modified_code_files.append(modified_code_file)

Start Line: 88
End Line: 93

This review is based on a thorough analysis of the provided code diff and pull request information against the given rules. The change made is actually an improvement in code style by removing unnecessary whitespace. However, this doesn't constitute a major violation of any of the specified rules.

No other clear and major violations of the provided rules were identified in the given code diff. The PR seems to focus on adding support for o1 models and structured output, which is not directly related to the code changes shown in the diff.

It's worth noting that without seeing the full context of the file, it's challenging to make a comprehensive assessment of all the rules. The diff only shows a small portion of the code, which limits the scope of the review.

File Changed: pyproject.toml

Details: The version specification for patched-code2prompt is using an imprecise development version format.

Affected Code Snippet:

patched-code2prompt = "~0.9.0.dev3"

Start Line: 36
End Line: 36

Details: The patchwork-cli package version is not following the correct development version format.

Affected Code Snippet:

[tool.poetry]
name = "patchwork-cli"
version = "0.0.79"

Start Line: 1
End Line: 3

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 6 out of 6 changed files in this pull request and generated no suggestions.

Comments skipped due to low confidence (3)

patchwork/common/client/llm/openai_.py:32

  • The line contains a tab character which should be removed.
gpt-4	

patchwork/common/client/llm/openai_.py:115

  • The line is unreachable due to the preceding return statement.
return self.client.chat.completions.create(**NotGiven.remove_not_given(input_kwargs))

patchwork/common/client/llm/openai_.py:177

  • The string concatenation should use f-string formatting for consistency.
"content": f"Given the following data, format it with the given response format: {o1_choice.message.content}"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants