Bug: Removal of mii.pydantic_v1 broke entrypoint scripts #543

Open
KMouratidis opened this issue Nov 11, 2024 · 3 comments

When trying to run either of the entrypoint scripts, the following line in the mii/entrypoints/data_models.py file causes an import error:

from mii.pydantic_v1 import BaseModel, BaseSettings, Field

Restoring the file fixes the issue, but I guess that's not a solution. Another option is to:

  1. Install pydantic-settings and change that line to:

     from pydantic import BaseModel, Field, ConfigDict
     from pydantic_settings import BaseSettings

  2. Since pydantic 2 protects the model_ namespace, about 10 classes need model_config = ConfigDict(protected_namespaces=()) (or similar) added to them, namely the ChatCompletion*, TokenCheckRequestItem, Embeddings*, Completion*, and AppSettings classes (see the sketch after this list).
  3. AppSettings.model_id needs to be marked as Optional[str] since it is not passed on initialization (see here and here).
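
For item 2, a minimal sketch (not from the original thread) of what that opt-out looks like on one of the affected classes; ConfigDict(protected_namespaces=()) is standard pydantic 2 API:

from typing import Optional

from pydantic import ConfigDict
from pydantic_settings import BaseSettings


class AppSettings(BaseSettings):
    # Opt out of pydantic 2's protected "model_" namespace; without this,
    # defining the model_id field below emits a UserWarning about the
    # conflict with the protected namespace "model_".
    model_config = ConfigDict(protected_namespaces=())

    model_id: Optional[str] = None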

I only tested with the openai_api_server.py script.

KMouratidis (Author) commented

Restoring the file does not work because pydantic.v1.BaseModel causes the server to throw 422 errors. I guess the only option is to update the file 😕

loadams (Contributor) commented Nov 19, 2024

Hi @KMouratidis - what DeepSpeed/DeepSpeed-MII versions are you using? Can you share those? We've had full Pydantic 2 support for a while.

loadams self-assigned this Nov 19, 2024
KMouratidis (Author) commented

I'm using version 0.3.1 of deepspeed-mii (installed manually via pip install "deepspeed-mii==0.3.1"), and deepspeed 0.15.4 is automatically installed as a dependency. The pydantic version is 2.9.2, and Python is 3.10.12.
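
For reference, a quick way to confirm those versions in a given environment (a sketch; importlib.metadata is standard library, and the names below are the PyPI distribution names):

from importlib.metadata import version

# Print the installed version of each relevant distribution.
for pkg in ("deepspeed-mii", "deepspeed", "pydantic", "pydantic-settings"):
    print(pkg, version(pkg))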

Here are the changes that worked for my Docker image:

  1. Install pydantic-settings
  2. Make these changes to data_models.py:
...
from pydantic import BaseModel, Field
from pydantic_settings import BaseSettings  # <--- split imports and use pydantic_settings

...

class AppSettings(BaseSettings):
    model_id: Optional[str] = None  # <--- needs to be made Optional
    api_keys: Optional[List[str]] = None
    deployment_name: str = "deepspeed-mii"
    response_role: Optional[str] = "assistant"  # <--- had typo
Complete working file, in case I missed something:
# Copyright (c) Microsoft Corporation.
# SPDX-License-Identifier: Apache-2.0

# DeepSpeed Team

## Adapted from https://github.com/lm-sys/FastChat/blob/af4dfe3f0ed481700265914af61b86e0856ac2d9/fastchat/protocol/openai_api_protocol.py
from typing import Literal, Optional, List, Dict, Any, Union

import time

import shortuuid
from pydantic import BaseModel, Field
from pydantic_settings import BaseSettings


class ErrorResponse(BaseModel):
    object: str = "error"
    message: str
    code: int


class ModelPermission(BaseModel):
    id: str = Field(default_factory=lambda: f"modelperm-{shortuuid.random()}")
    object: str = "model_permission"
    created: int = Field(default_factory=lambda: int(time.time()))
    allow_create_engine: bool = False
    allow_sampling: bool = True
    allow_logprobs: bool = True
    allow_search_indices: bool = True
    allow_view: bool = True
    allow_fine_tuning: bool = False
    organization: str = "*"
    group: Optional[str] = None
    is_blocking: bool = False  # fixed annotation: the default is a bool (upstream had "str")


class ModelCard(BaseModel):
    id: str
    object: str = "model"
    created: int = Field(default_factory=lambda: int(time.time()))
    owned_by: str = "deepspeed-mii"
    root: Optional[str] = None
    parent: Optional[str] = None
    permission: List[ModelPermission] = []


class ModelList(BaseModel):
    object: str = "list"
    data: List[ModelCard] = []


class UsageInfo(BaseModel):
    prompt_tokens: int = 0
    total_tokens: int = 0
    completion_tokens: Optional[int] = 0


class LogProbs(BaseModel):
    text_offset: List[int] = Field(default_factory=list)
    token_logprobs: List[Optional[float]] = Field(default_factory=list)
    tokens: List[str] = Field(default_factory=list)
    top_logprobs: List[Optional[Dict[str, float]]] = Field(default_factory=list)


class ChatCompletionRequest(BaseModel):
    model: str
    messages: Union[str, List[Dict[str, str]]]
    temperature: Optional[float] = 0.7
    top_p: Optional[float] = 0.9
    top_k: Optional[int] = None
    n: Optional[int] = 1
    min_tokens: Optional[int] = None
    max_tokens: Optional[int] = 128
    stop: Optional[Union[str, List[str]]] = None
    stream: Optional[bool] = False
    add_generation_prompt: Optional[bool] = True
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    user: Optional[str] = None


class ChatMessage(BaseModel):
    role: str
    content: str


class ChatCompletionResponseChoice(BaseModel):
    index: int
    message: ChatMessage
    finish_reason: Optional[Literal["stop", "length"]] = None


class ChatCompletionResponse(BaseModel):
    id: str = Field(default_factory=lambda: f"chatcmpl-{shortuuid.random()}")
    object: str = "chat.completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[ChatCompletionResponseChoice]
    usage: UsageInfo


class DeltaMessage(BaseModel):
    role: Optional[str] = None
    content: Optional[str] = None


class ChatCompletionResponseStreamChoice(BaseModel):
    index: int
    delta: DeltaMessage
    finish_reason: Optional[Literal["stop", "length"]] = None


class ChatCompletionStreamResponse(BaseModel):
    id: str = Field(default_factory=lambda: f"chatcmpl-{shortuuid.random()}")
    object: str = "chat.completion.chunk"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[ChatCompletionResponseStreamChoice]


class TokenCheckRequestItem(BaseModel):
    model: str
    prompt: str
    max_tokens: int


class TokenCheckRequest(BaseModel):
    prompts: List[TokenCheckRequestItem]


class TokenCheckResponseItem(BaseModel):
    fits: bool
    tokenCount: int
    contextLength: int


class TokenCheckResponse(BaseModel):
    prompts: List[TokenCheckResponseItem]


class EmbeddingsRequest(BaseModel):
    model: Optional[str] = None
    engine: Optional[str] = None
    input: Union[str, List[Any]]
    user: Optional[str] = None
    encoding_format: Optional[str] = None


class EmbeddingsResponse(BaseModel):
    object: str = "list"
    data: List[Dict[str, Any]]
    model: str
    usage: UsageInfo


class CompletionRequest(BaseModel):
    model: Optional[str] = None
    prompt: Union[str, List[Any]]
    max_length: Optional[int] = 32768
    suffix: Optional[str] = None
    temperature: Optional[float] = 0.7
    n: Optional[int] = 1
    min_tokens: Optional[int] = None
    max_tokens: Optional[int] = 128
    stop: Optional[Union[str, List[str]]] = None
    stream: Optional[bool] = False
    top_p: Optional[float] = 0.9
    top_k: Optional[int] = None
    logprobs: Optional[int] = None
    echo: Optional[bool] = False
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    user: Optional[str] = None
    use_beam_search: Optional[bool] = False
    best_of: Optional[int] = None


class CompletionResponseChoice(BaseModel):
    index: int
    text: str
    logprobs: Optional[LogProbs] = None
    finish_reason: Optional[Literal["stop", "length"]] = None


class CompletionResponse(BaseModel):
    id: str = Field(default_factory=lambda: f"cmpl-{shortuuid.random()}")
    object: str = "text_completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[CompletionResponseChoice]
    usage: UsageInfo


class CompletionResponseStreamChoice(BaseModel):
    index: int
    text: str
    logprobs: Optional[LogProbs] = None
    finish_reason: Optional[Literal["stop", "length"]] = None


class CompletionStreamResponse(BaseModel):
    id: str = Field(default_factory=lambda: f"cmpl-{shortuuid.random()}")
    object: str = "text_completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[CompletionResponseStreamChoice]


class AppSettings(BaseSettings):
    model_id: Optional[str] = None
    api_keys: Optional[List[str]] = None
    deployment_name: str = "deepspeed-mii"
    response_role: Optional[str] = "assistant"

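As a hypothetical smoke test (not from the original thread): because AppSettings subclasses BaseSettings, it reads its values from environment variables and must be constructible with no arguments, which is why model_id has to be Optional. Assuming the file above is importable as data_models:

import os

# BaseSettings maps environment variables to fields case-insensitively,
# so MODEL_ID populates AppSettings.model_id.
os.environ["MODEL_ID"] = "example/model"  # hypothetical value

from data_models import AppSettings, ChatCompletionRequest

settings = AppSettings()
assert settings.model_id == "example/model"
assert settings.response_role == "assistant"

# Exercise pydantic 2 construction/serialization on one of the request models.
req = ChatCompletionRequest(model=settings.model_id,
                            messages=[{"role": "user", "content": "Hello"}])
print(req.model_dump_json(indent=2))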