Bug: Removal of mii.pydantic_v1 broke entrypoint scripts #543
When trying to run either of the entrypoint scripts, the import in the `mii/entrypoints/data_models.py` file that pulls from the removed `mii.pydantic_v1` module causes an import error. Restoring the file fixes the issue, but I guess that's not a solution. Another option is to:

- install `pydantic-settings` and change that line to import `BaseSettings` from `pydantic_settings` instead;
- since Pydantic 2 has the `model_` namespace protected, add `model_config = ConfigDict(protected_namespaces=())` (or similar) to about 10 classes, namely the `ChatCompletion*`, `TokenCheckRequestItem`, `Embeddings*`, `Completion*`, and `AppSettings` classes (see the sketch after this list);
- mark `AppSettings.model_id` as `Optional[str]`, since it is not passed on initialization (see here and here).

I only tested with the `openai_api_server.py` script.
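For illustration, a minimal sketch of the second and third bullets, assuming Pydantic 2 with the `pydantic-settings` package installed (the class names mirror the file posted in the comments below; this is a sketch, not the repo's actual patch):

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict
from pydantic_settings import BaseSettings


class ChatCompletionRequest(BaseModel):
    # Opt out of Pydantic 2's protected "model_" namespace so that
    # model-related fields in these classes stop triggering warnings.
    model_config = ConfigDict(protected_namespaces=())

    model: str


class AppSettings(BaseSettings):
    model_config = ConfigDict(protected_namespaces=())

    # Optional, because the entrypoint scripts do not pass model_id
    # when the settings object is constructed.
    model_id: Optional[str] = None
```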
Comments

Restoring the file does not work because …
Hi @KMouratidis - what DeepSpeed/DeepSpeed-MII versions are you using? Can you share those, since we updated to full Pydantic 2 support a while ago.
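(One way to check the installed versions, assuming both packages were installed via pip under their PyPI names:)

```python
# Print the installed versions of both packages; assumes they were
# installed via pip as "deepspeed" and "deepspeed-mii".
from importlib.metadata import version

for dist in ("deepspeed", "deepspeed-mii"):
    print(dist, version(dist))
```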
I'm using version …. Here are the changes that worked for my Docker image:

```python
...
from pydantic import BaseModel, Field
from pydantic_settings import BaseSettings  # <--- split imports and use pydantic_settings
...

class AppSettings(BaseSettings):
    model_id: Optional[str] = None  # <--- needs to be made Optional
    api_keys: Optional[List[str]] = None
    deployment_name: str = "deepspeed-mii"
    response_role: Optional[str] = "assistant"  # <--- had typo
```

Complete working file, in case I missed something:

```python
# Copyright (c) Microsoft Corporation.
# SPDX-License-Identifier: Apache-2.0

# DeepSpeed Team

## Adapted from https://github.com/lm-sys/FastChat/blob/af4dfe3f0ed481700265914af61b86e0856ac2d9/fastchat/protocol/openai_api_protocol.py

from typing import Literal, Optional, List, Dict, Any, Union
import time

import shortuuid
from pydantic import BaseModel, Field
from pydantic_settings import BaseSettings


class ErrorResponse(BaseModel):
    object: str = "error"
    message: str
    code: int


class ModelPermission(BaseModel):
    id: str = Field(default_factory=lambda: f"modelperm-{shortuuid.random()}")
    object: str = "model_permission"
    created: int = Field(default_factory=lambda: int(time.time()))
    allow_create_engine: bool = False
    allow_sampling: bool = True
    allow_logprobs: bool = True
    allow_search_indices: bool = True
    allow_view: bool = True
    allow_fine_tuning: bool = False
    organization: str = "*"
    group: Optional[str] = None
    is_blocking: bool = False  # annotated as str upstream; bool matches the default


class ModelCard(BaseModel):
    id: str
    object: str = "model"
    created: int = Field(default_factory=lambda: int(time.time()))
    owned_by: str = "deepspeed-mii"
    root: Optional[str] = None
    parent: Optional[str] = None
    permission: List[ModelPermission] = []


class ModelList(BaseModel):
    object: str = "list"
    data: List[ModelCard] = []


class UsageInfo(BaseModel):
    prompt_tokens: int = 0
    total_tokens: int = 0
    completion_tokens: Optional[int] = 0


class LogProbs(BaseModel):
    text_offset: List[int] = Field(default_factory=list)
    token_logprobs: List[Optional[float]] = Field(default_factory=list)
    tokens: List[str] = Field(default_factory=list)
    top_logprobs: List[Optional[Dict[str, float]]] = Field(default_factory=list)


class ChatCompletionRequest(BaseModel):
    model: str
    messages: Union[str, List[Dict[str, str]]]
    temperature: Optional[float] = 0.7
    top_p: Optional[float] = 0.9
    top_k: Optional[int] = None
    n: Optional[int] = 1
    min_tokens: Optional[int] = None
    max_tokens: Optional[int] = 128
    stop: Optional[Union[str, List[str]]] = None
    stream: Optional[bool] = False
    add_generation_prompt: Optional[bool] = True
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    user: Optional[str] = None


class ChatMessage(BaseModel):
    role: str
    content: str


class ChatCompletionResponseChoice(BaseModel):
    index: int
    message: ChatMessage
    finish_reason: Optional[Literal["stop", "length"]] = None


class ChatCompletionResponse(BaseModel):
    id: str = Field(default_factory=lambda: f"chatcmpl-{shortuuid.random()}")
    object: str = "chat.completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[ChatCompletionResponseChoice]
    usage: UsageInfo


class DeltaMessage(BaseModel):
    role: Optional[str] = None
    content: Optional[str] = None


class ChatCompletionResponseStreamChoice(BaseModel):
    index: int
    delta: DeltaMessage
    finish_reason: Optional[Literal["stop", "length"]] = None


class ChatCompletionStreamResponse(BaseModel):
    id: str = Field(default_factory=lambda: f"chatcmpl-{shortuuid.random()}")
    object: str = "chat.completion.chunk"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[ChatCompletionResponseStreamChoice]


class TokenCheckRequestItem(BaseModel):
    model: str
    prompt: str
    max_tokens: int


class TokenCheckRequest(BaseModel):
    prompts: List[TokenCheckRequestItem]


class TokenCheckResponseItem(BaseModel):
    fits: bool
    tokenCount: int
    contextLength: int


class TokenCheckResponse(BaseModel):
    prompts: List[TokenCheckResponseItem]


class EmbeddingsRequest(BaseModel):
    model: Optional[str] = None
    engine: Optional[str] = None
    input: Union[str, List[Any]]
    user: Optional[str] = None
    encoding_format: Optional[str] = None


class EmbeddingsResponse(BaseModel):
    object: str = "list"
    data: List[Dict[str, Any]]
    model: str
    usage: UsageInfo


class CompletionRequest(BaseModel):
    model: Optional[str] = None
    prompt: Union[str, List[Any]]
    max_length: Optional[int] = 32768
    suffix: Optional[str] = None
    temperature: Optional[float] = 0.7
    n: Optional[int] = 1
    min_tokens: Optional[int] = None
    max_tokens: Optional[int] = 128
    stop: Optional[Union[str, List[str]]] = None
    stream: Optional[bool] = False
    top_p: Optional[float] = 0.9
    top_k: Optional[int] = None
    logprobs: Optional[int] = None
    echo: Optional[bool] = False
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    user: Optional[str] = None
    use_beam_search: Optional[bool] = False
    best_of: Optional[int] = None


class CompletionResponseChoice(BaseModel):
    index: int
    text: str
    logprobs: Optional[LogProbs] = None
    finish_reason: Optional[Literal["stop", "length"]] = None


class CompletionResponse(BaseModel):
    id: str = Field(default_factory=lambda: f"cmpl-{shortuuid.random()}")
    object: str = "text_completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[CompletionResponseChoice]
    usage: UsageInfo


class CompletionResponseStreamChoice(BaseModel):
    index: int
    text: str
    logprobs: Optional[LogProbs] = None
    finish_reason: Optional[Literal["stop", "length"]] = None


class CompletionStreamResponse(BaseModel):
    id: str = Field(default_factory=lambda: f"cmpl-{shortuuid.random()}")
    object: str = "text_completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[CompletionResponseStreamChoice]


class AppSettings(BaseSettings):
    model_id: Optional[str] = None
    api_keys: Optional[List[str]] = None
    deployment_name: str = "deepspeed-mii"
    response_role: Optional[str] = "assistant"
```
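As a quick sanity check that the updated models load under Pydantic 2, something like the following should run (a sketch: it assumes the file above is saved as `data_models.py`, and the model name is a placeholder):

```python
# Hypothetical smoke test; "data_models" and the model name are placeholders.
from data_models import AppSettings, ChatCompletionRequest

# model_id is Optional, so construction without arguments now works.
settings = AppSettings()
print(settings.deployment_name)  # deepspeed-mii

request = ChatCompletionRequest(
    model="example/model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(request.model_dump_json())
```

Depending on the Pydantic version, fields like `model_id` may still emit protected-namespace warnings unless `model_config = ConfigDict(protected_namespaces=())` is added as described above.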
script.The text was updated successfully, but these errors were encountered: