[backend] Expose suggested models #419
Merged
veekaybee
reviewed
Nov 21, 2024
ividal
reviewed
Nov 21, 2024
veekaybee
reviewed
Nov 22, 2024
veekaybee
approved these changes
Nov 22, 2024
Looks good after the final round of addressed changes, thanks for adding this!
Introduce a new `/models` endpoint that returns a list of suggested models for a given task. Currently, the only supported task is summarization. Refs #381 Signed-off-by: Dimitris Poulopoulos <[email protected]>
Move the suggested models list into a YAML file. Read the YAML file at runtime to respond appropriately. Refs #381 Signed-off-by: Dimitris Poulopoulos <[email protected]>
GPT-4-Turbo is a legacy OpenAI model. Replace it with GPT-4o, which is the latest iteration of OpenAI's flagship model. Signed-off-by: Dimitris Poulopoulos <[email protected]>
Extend the models YAML file to include model-specific information. Add model-specific details, such as the number of parameters, memory footprint, and tensor type (e.g., FP32, BF16, etc.), along with a set of default values used by Lumigator when calling the models. Not all parameters are applicable to each model. For example, Hugging Face models have default values for parameters like `max_length`, while API models include parameters like `temperature`. In general, parameters are model-specific, so there is no universal set of parameter names. The defaults are inferred either by checking the documentation for each model or by examining the configuration file on the Hugging Face Hub. Refs #381 Signed-off-by: Dimitris Poulopoulos <[email protected]>
Use the models YAML file to retrieve the list of supported tasks, instead of hardcoding it in the code. Refs #381 Signed-off-by: Dimitris Poulopoulos <[email protected]>
Validate the response of the `/models` endpoint using Pydantic. For each model, there are a few required fields:
* name
* uri
* description
There are also some optional fields, since not all models have available information (e.g., we don't have the model size for OpenAI models) or default parameters. Finally, the default parameters dictionary is a generic `dict`, as there is no guaranteed uniformity. Refs #381 Signed-off-by: Dimitris Poulopoulos <[email protected]>
Signed-off-by: Dimitris Poulopoulos <[email protected]>
Retrieve the suggested models per task via the Lumigator SDK client. Refs #381 Signed-off-by: Dimitris Poulopoulos <[email protected]>
Signed-off-by: Dimitris Poulopoulos <[email protected]>
Add a page to the documentation listing the suggested models and their details, along with instructions on how to retrieve them using the API. Closes #381 Signed-off-by: Dimitris Poulopoulos <[email protected]>
Signed-off-by: Dimitris Poulopoulos <[email protected]>
Labels
api
Changes which impact API/presentation layer
backend
documentation
Improvements or additions to documentation
enhancement
New feature or request
schemas
Changes to schemas (which may be public facing)
sdk
What's changing
This PR introduces a new `/models` endpoint that returns a list of suggested models for a given task. Currently, the only supported task is "summarization." The suggested models are stored in a YAML file, along with details such as number of parameters, memory footprint, and default values. Not all information is available for every model. For example, OpenAI models do not provide memory footprint data, while other models may not include summarization-specific default parameters.
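To make the required-versus-optional split concrete, here is a minimal sketch of what a single model entry might look like once loaded. The commits validate the response with Pydantic; this sketch uses a plain stdlib dataclass as a dependency-free stand-in. The required fields (`name`, `uri`, `description`) come from the commit message, while `info`, `default_parameters`, and the `parse_model` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Required fields as listed in the commit message.
REQUIRED_FIELDS = ("name", "uri", "description")


@dataclass
class SuggestedModel:
    # Required for every model.
    name: str
    uri: str
    description: str
    # Optional: not every model has this information.
    # These two attribute names are assumptions, not the PR's actual schema.
    info: Optional[dict[str, Any]] = None
    # Kept as a generic dict because parameter names differ per model.
    default_parameters: Optional[dict[str, Any]] = None


def parse_model(entry: dict[str, Any]) -> SuggestedModel:
    """Build a SuggestedModel, raising ValueError if a required field is missing."""
    missing = [key for key in REQUIRED_FIELDS if key not in entry]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    return SuggestedModel(
        name=entry["name"],
        uri=entry["uri"],
        description=entry["description"],
        info=entry.get("info"),
        default_parameters=entry.get("default_parameters"),
    )
```

An entry that carries only the three required keys parses with `info` and `default_parameters` left as `None`, which mirrors the case of models that do not publish size or memory details.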
Additionally, this PR adds a new SDK function for calling the `/models` endpoint via Python, along with the corresponding documentation.
How to test it
make local-up
curl -s http://localhost:8000/api/v1/models/summarization | jq
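For reviewers who prefer Python to `curl`, the same request can be sketched with the standard library alone. This is not the SDK function added by this PR (its name is not shown here); `models_url` and `fetch_suggested_models` are hypothetical helpers, and the base URL is taken from the `curl` command above.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

# Base URL taken from the curl command; adjust if the backend runs elsewhere.
BASE_URL = "http://localhost:8000/api/v1"


def models_url(task: str, base_url: str = BASE_URL) -> str:
    """Build the URL of the suggested-models endpoint for a task."""
    return f"{base_url.rstrip('/')}/models/{urllib.parse.quote(task, safe='')}"


def fetch_suggested_models(
    task: str, base_url: str = BASE_URL, timeout: float = 10.0
) -> Any:
    """GET the endpoint and decode the JSON body.

    The response shape is not assumed here, so the decoded JSON is returned as-is.
    urllib raises urllib.error.HTTPError for non-2xx responses.
    """
    with urllib.request.urlopen(models_url(task, base_url), timeout=timeout) as response:
        return json.load(response)
```

With the backend running via `make local-up`, calling `fetch_suggested_models("summarization")` should return the same JSON that the `curl` command pipes to `jq`.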
Additional notes for reviewers
If you request the suggested models for a task that Lumigator doesn't support (e.g., `curl -s http://localhost:8000/api/v1/models/translation | jq`), you should get an HTTP 400 error.
I already...
Closes #381
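The unsupported-task behavior noted for reviewers can be sketched as a lookup against the task names found in the models file, so that the list of supported tasks is never hardcoded. This is a rough illustration, not the backend's actual route handler: the `SUGGESTED_MODELS` mapping stands in for the parsed YAML file, and the function name, placeholder entry, and response body shapes are all assumptions.

```python
from http import HTTPStatus
from typing import Any

# Stand-in for the parsed YAML file: task name -> list of model entries.
# The entry below is a placeholder, not a real suggested model.
SUGGESTED_MODELS: dict[str, list[dict[str, Any]]] = {
    "summarization": [
        {
            "name": "example-model",
            "uri": "hf://example-org/example-model",
            "description": "Placeholder entry for illustration.",
        }
    ]
}


def suggested_models_response(
    task: str, models_by_task: dict[str, list[dict[str, Any]]]
) -> tuple[int, dict[str, Any]]:
    """Return (status_code, body) for a suggested-models request."""
    # Supported tasks are whatever keys the models file defines.
    if task not in models_by_task:
        supported = ", ".join(sorted(models_by_task))
        detail = f"Unsupported task '{task}'. Supported tasks: {supported}"
        return int(HTTPStatus.BAD_REQUEST), {"detail": detail}
    return int(HTTPStatus.OK), {"items": models_by_task[task]}
```

Deriving the supported tasks from the data means that adding a new task to the file is enough for the lookup to start accepting it.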