Currently, MLServer does not deal well with models that are loaded on the same inference pool as other, heavily used models. In this case there is a risk of starvation, so we want to allow the user to create models on separate processes (a different inference pool).
MLServer already supports this, but only if the model uses a specific custom environment tarball.
Proposed solution: Introduce `inference_pool_gid` on the `ModelParameters` class. This gives the user the option to add the model to a dedicated inference pool group. Note that this allows adding either a single model or multiple models to the same group. I also included an `autogenerate_inference_pool_gid` boolean flag: when it is set to `True` and no `inference_pool_gid` is provided, a gid is generated using `uuid4`, placing the model in its own dedicated group so the user does not have to specify the gid themselves.
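For illustration, a minimal sketch of how the new parameters could be used, assuming the proposed fields land on `ModelParameters` as described above (the `uri` values are placeholders):

```python
from mlserver.settings import ModelParameters

# Explicit group: models sharing this gid run on the same dedicated
# inference pool, isolated from the default pool.
shared = ModelParameters(
    uri="./model-a.joblib",
    inference_pool_gid="pool-group-1",  # proposed field
)

# Auto-generated group: with the proposed flag set and no gid provided,
# MLServer would generate one via uuid4, giving the model its own pool.
isolated = ModelParameters(
    uri="./model-b.joblib",
    autogenerate_inference_pool_gid=True,  # proposed flag
)
```

The same settings would typically be supplied through a model's `model-settings.json`, so no code changes are needed on the user's side beyond configuration.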