meowlflow makes it easy to deploy MLflow models as HTTP APIs powered by FastAPI.
meowlflow allows model creators to design expressive HTTP APIs by defining the input and output schemas for their models and takes care of translating requests to MLflow's expected format.
meowlflow also provides built-in observability for model APIs with Prometheus metrics, OpenAPI specifications for model APIs, and an opinionated model promotion workflow.
pip install .
Using meowlflow, you can serve your MLflow model with a custom schema with one command:
meowlflow serve --endpoint infer \
--model-path path/to/model \
--host 0.0.0.0 \
--port 8000 \
--schema-path path/to/schema.py
meowlflow will make your MLflow model available at http://127.0.0.1:8000/api/v1/infer with your custom schema.
Note: the --model-path flag can be either a URI referencing a local file path or a URI pointing at a model on a remote artifact store.
You can then make an HTTP request to send samples for scoring. For example, if your schema defines a request as a list of strings:
curl http://127.0.0.1:8000/api/v1/infer -H "Content-Type: application/json" -d '["meow", "meowv2"]'
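The same request can also be sent from Python. The following is a minimal sketch using only the standard library. The helper names build_infer_request and infer are hypothetical, and the sketch assumes the server started above is reachable at http://127.0.0.1:8000 and replies with JSON.

```python
import json
from typing import Any, List
from urllib import request as urlrequest

# Assumption: `meowlflow serve` is running locally with --endpoint infer and --port 8000.
INFER_URL = "http://127.0.0.1:8000/api/v1/infer"


def build_infer_request(samples: List[str], url: str = INFER_URL) -> urlrequest.Request:
    """Build the same POST request that the curl command sends."""
    body = json.dumps(samples).encode("utf-8")
    return urlrequest.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def infer(samples: List[str], url: str = INFER_URL) -> Any:
    """Send the samples for scoring and decode the JSON response."""
    with urlrequest.urlopen(build_infer_request(samples, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, infer(["meow", "meowv2"]) would return the decoded response body, whose shape depends on your schema.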
Thanks to some FastAPI magic, documentation for the model's API is automatically generated and available at http://127.0.0.1:8000/docs for all models that are served with meowlflow serve.
Alternatively, you can use meowlflow sidecar to provide an expressive API on top of your existing MLflow model deployment.
This meowlflow proxy allows you to upgrade a legacy model served with mlflow models serve so that it can receive HTTP requests with an API that is easier to use.
For example, suppose you have deployed an MLflow model that receives inputs at http://127.0.0.1:5000/invocations.
Using meowlflow sidecar, you can then serve a proxy API that listens on port 8000 and supports your custom schema by running:
meowlflow sidecar --endpoint infer \
--upstream http://127.0.0.1:5000/invocations \
--host 0.0.0.0 \
--port 8000 \
--schema-path path/to/schema.py
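Conceptually, the sidecar sits between the client and the MLflow server: it accepts a request in your custom schema, reshapes it for the upstream, and reshapes the upstream's reply for the client. The sketch below illustrates that flow in plain Python and is not meowlflow's actual implementation. All function names are hypothetical, the upstream call is injected so the example runs without a server, and the {"instances": ...} body is an assumption about the payload format your MLflow version accepts.

```python
from typing import Any, Callable, Dict, List

# Hypothetical type: a function that posts a JSON-serializable body to the
# upstream (e.g. http://127.0.0.1:5000/invocations) and returns the decoded reply.
UpstreamCall = Callable[[Dict[str, Any]], Any]


def to_upstream(samples: List[str]) -> Dict[str, Any]:
    """Plays the role of the schema's Request.transform: client payload -> upstream body.

    Assumption: the upstream accepts an {"instances": [...]} body; adjust this
    to whatever format your MLflow server expects.
    """
    return {"instances": samples}


def to_client(model_output: Any) -> Dict[str, Any]:
    """Plays the role of the schema's Response.transform: upstream reply -> client response."""
    return {"predictions": model_output}


def proxy(samples: List[str], call_upstream: UpstreamCall) -> Dict[str, Any]:
    """Translate the request, forward it, and translate the reply."""
    return to_client(call_upstream(to_upstream(samples)))


def fake_upstream(body: Dict[str, Any]) -> List[int]:
    """Stand-in for the MLflow server: returns the length of each sample."""
    return [len(sample) for sample in body["instances"]]


result = proxy(["meow", "meowv2"], fake_upstream)
```

Keeping the two translation steps separate from the transport mirrors the split between the schema file and the proxy: only the schema needs to change when the API shape changes.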
You can then make an HTTP request to send samples for scoring. For example, if your schema defines a request as a list of strings:
curl http://127.0.0.1:8000/api/v1/infer -H "Content-Type: application/json" -d '["meow", "meowv2"]'
Just as with the meowlflow serve command, documentation for the model's API is automatically generated and available at http://127.0.0.1:8000/docs.
The meowlflow openapi command outputs an OpenAPI v3 schema in JSON format that fully describes the HTTP API of a model.
This automatically generated API schema allows you to generate complete clients in any programming language to interact with a model.
Generating clients this way spares you from hand-writing a client for every piece of software that uses the model, so you can focus on business logic instead.
For example, if you needed to generate a Python package for a service that makes requests to your model, you could use the openapi-python-client package:
openapi-python-client generate --path <(meowlflow openapi --model-path path/to/model)
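Before generating a client, it can help to check which operations the schema exposes. The sketch below is a hypothetical helper that lists the operations in an OpenAPI v3 document. The inline sample_spec is an invented stand-in for the JSON that meowlflow openapi prints, not its real output.

```python
import json
from typing import Any, Dict, List, Tuple

# HTTP methods that may appear as operation keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return sorted (METHOD, path) pairs for every operation in the spec."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            # Skip non-operation keys such as "parameters" or "summary".
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path))
    return sorted(operations)


# Invented sample, standing in for the JSON printed by `meowlflow openapi`.
sample_spec = json.loads("""
{
  "openapi": "3.0.2",
  "info": {"title": "Document Splitter", "version": "0.1.0"},
  "paths": {
    "/api/v1/infer": {"post": {"summary": "Infer"}}
  }
}
""")

operations = list_operations(sample_spec)
```

In practice you would load the spec from the command's output, for example by redirecting it to a file and reading that file with json.load.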
A core concept in meowlflow is the model schema.
Model schemas are used to define the shape of requests and responses for your model's API.
Defining a schema is done by creating a schema.py file containing both a Request and a Response class and placing the file somewhere meowlflow can read it, for instance at /var/lib/meowlflow/schema.py.
The Request class must implement a transform method to format the payload in a shape that can be used by your MLflow model.
The Response class should implement a transform class method to convert the model output to your desired response shape.
For example, you could use the following custom schema for an API for a model that predicts document boundaries:
from typing import Any, List

from pydantic import Field

from meowlflow.api.base import (
    BaseRequest,
    BaseResponse,
)

description = (
    "A model that predicts document boundaries, where the length of the prediction "
    "array is equal to the number of pages in the input, a '0' at a given index means "
    "the page belongs to the current document, and a '1' marks the start of a new "
    "document on the given index."
)
title = "Document Splitter"
version = "0.1.0"


class Request(BaseRequest):
    __root__: List[str] = Field(..., min_items=1)

    def transform(self) -> Any:
        return self.__root__

    class Config:
        schema_extra = {
            "example": [
                "page 1 of 2\nfoo",
                "page 2 of 2\nbar",
                "page 1 of 1\nbaz",
            ]
        }


class Response(BaseResponse):
    predictions: List[int]

    @classmethod
    def transform(cls, data: Any) -> Any:
        return {"predictions": data}

    class Config:
        schema_extra = {"example": {"predictions": [1, 0, 1]}}
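To make the response format concrete, a client can turn the predictions array back into documents. The sketch below is a hypothetical client-side helper and not part of meowlflow. It follows the convention stated in the description above, where a 1 starts a new document and a 0 continues the current one.

```python
from typing import List


def group_pages(pages: List[str], predictions: List[int]) -> List[List[str]]:
    """Group pages into documents using the boundary predictions."""
    if len(pages) != len(predictions):
        raise ValueError("expected one prediction per page")
    documents: List[List[str]] = []
    for page, starts_new in zip(pages, predictions):
        # A 1 opens a new document; the first page always opens one.
        if starts_new == 1 or not documents:
            documents.append([])
        documents[-1].append(page)
    return documents


# Inputs taken from the example request and response in the schema above.
pages = ["page 1 of 2\nfoo", "page 2 of 2\nbar", "page 1 of 1\nbaz"]
documents = group_pages(pages, [1, 0, 1])
```

Here the first two pages end up in one document and the third page in another, matching the example request and response in the schema.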
The easiest way to develop and fine-tune a schema and API for your model is to:
- use the meowlflow serve command with the --model-path flag set to a remote URI, e.g. s3://mlflow/prod/artifacts/2/08c...a85/artifacts/model;
- open http://127.0.0.1:8000/docs, or wherever meowlflow is running, in a browser; and
- use the Try it out feature of the OpenAPI documentation to send HTTP requests to your model directly from the browser.