Installation Guide

Overview

Provides a hosted MLflow Tracking Server meant for deployment into an Anaconda Enterprise environment.

Configuration

The below variables control the configuration of the tracking server and related components. See (Deployment) for information on when each is needed and created.

These should be defined as AE5 secrets within the service account running the tracking server. Alternatively they can also be set within the anaconda-project.yml project files.

Variables

MLFLOW_BACKEND_STORE_URI

Description
- SQLAlchemy compliant connection string
Details
- The backend store URI will most likely contain credentials for the connection and should not be exposed within anaconda-project.yml as plain text.
- For additional details see: SQLAlchemy - Engine Configuration and MLflow - Backend Stores.
Default
- sqlite:///data/mlflow/standalone/store/mydb.sqlite
MLFLOW_ARTIFACTS_DESTINATION

Description
- Artifact location storage URI
Details
- The artifact destination may not be sensitive and can be set as an ae5 secret for ease of configuration, or directly within the anaconda-project.yml for the tracking server project.
- See MLflow Artifact Stores for supported stores (this can be as simple as a locally mounted volume).
Default
- data/mlflow/standalone/artifacts
MLFLOW_TRACKING_URI

Description

Remote Tracking Server URI

Details
- This should be the static endpoint assigned to the private project deployment.
MLFLOW_REGISTRY_URI

Description

Model Registry URI

Details
- This should be the static endpoint assigned to the private project deployment.
MLFLOW_TRACKING_TOKEN

Description

AE5 Private Endpoint Access Token

Details
- This is used for authorization to the MLfLow services.
MLFLOW_TRACKING_GC_TTL

Description

MLfLow compliant string for the time limit

Details
- See MLflow Tracking Server Garbage Collection and GC Older Than for additional details.
Default

30d0h0m0s

Deployment

Create Dedicated Service Account
- The server SHOULD be uploaded and deployed using a dedicated service (ae5 user) account.
This allows for centralized configuration and management of the different components and isolation from other users. Many of the configuration parameters can be set as ae5 secrets, and it is recommended to do so for most if not all of them in the following steps.
Configure Python Environment

The deployment environment MUST occur within a conda environment with (at least):
```
 channels:
   - defaults
   - ae5-admin
 dependencies:
   - python=3
   - ae5-tools
```
Configure AE5 Tools

Since the project needs to run under the user account created earlier we need to ensure we deploy to this user account. This can be accomplished by either authenticating as the service account itself, or by an AE5 administrator who impersonates the service account at deployment time. See ae5-tools for additional details.
Download The Latest Release

The latest releases can be found here.
[Optional] Slip Stream Customizations

The default works for most scenarios. However, the archive can be updated and repackaged for specific deployments if needed. This could be useful in scenarios where changes to dependency versions, client specific commands, or default variables must occur prior to deployment.
Upload Project

This can be accomplished using ae5 tools.

Example

ae5 project upload --name "mlflow.tracking.server" mlflow.tracking.server.x.y.z.tar.gz
[Optional] Disaster Recovery

If performing a disaster recover then at this point the data MUST be restored to the database and filesystem before moving to step 8.
Deploy MLFlow Tracking Server
1. The below variables MUST be defined before deploying the server.
Variable

MLFLOW_BACKEND_STORE_URI

MLFLOW_ARTIFACTS_DESTINATION
1. To deploy an instance of the tracking server we deploy the project with the below options:
- The TrackingServer command MUST be invoked.
- The endpoint SHOULD be private.
- A static name for the endpoint SHOULD be specified.
This can be accomplished using ae5 tools.

Example

ae5 project deploy --name "dev.mlflow.tracking.server" --endpoint "dev-mlflow-tracking-server" --command "TrackingServer" --private "dev.mlflow.tracking.server"
Get Static URL For Endpoint

The static endpoint of the deployment is available from the deployment output. It can also be found afterwards from the UI or by using the ae5 tools.

Example

ae5 deployment info "mlflow-tracking-server"
Store Static URL

The endpoint URLs will be needed within this service (ae5 user) account by the garbage collection process (and/or other consumers) and SHOULD be set as an ae5 secret.
- If it is not set as an ae5 secret then it must be set within the anaconda-project.yml file when slip streaming (See Step 5) or by the consumer manually.
Both the tracking server and model registry should be set to the same endpoint:

Variable

MLFLOW_TRACKING_URI

MLFLOW_REGISTRY_URI
Create Private Deployment Token

If deployed using the recommended configuration, then the token will be REQUIRED to access the server API. It can be found from the UI or by using the ae5 tools.

Example

ae5 deployment token "mlflow.tracking.server"
Store Private Deployment Token
- If using the stale artifact clean up process then this MUST be stored within an ae5 secret.
- The token will be REQUIRED by all clients and users which need to access the MLflow Tracking Service API.
- The administrator of the MLflow Tracking Server MUST generate and provide the access token EVERY TIME the server is restart.
Store the value:

Variable

MLFLOW_TRACKING_TOKEN
Grant Web Access to Authenticated Users
- Authenticated users must be granted access to the web front end. To accomplish this the deployment must be shared with everyone or another approved group.
- The process of sharing deployments can be reviewed in Sharing deployments.
Create Garbage Collection Schedule

What is garbage collection?

The MLflow Tracking Server does not automatically purge resources that a client deletes. Instead, resources are set to the deleted lifecycle state and hidden from the UI and most API calls by default. A deleted resource will block the creation of a new resource of the same name until the garbage collection process has purged it. In order to purge deleted items a garbage collection process MUST be manually executed.
- The project comes with a command for invoking garbage collection. A schedule for the process SHOULD be created so that this is occurring regularly.
- These environment variables MUST be defined as ae5 secrets, within the anaconda-project.yml, or passed to the ae5 job create command as variables (see below).
Variable

MLFLOW_BACKEND_STORE_URI

MLFLOW_TRACKING_GC_TTL

Examples

ae5 job create --command "GarbageCollection" --schedule "*/10 * * * *" --name "scheduled mlflow.tracking.server garbage collection" "mlflow.tracking.server"

ae5 job create --command "GarbageCollection" --schedule "*/10 * * * *" --name "scheduled mlflow.tracking.server garbage collection" "mlflow.tracking.server" --variable MLFLOW_TRACKING_GC_TTL="0d0h10m0s" --variable MLFLOW_BACKEND_STORE_URI="sqlite:///data/mlflow/dev/store/mydb.sqlite"

Anaconda Project Runtime Commands

These commands are used to start the server and perform the various administrative tasks.

Command	Environment	Description
TrackingServer	Runtime	Launches the MLflow Tracking Server
GarbageCollection	Runtime	Launches the MLflow tracking server garbage collection process
DatabaseUpgrade	Runtime	Launches the MLflow tracking server database upgrade process

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

installation_guide.md

installation_guide.md

Installation Guide

Overview

Configuration

Variables

Deployment

Anaconda Project Runtime Commands

Variable
MLFLOW_BACKEND_STORE_URI
MLFLOW_ARTIFACTS_DESTINATION

Variable
MLFLOW_TRACKING_URI
MLFLOW_REGISTRY_URI

Variable
MLFLOW_BACKEND_STORE_URI
MLFLOW_TRACKING_GC_TTL

Files

installation_guide.md

Latest commit

History

installation_guide.md

File metadata and controls

Installation Guide

Overview

Configuration

Variables

Deployment

Anaconda Project Runtime Commands