MLFlow Integration with Data Flow (#57)
* Commit

* Added the modules

* Create Readme.md

* Dataflow integration

* Update README.md

* Update mlflow.sh

Removing Private URIs

* Update image.sh

Removed Private URIs

* Update mysql.sh

Removed Private URIs

* Update Readme.md

Updated the command to start mlflow tracking server
nilayp2107 authored Jul 26, 2023
1 parent 8eef17f commit 4234e28
Showing 9 changed files with 253 additions and 0 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -41,6 +41,10 @@ For step-by-step instructions, see the README files included with each sample.

These samples show how to use the OCI Data Flow service and are meant to be deployed to and run from Oracle Cloud. You can optionally test these applications locally before you deploy them. When they are ready, you can deploy them to Data Flow without reconfiguring them, changing code, or applying deployment profiles. To test these applications locally, you need to install Apache Spark. For the prerequisites for running an application locally, see [Setup locally](https://docs.oracle.com/en-us/iaas/data-flow/data-flow-tutorial/develop-apps-locally/front.htm).

### MLFlow Tracking Server

To set up the MLFlow Tracking Server, see [dataflow-mlflow-integration](https://github.com/nilayp2107/oracle-dataflow-samples/dataflow-mlflow-integration).

## Install Spark

To install Spark, visit [spark.apache.org](https://spark.apache.org/docs/latest/api/python/getting_started/index.html)
41 changes: 41 additions & 0 deletions dataflow-mlflow-integration/Readme.md
@@ -0,0 +1,41 @@
## How to Set Up the MLFlow Tracking Server

Change to the `modules` directory:
`cd modules`

`modules/build.sh` is a Bash script that runs the setup commands for MySQL, Git, Docker, the firewall, and the container image.

## Usage

The script accepts several command-line arguments to execute specific tasks. Here are the available options:

- `-a` or `--all`: Executes commands related to MySQL, Git, Docker, Firewall, and Image operations.
- `-g` or `--git`: Executes commands related to Git operations.
- `-d` or `--docker`: Executes commands related to Docker operations.
- `-s` or `--mysql`: Executes commands related to MySQL operations.
- `-f` or `--firewall`: Executes commands related to Firewall operations.
- `-i` or `--image`: Executes commands related to Image operations.

You can pass several options in one invocation; the script runs the corresponding tasks one after another, in the order given.

## Example

To execute all the available commands, you can run the script with the `-a` or `--all` option:
`./build.sh -a`
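
Options can also be combined. The sketch below mirrors how an option loop of this kind maps flags to the scripts it sources. `plan_tasks` is a hypothetical helper, not part of `build.sh`; it only prints the script names instead of running them, so it is safe to try on any machine.

```bash
#!/bin/bash

# Hypothetical helper: print the scripts that would be sourced for a
# given set of flags, in order, without running any of them.
plan_tasks() {
  local tasks=()
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -a|--all)      tasks+=(mysql.sh git.sh docker.sh firewall.sh image.sh) ;;
      -g|--git)      tasks+=(git.sh) ;;
      -d|--docker)   tasks+=(docker.sh) ;;
      -s|--mysql)    tasks+=(mysql.sh) ;;
      -f|--firewall) tasks+=(firewall.sh) ;;
      -i|--image)    tasks+=(image.sh) ;;
      *)             echo "unknown option: $1" >&2 ;;
    esac
    shift # each flag takes no value, so shift once
  done
  echo "${tasks[*]}"
}

plan_tasks -g -d
```

Here `plan_tasks -g -d` prints `git.sh docker.sh`, which corresponds to running `./build.sh -g -d` to set up only Git and Docker.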

## MLFlow Tracking Server
The `mlflow.sh` script starts the MLFlow tracking server in a Docker container. It accepts the following command-line options:

- `-a` or `--mlflow-artifact-root`: Sets the `MLFLOW_DEFAULT_ARTIFACT_ROOT` variable to the provided value.
- `-s` or `--mlflow-artifacts-destination`: Sets the `MLFLOW_ARTIFACTS_DESTINATION` variable to the provided value.
- `-u` or `--mlflow-backend-store-uri`: Sets the `MLFLOW_BACKEND_STORE_URI` variable to the provided value.
- `-i` or `--docker-image`: Sets the `DOCKER_IMAGE` variable to the provided value.

You need to provide the corresponding values for each option.
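
The backend store URI passed with `-u` follows the pattern `mysql+mysqlconnector://{username}:{password}@{host}:{db_port}/{db_name}`. As a minimal sketch, a small helper can assemble it from its parts. `backend_store_uri` and the sample values are hypothetical and not part of `mlflow.sh`. A password containing characters such as `@` or `/` would need URL-encoding first, which this sketch does not attempt.

```bash
#!/bin/bash

# Hypothetical helper: assemble the MySQL backend store URI from its parts.
# Arguments: username password host port database
backend_store_uri() {
  if [[ $# -ne 5 ]]; then
    echo "usage: backend_store_uri USER PASSWORD HOST PORT DB_NAME" >&2
    return 1
  fi
  printf 'mysql+mysqlconnector://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Sample values for illustration only
uri="$(backend_store_uri mlflow_user example_password db.example.com 3306 mlflow)"
echo "$uri"
```

The resulting string can then be passed to the script as `-u "$uri"`; quoting it keeps the shell from interpreting any special characters.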

## Example

To set the MLflow and Docker variables, you can run the script with the appropriate options and values. For example:

```bash
./mlflow.sh \
  -a /path/to/artifact/root \
  -s /path/to/artifacts/destination \
  -u "mysql+mysqlconnector://{username}:{password}@{host}:{db_port}/{db_name}" \
  -i mydockerimage:latest
```
41 changes: 41 additions & 0 deletions dataflow-mlflow-integration/build.sh
@@ -0,0 +1,41 @@
#!/bin/bash

# Read command-line arguments and run the matching setup scripts in order
while [[ $# -gt 0 ]]; do
  key="$1"

  case $key in
    -a|--all)
      # Run every setup step
      source mysql.sh
      source git.sh
      source docker.sh
      source firewall.sh
      source image.sh
      shift # past argument (this flag takes no value)
      ;;
    -g|--git)
      source git.sh
      shift # past argument
      ;;
    -d|--docker)
      source docker.sh
      shift # past argument
      ;;
    -s|--mysql)
      source mysql.sh
      shift # past argument
      ;;
    -f|--firewall)
      source firewall.sh
      shift # past argument
      ;;
    -i|--image)
      source image.sh
      shift # past argument
      ;;
    *) # unknown option
      echo "Unknown option: $key" >&2
      shift # past argument
      ;;
  esac
done
29 changes: 29 additions & 0 deletions dataflow-mlflow-integration/docker.sh
@@ -0,0 +1,29 @@
#!/bin/bash

# Function to check if a package is installed
package_exists() {
  if rpm -q "$1" >/dev/null 2>&1; then
    return 0 # Package exists
  else
    return 1 # Package does not exist
  fi
}

# Install dependencies only if they are missing
if ! package_exists yum-utils; then
  sudo yum install -y yum-utils
fi

if ! package_exists docker-ce; then
  sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
  sudo yum install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
fi

# Start the Docker daemon only if it is not already running
if ! pgrep -x "dockerd" >/dev/null; then
  echo "Docker is not running. Starting Docker..."
  sudo dockerd >/dev/null 2>&1 &
  echo "Docker started."
else
  echo "Docker is already running. No changes needed."
fi
15 changes: 15 additions & 0 deletions dataflow-mlflow-integration/firewall.sh
@@ -0,0 +1,15 @@
#!/bin/bash

# Check if the port is already allowed
sudo firewall-cmd --zone=public --query-port=5000/tcp
port_status=$?

# Add the port if it is not already allowed
if [[ $port_status -ne 0 ]]; then
  echo "Port 5000 is not allowed. Adding the rule..."
  sudo firewall-cmd --zone=public --permanent --add-port=5000/tcp
  sudo firewall-cmd --reload
  echo "Firewall rules updated."
else
  echo "Port 5000 is already allowed. No changes needed."
fi
30 changes: 30 additions & 0 deletions dataflow-mlflow-integration/git.sh
@@ -0,0 +1,30 @@
#!/bin/bash

# Function to check if a package is installed
package_exists() {
  if rpm -q "$1" >/dev/null 2>&1; then
    return 0 # Package exists
  else
    return 1 # Package does not exist
  fi
}

# Install dependencies
if ! package_exists git; then
  sudo yum install -y git
fi

# Function to check if a directory exists
directory_exists() {
  if [ -d "$1" ]; then
    return 0 # Directory exists
  else
    return 1 # Directory does not exist
  fi
}

# Clone the repository if it is not already cloned
if ! directory_exists oci-mlflow; then
  git clone https://github.com/oracle/oci-mlflow.git
fi

16 changes: 16 additions & 0 deletions dataflow-mlflow-integration/image.sh
@@ -0,0 +1,16 @@
#!/bin/bash

# Read inputs
# Read region input
read -r -p "Enter region: " region

# Read tenancy input
read -r -p "Enter tenancy [my-tenancy]: " tenancy

# Read tag input
read -r -p "Enter tag [my-tag]: " tag

# Build inside a subshell so the caller's working directory is unchanged;
# the build is skipped if the oci-mlflow checkout is missing
(
  cd oci-mlflow &&
  sudo docker build -t "$region.ocir.io/$tenancy/oci-mlflow:$tag" --network host -f container-image/Dockerfile .
)
47 changes: 47 additions & 0 deletions dataflow-mlflow-integration/mlflow.sh
@@ -0,0 +1,47 @@
#!/bin/bash

# Read command-line arguments
while [[ $# -gt 0 ]]; do
  key="$1"

  case $key in
    -a|--mlflow-artifact-root)
      MLFLOW_DEFAULT_ARTIFACT_ROOT="$2"
      shift # past argument
      shift # past value
      ;;
    -s|--mlflow-artifacts-destination)
      MLFLOW_ARTIFACTS_DESTINATION="$2"
      shift # past argument
      shift # past value
      ;;
    -u|--mlflow-backend-store-uri)
      MLFLOW_BACKEND_STORE_URI="$2"
      shift # past argument
      shift # past value
      ;;
    -i|--docker-image)
      DOCKER_IMAGE="$2"
      shift # past argument
      shift # past value
      ;;
    *) # unknown option
      shift # past argument
      ;;
  esac
done

# Run the docker command
sudo docker run --rm \
  --name oci-mlflow \
  --network host \
  -e MLFLOW_HOST=0.0.0.0 \
  -e MLFLOW_GUNICORN_OPTS='--log-level debug' \
  -e MLFLOW_PORT=5000 \
  -e MLFLOW_DEFAULT_ARTIFACT_ROOT="$MLFLOW_DEFAULT_ARTIFACT_ROOT" \
  -e MLFLOW_ARTIFACTS_DESTINATION="$MLFLOW_ARTIFACTS_DESTINATION" \
  -e BACKEND_PROVIDER=mysql \
  -e MLFLOW_BACKEND_STORE_URI="$MLFLOW_BACKEND_STORE_URI" \
  -e MLFLOW_SERVE_ARTIFACTS=1 \
  -e OCIFS_IAM_TYPE=instance_principal \
  "$DOCKER_IMAGE"
30 changes: 30 additions & 0 deletions dataflow-mlflow-integration/mysql.sh
@@ -0,0 +1,30 @@
#!/bin/bash

# Function to check if a package is installed
package_exists() {
  if rpm -q "$1" >/dev/null 2>&1; then
    return 0 # Package exists
  else
    return 1 # Package does not exist
  fi
}

if ! package_exists mysql-shell; then
  sudo yum install -y mysql-shell
fi

# Prompt for the connection details
echo "Enter the username of the Database:"
read -r username
echo "Enter the Hostname:"
read -r hostname
echo "Enter the Password:"
read -r -s password # -s keeps the password from being echoed
echo "Enter the Database name:"
read -r dbname

# Create the database; IF NOT EXISTS keeps reruns from failing
mysqlsh "$username@$hostname" --password="$password" --sql -e "CREATE DATABASE IF NOT EXISTS $dbname;"
