Examples documentation #124
`examples/cloud_vm/README.md` (new file, +57 lines)
# ☁️ Cloud Infrastructure Examples

Welcome to the **Cloud Infrastructure** examples! This section contains ready-to-use configurations to help you deploy scalable MLOps stacks on cloud platforms like AWS. Each example is designed to showcase a different tool or workflow, making it easier for you to find and deploy the infrastructure that fits your needs.

## 🚀 Getting Started

### Prerequisites
Before you begin, ensure you have the following:
- **Terraform**: Version `>= 1.8.0` installed on your system. [Install Terraform](https://learn.hashicorp.com/tutorials/terraform/install-cli)
- **AWS Account**: Make sure your AWS account is set up and you have appropriate IAM roles.
- **AWS CLI**: Ensure your AWS CLI is configured with the correct credentials. [Configure AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html)
- **Python**: Version `>= 3.7` to run the setup scripts.

### 📦 Installation
1. **Create a Virtual Environment**:
```bash
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
```
2. **Install `mlinfra`**:
```bash
pip install mlinfra
```

3. **Choose an Example**:
Navigate to the example you want to deploy:
```bash
cd examples/cloud_vm/dagster
```

4. **Configure AWS Credentials**:
Modify the `terraform.tfvars` file to update your AWS account details. Make sure your AWS credentials are correctly configured.

5. **Deploy**:
Use `mlinfra` to apply the configuration:
```bash
mlinfra terraform --action apply --stack-config-path ./config/aws-config.yaml
```
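For reference, the stack config passed to the deploy command might look roughly like the sketch below. The keys shown are illustrative assumptions based on the general shape of `mlinfra` stack configs, not a definitive schema; check the `mlinfra` documentation for your installed version.

```yaml
# Hypothetical stack config -- field names are assumptions, not the
# authoritative mlinfra schema.
name: aws-dagster-stack
provider:
  name: aws
  account-id: "123456789012"   # replace with your AWS account ID
  region: us-east-1
deployment:
  type: cloud_vm               # deploy onto EC2 VMs rather than Kubernetes
stack:
  - dagster
```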

## 📂 Examples Available

| Example Folder | Description |
| -------------- | ----------- |
| `dagster` | Deploy **Dagster** for orchestrating your ML pipelines on AWS |
| `mlflow` | Set up **MLflow** for experiment tracking and model management |
| `wandb` | Configure **Weights & Biases** for experiment tracking in the cloud |
| `lakefs` | Create a **LakeFS** setup for versioning your datasets in the cloud |
| `prefect` | Set up **Prefect** for orchestrating cloud-based ML workflows |

## 🌟 Why Use Cloud Infrastructure?
Deploying your MLOps stack on the cloud brings several advantages:
- **Scalability**: Easily scale resources up or down based on workload.
- **Cost Efficiency**: Pay for what you use, and reduce costs with managed services.
- **Flexibility**: Leverage powerful cloud services to enhance your ML workflows.

## 🆘 Need Help?
Feel free to open an issue or join our [Discord community](#) for discussions, troubleshooting, or contributing new ideas.
`examples/cloud_vm/dagster/README.md` (new file, +65 lines)
# 🚀 Dagster on AWS Cloud Setup

Welcome to the **Dagster AWS Cloud Setup** documentation! This guide will help you deploy Dagster on AWS using the provided configurations for a seamless data orchestration experience. Let's get started! 🛠️

## 📂 Configuration Files Overview

### 1. `aws-dagster.yaml`
This file provides a **basic setup** for deploying Dagster on AWS. It includes essential configurations to quickly get you up and running with Dagster. Use this file if you are just starting out or need a straightforward deployment.

**Features:**
- 🏗️ Basic infrastructure setup
- 🧩 Core components of Dagster
- 🔌 Integration with AWS services like S3 and CloudWatch

### 2. `aws-dagster-advanced.yaml`
This file offers a **more advanced setup** for deploying Dagster on AWS, with additional configurations for better scalability, security, and performance. Use this file if you need more control and features for your deployment.

**Features:**
- 📈 Enhanced scalability and performance configurations
- 🔒 Advanced security settings
- 🔄 Custom integrations and multi-environment support

## 📦 Prerequisites

Before deploying Dagster on AWS, make sure you have:
1. 🏷️ AWS Account with appropriate permissions.
2. 🏗️ [AWS CLI](https://aws.amazon.com/cli/) installed and configured.
3. 🧩 [Dagster](https://dagster.io/) installed on your local machine.
4. 🔧 Proper IAM roles and security groups set up.

## 🚀 Deployment Instructions

### Basic Setup (`aws-dagster.yaml`)
1. **Configure your AWS CLI:**
```bash
aws configure
```
2. **Deploy the infrastructure:**
```bash
mlinfra terraform --action apply --stack-config-path aws-dagster.yaml
```
3. 🎉 **You're all set!** Access Dagster via the provided endpoint.

### Advanced Setup (`aws-dagster-advanced.yaml`)
1. **Configure your AWS CLI:**
```bash
aws configure
```
2. **Deploy the infrastructure:**
```bash
mlinfra terraform --action apply --stack-config-path aws-dagster-advanced.yaml
```
3. **Customize the parameters if needed:**
Modify the `.yaml` file to suit your specific needs, including scaling options, security settings, and more.
4. 🎉 **Enjoy your advanced setup!**
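As a rough illustration of the kinds of knobs an advanced config typically exposes, a fragment might look like the following. Every key name below is hypothetical, invented for illustration only; it is not the actual schema of `aws-dagster-advanced.yaml`.

```yaml
# Hypothetical fragment -- key names are invented for illustration only.
dagster:
  instance_type: m5.xlarge       # larger VM for heavier orchestration workloads
  autoscaling:
    min_instances: 1
    max_instances: 4
  security:
    restrict_ingress_cidr: 10.0.0.0/16   # limit access to your VPC range
```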

## 📝 Notes
- Ensure the **VPC settings** and **subnets** are correctly configured to avoid deployment issues.
- For the advanced setup, you might need to adjust **IAM roles** and **network configurations**.

## 📚 Additional Resources
- 📖 [Dagster Documentation](https://docs.dagster.io/)
- 📖 [AWS CloudFormation Documentation](https://docs.aws.amazon.com/cloudformation/)

Feel free to customize the `.yaml` files to suit your specific requirements. 🛠️ Happy deploying!
`examples/cloud_vm/lakefs/README.md` (new file, +91 lines)
# 🌊 AWS LakeFS Configuration

This directory contains configuration files for deploying LakeFS on AWS. LakeFS is an open-source data version control system designed for object storage, enabling you to manage and version your data efficiently.

## 📁 Files Included

- **`aws-lakefs.yaml`**: Basic configuration for deploying LakeFS on AWS.
- **`aws-lakefs-advanced.yaml`**: Advanced configuration for deploying LakeFS with additional features and custom settings.

## 🚀 Getting Started

To use these configurations, ensure you have the following prerequisites:

- An active AWS account.
- AWS Command Line Interface (CLI) installed and configured.
- Necessary IAM permissions for creating resources.
- Python 3.x installed.

### 📦 Installation

To install the `platinfra` package, you can use pip. It's recommended to create a Python virtual environment first:

```bash
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
pip install platinfra
```

## 🌐 Deployment Instructions

### 1. Basic LakeFS Setup

The `aws-lakefs.yaml` file provides a straightforward setup for deploying LakeFS. This is suitable for initial testing and smaller workloads.

**To deploy:**

```bash
platinfra terraform --action apply --stack-config-path aws-lakefs.yaml
```

This command will create the necessary AWS resources and deploy LakeFS.

### 2. Advanced LakeFS Setup

The `aws-lakefs-advanced.yaml` file contains a more comprehensive configuration, including features like:

- Enhanced scaling options
- Custom resource requests and limits
- Integrations with additional AWS services

**To deploy:**

```bash
platinfra terraform --action apply --stack-config-path aws-lakefs-advanced.yaml
```

Feel free to modify the advanced configuration according to your specific use cases and resource requirements.

## 🔧 Customization Tips

- **IAM Roles**: Ensure that the necessary IAM roles and policies are in place to allow LakeFS to access required AWS resources, such as S3.
- **Environment Variables**: Adjust environment variables in the YAML files for configuring LakeFS settings.
- **Storage Configuration**: Make sure to configure persistent storage options if needed.
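To make the tips above concrete, a resource-tuning fragment might look like the sketch below. The `resources` key names are assumptions modeled on common container resource settings, not the actual `aws-lakefs-advanced.yaml` schema; `LAKEFS_BLOCKSTORE_TYPE` is a real LakeFS environment variable.

```yaml
# Hypothetical fragment -- the resources keys are assumptions, not the
# actual aws-lakefs-advanced.yaml schema.
lakefs:
  resources:
    requests:
      cpu: "500m"
      memory: 1Gi
    limits:
      cpu: "1"
      memory: 2Gi
  environment:
    LAKEFS_BLOCKSTORE_TYPE: s3   # back repositories with S3 object storage
```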

## 📦 Managing Your Deployment

- **Update Configuration**: To apply changes, use:
```bash
platinfra terraform --action apply --stack-config-path <your-updated-file>.yaml
```
- **Check Status**: The stack is provisioned through Terraform rather than CloudFormation, so monitor the created resources in the AWS Console or with the AWS CLI for the relevant services (EC2, S3, IAM).
- **Logs**: View logs for troubleshooting in the AWS Console.

## 🧹 Cleanup

To remove the LakeFS deployment, execute:

```bash
platinfra terraform --action destroy --stack-config-path aws-lakefs.yaml
platinfra terraform --action destroy --stack-config-path aws-lakefs-advanced.yaml
```

## 📞 Support & Resources

For additional support, consider the following resources:

- [LakeFS Documentation](https://docs.lakefs.io/)
- [AWS Documentation](https://aws.amazon.com/documentation/)
`examples/cloud_vm/mlflow/README.md` (new file, +70 lines)
# MLflow on AWS Deployment 🚀

This README provides guidance on deploying MLflow on AWS using the provided configuration files in the `mlflow` folder.

## Overview

MLflow is an open-source platform to manage the ML lifecycle, including experimentation, reproducibility, and deployment. This setup allows you to deploy MLflow on AWS easily.

## Requirements

Before you start, ensure you have the following:

- An AWS account with the necessary permissions to create and manage resources.
- Terraform installed on your system (version >= 1.4.0).

## Installation

1. **Create a Python Virtual Environment:**

```bash
python -m venv venv
source venv/bin/activate
```

2. **Install the Required Packages:**

```bash
pip install platinfra
```

## Deployment Configuration

You can choose between two deployment configurations based on your needs:

### 1. Basic MLflow Deployment

- **File:** `aws-mlflow.yaml`

This configuration is suitable for a simple MLflow deployment with essential features. To deploy:

```bash
platinfra terraform --action apply --stack-config-path <path-to-your-config>/aws-mlflow.yaml
```

### 2. Advanced MLflow Deployment

- **File:** `aws-mlflow-advanced.yaml`

This configuration includes additional features for a more robust MLflow deployment. To deploy:

```bash
platinfra terraform --action apply --stack-config-path <path-to-your-config>/aws-mlflow-advanced.yaml
```

## Configuration

- Update the `aws-mlflow.yaml` or `aws-mlflow-advanced.yaml` files with your AWS account details and any other custom configurations you require.
- Ensure your AWS credentials are properly configured on your machine.
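A minimal sketch of what the stack config might contain is shown below. The field names are illustrative assumptions (including the `artifact_storage` key), not the authoritative schema of `aws-mlflow.yaml`; consult the tool's documentation for the exact format.

```yaml
# Hypothetical stack config -- field names are illustrative assumptions.
name: aws-mlflow-stack
provider:
  name: aws
  account-id: "123456789012"   # replace with your AWS account ID
  region: us-east-1
deployment:
  type: cloud_vm
stack:
  - mlflow:
      artifact_storage: s3://my-mlflow-artifacts   # hypothetical key
```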

## Additional Information

For more details on using MLflow and its capabilities, refer to the [MLflow Documentation](https://www.mlflow.org/docs/latest/index.html).

## Contribution

Feel free to contribute to this project! If you have suggestions or improvements, open an issue or pull request.

## License

This project is licensed under the Apache-2.0 License.
`examples/cloud_vm/prefect/README.md` (new file, +57 lines)
# Deploying Prefect on AWS ☁️

This guide provides instructions for deploying Prefect on Amazon Web Services (AWS) using the provided YAML configuration files.

## Overview 📊

Prefect is a modern workflow orchestration tool designed to enable efficient data pipeline management. This deployment will help you run Prefect on AWS, providing scalability and reliability for your workflows.

## Requirements ✔️

Before you start, ensure you have the following:
- **Terraform**: Version **>= 1.4.0** installed on your machine.
- **AWS Account**: A valid AWS account with necessary permissions to create resources.

## Installation ⚙️

1. **Create a Python Virtual Environment:**
```bash
python -m venv venv
source venv/bin/activate
```

2. **Install the Required Python Package:**
```bash
pip install platinfra
```

## Deployment Configuration 📄

In the `prefect` folder, you have the following configuration files:

1. **aws-prefect.yaml**: Basic configuration for deploying Prefect on AWS.
2. **aws-prefect-advanced.yaml**: Advanced configuration with additional settings and optimizations.

### Configuration File Details

- **aws-prefect.yaml**: This file sets up a simple Prefect server instance on AWS with default settings.

- **aws-prefect-advanced.yaml**: This file includes advanced features, such as a larger instance type, additional storage options, and monitoring configurations for enhanced performance.
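To illustrate the kind of settings the advanced file adds over the basic one, a fragment might look like the sketch below. Every key is invented for illustration; this is not the actual `aws-prefect-advanced.yaml` schema.

```yaml
# Hypothetical fragment -- keys are invented to illustrate the kinds of
# settings the advanced file adds.
prefect:
  instance_type: t3.large      # larger VM than the basic default
  storage:
    volume_size_gb: 100        # additional attached storage
  monitoring:
    cloudwatch_alarms: true    # enable CloudWatch-based monitoring
```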

## Running the Deployment 🚀

To deploy Prefect on AWS, follow these steps:

1. **Choose and Edit a Configuration File:**
Pick either the basic or the advanced configuration file and update it with your AWS account details and desired settings.

2. **Deploy the Configuration:**
Run the following command to apply the configuration:
```bash
platinfra terraform --action apply --stack-config-path <path-to-your-config-file>
```
Replace `<path-to-your-config-file>` with the path to either `aws-prefect.yaml` or `aws-prefect-advanced.yaml`.

## Conclusion 🎉

Congratulations! You have successfully deployed Prefect on AWS. You can now start orchestrating your workflows in the cloud. For further configurations and integrations, refer to the Prefect documentation.