Fix sns barplot warning #1

Open
wants to merge 177 commits into
base: main
177 commits
55e80f2
Fix sns barplot warning
ddl-dave-heinicke Sep 22, 2022
d39dbe3
Update README.md
BryanDomino May 19, 2023
86f3857
Merge pull request #1 from BryanDomino/patch-1
BryanDomino May 22, 2023
338dead
added MLFlow tracking to the experiment scripts
BryanDomino May 22, 2023
eaee3e2
updated to remove DMM API Key and fix API call indenting
BryanDomino May 22, 2023
71ebd5a
updated experiment name to be projectname+username
BryanDomino May 23, 2023
b471e5d
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 24, 2023
0b2b8e8
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 24, 2023
0673a22
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 24, 2023
55cdbd1
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 24, 2023
8116e9f
added MLFlow experiments
BryanDomino May 24, 2023
4242b7a
updated image
BryanDomino May 25, 2023
608f0ef
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 25, 2023
acfeea1
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 25, 2023
95844dc
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 25, 2023
b23e684
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 25, 2023
31af91e
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 25, 2023
0030d29
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 25, 2023
86ee75d
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 25, 2023
57e61cd
Pushed from Domino: https://rev4mlops.domino-eval.com/workspace/domin…
BryanDomino May 25, 2023
3805f6a
Update README.md
dominopetter Sep 24, 2023
eb95d7a
Update README.md
dominopetter Sep 24, 2023
a9c4e2a
Fixed Workshop Environment name
dominopetter Sep 25, 2023
8464964
Fixed the TypeError: corr() got an unexpected keyword argument 'numer…
dominopetter Sep 25, 2023
8bdec27
Fixed numeric only issue
dominopetter Sep 25, 2023
d5a9b6d
Fixed code for 5.7 Compute Environment
dominopetter Sep 25, 2023
aedb6b9
Update README.md
dominopetter Sep 27, 2023
633e633
Update README.md
dominopetter Sep 27, 2023
2a40cf0
Update README.md
dominopetter Sep 27, 2023
ea35e8f
updated screenshots
dominopetter Nov 22, 2023
9beed57
updated readme
Nov 22, 2023
68b8385
updated readme
Nov 22, 2023
14b58f2
Merge pull request #2 from dominopetter/main
dominopetter Nov 28, 2023
59f40bc
Create MLOps-Best-Practices-Workshop-Admin-Guide.md
dominopetter Feb 5, 2024
c6b43bf
Pushed from Domino: https://ki.domino-eval.com/workspace/bryandomino/…
BryanDomino Feb 28, 2024
f895bec
Pushed from Domino: https://ki.domino-eval.com/workspace/bryandomino/…
BryanDomino Feb 28, 2024
1a7c092
Update README.md
dominopetter Mar 3, 2024
b4a5ba8
Add files via upload
dominopetter Mar 3, 2024
bd615ab
Update README.md
dominopetter Mar 3, 2024
4091d6c
Update README.md
BryanDomino Mar 4, 2024
2d91a63
Finished Lab 2.2 - Exploring Workspaces
Mar 12, 2024
edb22c8
Updated the dataset path in the scripts directory
Mar 12, 2024
4b7d14c
Updated more paths going from DFS to GIT
Mar 12, 2024
80dce91
reg model test
Mar 12, 2024
a4346e0
fixed sklearn
Mar 13, 2024
ba3bac7
fixed sklearn
Mar 13, 2024
5855996
auto register
Mar 13, 2024
2fa6afe
added run.py
Mar 13, 2024
643ea5b
removed run.py
Mar 13, 2024
e4f1f7d
Fixed pickle file path
Mar 13, 2024
24babb3
Adding app code
Mar 13, 2024
f414047
added pharma app
Mar 13, 2024
c63db2b
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Mar 13, 2024
584c9c6
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Mar 19, 2024
eca9b96
This is a command line commit
Mar 19, 2024
8c5b9e1
Updated app to Wine app
Mar 21, 2024
f94c711
Completed EDA notebook
Mar 26, 2024
a14d1e5
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Apr 12, 2024
f9674b9
fixed sklearn mse error
Apr 30, 2024
cc8371c
fixed sklearn ls error
Apr 30, 2024
8d38f69
fixed wine app
Apr 30, 2024
450d1e8
fixed notebook
May 1, 2024
64abb2d
Pushed from Domino: https://demo.eval.domino.tech/workspace/petter/ML…
May 1, 2024
41cf7b5
Pushed from Domino: https://demo.eval.domino.tech/workspace/petter/ML…
May 1, 2024
eb994e7
Pushed from Domino: https://demo.eval.domino.tech/workspace/petter/ML…
May 1, 2024
3511c33
Pushed from Domino: https://demo.eval.domino.tech/workspace/petter/ML…
May 9, 2024
59d630f
Pushed from Domino: https://demo.eval.domino.tech/workspace/petter/ML…
May 9, 2024
63cc43a
Pushed from Domino: https://demo.eval.domino.tech/workspace/petter/ML…
May 9, 2024
e6344b3
Update Workshop-Walkthrough.md
dominopetter May 31, 2024
f509fe9
Update Workshop-Walkthrough.md
dominopetter May 31, 2024
14aa745
Update Readme.md
dominopetter May 31, 2024
cea177d
Pushed from Domino: https://demo.eval.domino.tech/workspace/petter/Wi…
Jun 7, 2024
fac3292
Pushed from Domino: https://demo.eval.domino.tech/workspace/petter/Wi…
Jun 7, 2024
a337f48
Pushed from Domino: https://demo.eval.domino.tech/workspace/petter/Wi…
Jun 7, 2024
a4ed607
fixed the Shiny App
Jun 7, 2024
41cf1c6
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 16, 2024
534f74f
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 16, 2024
f8c1876
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 16, 2024
6b17c3c
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 16, 2024
3adbbce
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 16, 2024
dffc32b
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 16, 2024
0d876a1
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 16, 2024
cccc02d
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 17, 2024
51108dc
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 23, 2024
262f4f5
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 23, 2024
7895806
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 30, 2024
f9b232c
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 30, 2024
a35cccd
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 30, 2024
2d50aef
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 30, 2024
b1801f0
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 30, 2024
1461a48
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 30, 2024
6c5e271
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 31, 2024
5e9cf5d
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 31, 2024
7c37769
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 31, 2024
d8f74b5
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Jul 31, 2024
af032d0
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Aug 6, 2024
c08f1ec
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Aug 6, 2024
f580038
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Aug 6, 2024
fa48fa9
Completed EDA notebook
Aug 12, 2024
bed9a8e
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Aug 12, 2024
e53bb29
Pushed from Domino: https://se-demo.domino.tech/workspace/petter/MLOp…
Aug 12, 2024
e400167
Create main.yml
dominopetter Aug 31, 2024
b3b1828
Update main.yml
dominopetter Aug 31, 2024
49466be
Added github action
Aug 31, 2024
f9e024a
Modified EDA
Aug 31, 2024
171a27e
Fixed EDA Notebook
Aug 31, 2024
437a426
Fixed PATH in scripts
Aug 31, 2024
6b5d888
Fixed MLFLOW
Aug 31, 2024
6d6c4c6
Updated ENV VAR
Aug 31, 2024
c709c68
Updated APP ENV VARS
Aug 31, 2024
14ecfca
Update main.yml with new envId
dominopetter Aug 31, 2024
e7ed51b
Fixed SHINY APP
Aug 31, 2024
abf5a22
Updated APP
Sep 1, 2024
f56e1c5
Updated ReadMe
Sep 1, 2024
ecd9da8
Demo commit
Sep 2, 2024
c69b80c
Ran EDA_Code Notebook
Sep 3, 2024
4125325
Fixed R
Sep 3, 2024
27a5ab4
Fixed h2o
Sep 3, 2024
aa4f8dc
Cleaned up notebook
Sep 7, 2024
e3efe19
Added Wine values in the App instead of Lead gen
Sep 9, 2024
1718193
Update Readme.md
dominopetter Sep 10, 2024
35e2cbb
Update shiny_app_wine.R for defense
dominopetter Sep 10, 2024
314a0c5
Update Readme.md
dominopetter Sep 10, 2024
319d3de
Fixed app label text
Sep 10, 2024
11b3343
Add files via upload
dominopetter Sep 10, 2024
1a89755
Pushed from Domino: https://danisawesome.domino-eval.com/workspace/pe…
Sep 10, 2024
0b5b2dc
Pushed from Domino: https://danisawesome.domino-eval.com/workspace/pe…
Sep 10, 2024
053c019
Pushed from Domino: https://danisawesome.domino-eval.com/workspace/pe…
Sep 10, 2024
643e783
Pushed from Domino: https://danisawesome.domino-eval.com/workspace/pe…
Sep 10, 2024
f95d719
Pushed from Domino: https://danisawesome.domino-eval.com/workspace/pe…
Sep 10, 2024
77ad507
Pushed from Domino: https://danisawesome.domino-eval.com/workspace/pe…
Sep 10, 2024
fdc38ec
Pushed from Domino: https://danisawesome.domino-eval.com/workspace/pe…
Sep 10, 2024
362fc7c
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 12, 2024
b297d82
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 12, 2024
ddbe490
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 13, 2024
2127743
Cleaned up Jupyter notebook
Sep 18, 2024
5d048a5
Added Admin Readme
Sep 18, 2024
4a3e31d
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 18, 2024
97ce6d4
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 18, 2024
e1107df
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 18, 2024
a7f33e4
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 18, 2024
fc84a6e
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 18, 2024
a7a9f6f
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 18, 2024
9429743
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 18, 2024
4a7c857
Pushed from Domino: https://presentation.domino-eval.com/workspace/pe…
Sep 18, 2024
804828c
Merge pull request #3 from dominopetter/main
dominopetter Oct 17, 2024
e9d724b
Pushed from Domino: https://disneyob.domino-eval.com/workspace/petter…
Oct 25, 2024
9e45221
Fixed MLFlow, scipy and scikit-learn incompatibility
Oct 27, 2024
b0d20e8
Merge pull request #4 from dominopetter/main
dominopetter Oct 27, 2024
7df709b
Pushed from Domino: https://disneyob.domino-eval.com/workspace/petter…
Oct 27, 2024
3cc86f2
Merge pull request #5 from dominopetter/main
dominopetter Oct 27, 2024
93c340c
Added ETL Flow
Nov 1, 2024
0b2ceeb
Modified ETL Flow
Nov 1, 2024
9ae0f28
Merge pull request #6 from dominopetter/main
dominopetter Nov 13, 2024
bcc361e
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 13, 2024
96b77ac
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 13, 2024
b205bd6
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 13, 2024
20b6da1
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 13, 2024
985086a
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 14, 2024
e906daa
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 14, 2024
9ce6a3f
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 14, 2024
20b01d2
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 15, 2024
7bb0788
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 15, 2024
92234ab
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
Nov 15, 2024
b9ae767
Completed EDA Notebook
Nov 17, 2024
b704e2f
Fixed dataset location in a template based workshop
Nov 17, 2024
5767504
Updated dataset path in the /scripts dir as it is now static
Nov 17, 2024
e0ac030
fixed for credit approval workshop
Nov 18, 2024
90a2d1c
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
BryanDomino Nov 18, 2024
5ff76b1
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
BryanDomino Nov 18, 2024
2441d72
Pushed from Domino: https://ykb.domino-eval.com/workspace/petter/mlop…
BryanDomino Nov 18, 2024
82a0ff8
Update streamlit_app.py
BryanDomino Nov 18, 2024
b24ec04
Delete .ipynb_checkpoints directory
ddl-danmennell Dec 16, 2024
48be8ce
Delete .Trash-12574 directory
ddl-danmennell Dec 16, 2024
960ea20
Update Readme.md
ddl-danmennell Dec 16, 2024
3003338
Update Readme.md
ddl-danmennell Dec 16, 2024
17f7d24
Update Readme.md
ddl-danmennell Dec 16, 2024
38 changes: 38 additions & 0 deletions .github/workflows/main.yml
@@ -0,0 +1,38 @@
name: Trigger Job on Push

# Trigger the workflow on pushes to main. DOMINO_API_KEY is an assumed repository secret name.
on:
  push:
    branches:
      - main  # or specify another branch if needed

jobs:
  trigger_job:
    runs-on: ubuntu-latest

    steps:
      - name: STOP APPLICATION
        run: |
          response=$(curl -s -o /dev/null -w "%{http_code}" -X POST "https://se-demo.domino.tech/v4/modelProducts/66d2eebc90d1992080a4d2de/stop?force=false" \
            -H "accept: application/json" \
            -d "" \
            -H "X-Domino-Api-Key: ${{ secrets.DOMINO_API_KEY }}")
          if [ "$response" -ne 200 ]; then
            echo "Failed to stop the application. HTTP Status: $response"
            exit 1
          fi
        shell: bash
        continue-on-error: true

      - name: START APPLICATION
        run: |
          response=$(curl -s -o /dev/null -w "%{http_code}" -X POST "https://se-demo.domino.tech/v4/modelProducts/66d2eebc90d1992080a4d2de/start" \
            -H "accept: application/json" \
            -H "Content-Type: application/json" \
            -d "{\"environmentId\":\"65f029e66fb33c0d9f186974\",\"hardwareTierId\":\"small-k8s\",\"externalVolumeMountIds\":[]}" \
            -H "X-Domino-Api-Key: ${{ secrets.DOMINO_API_KEY }}")
          if [ "$response" -ne 200 ]; then
            echo "Failed to start the application. HTTP Status: $response"
            exit 1
          fi
        shell: bash
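
The two workflow steps above send POST requests to the application's stop and start endpoints. As a rough illustration, the same requests can be sketched in Python. This is a minimal sketch, assuming only the endpoint paths and payload fields visible in the workflow; the function names, the `DOMINO_API_KEY` environment variable, the host `domino.example.com`, and the placeholder IDs are hypothetical. The requests are constructed but deliberately not sent.

```python
import json
import os
import urllib.request


def build_stop_request(base_url, app_id, api_key):
    """Build (but do not send) the POST that stops the app, mirroring the first step."""
    return urllib.request.Request(
        f"{base_url}/v4/modelProducts/{app_id}/stop?force=false",
        data=b"",
        method="POST",
        headers={"accept": "application/json", "X-Domino-Api-Key": api_key},
    )


def build_start_request(base_url, app_id, api_key, environment_id, hardware_tier_id):
    """Build (but do not send) the POST that starts the app, mirroring the second step."""
    payload = {
        "environmentId": environment_id,
        "hardwareTierId": hardware_tier_id,
        "externalVolumeMountIds": [],
    }
    return urllib.request.Request(
        f"{base_url}/v4/modelProducts/{app_id}/start",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
            "X-Domino-Api-Key": api_key,
        },
    )


# Read the key from the environment instead of hard-coding it.
# The variable name and the fallback value are assumptions for this sketch.
api_key = os.environ.get("DOMINO_API_KEY", "example-key")
base_url = "https://domino.example.com"  # hypothetical host

stop_req = build_stop_request(base_url, "APP_ID", api_key)
start_req = build_start_request(base_url, "APP_ID", api_key, "ENV_ID", "small-k8s")

print(stop_req.get_method(), stop_req.full_url)
print(start_req.get_method(), start_req.full_url)
# Sending would be urllib.request.urlopen(stop_req), then urllib.request.urlopen(start_req).
```

Keeping the key in an environment variable or a repository secret means it never appears in the committed workflow file.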
261 changes: 261 additions & 0 deletions EDA_code.ipynb
@@ -0,0 +1,261 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e488cd55-9305-4d28-ab18-d38621bf2c0d",
"metadata": {},
"source": [
"## Data Source Access in JupyterLab\n",
"\n",
"This code snippet demonstrates how to interact with a data source using the `domino.data_sources.DataSourceClient` from the Domino Data Lab environment. Specifically, it performs the following operations:\n",
"\n",
"1. **Initialization**: Instantiates a `DataSourceClient` object to interact with the available data sources.\n",
"2. **Data Source Fetching**: Retrieves a specific data source instance named \"mlops-best-practices\".\n",
"3. **Object Listing**: Lists all objects available in the \"mlops-best-practices\" data source.\n",
"\n",
"The commented sections of the code provide examples of additional operations:\n",
"- **Binary Content Retrieval**: Shows how to fetch the binary content of a specified object.\n",
"- **File Download**: Illustrates downloading the content of a specified object to a local file.\n",
"- **File Object Download**: Demonstrates downloading content directly into a Python `io.BytesIO()` file object for further manipulation within the notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e2745b72-251d-443e-89b9-3575823295e0",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from domino.data_sources import DataSourceClient\n",
"\n",
"# instantiate a client and fetch the datasource instance\n",
"object_store = DataSourceClient().get_datasource(\"mlops-best-practices\")\n",
"\n",
"# list objects available in the datasource\n",
"objects = object_store.list_objects()\n",
"\n",
"## get content as binary\n",
"# content = object_store.get(\"key\")\n",
"\n",
"## download content to file\n",
"# object_store.download_file(\"key\", \"./path/to/local/file\")\n",
"\n",
"## Download content to file object\n",
"# f = io.BytesIO()\n",
"# object_store.download_fileobj(\"key\", f)"
]
},
{
"cell_type": "markdown",
"id": "738c4a7a-e575-46c3-95ab-e25b45369922",
"metadata": {},
"source": [
"## Data Loading and Display in JupyterLab\n",
"\n",
"This code snippet demonstrates loading and displaying data within a JupyterLab environment, using the `pandas` library to handle CSV data. The operations performed are as follows:\n",
"\n",
"1. **Data Retrieval**: Retrieves the binary content of the \"credit_card_default.csv\" file from the data source and decodes it as a UTF-8 string.\n",
"2. **String to Data Stream**: Wraps the string in a `StringIO` object so that pandas can read it like a file.\n",
"3. **Data Frame Creation**: Loads only the columns listed in `columns_to_load` into a pandas DataFrame and renames `default payment next month` to `DEFAULT`.\n",
"4. **Display Data**: Displays the first five rows of the DataFrame with `df.head()` to provide a snapshot of the dataset.\n",
"\n",
"This snippet is particularly useful for quickly visualizing the structure and a portion of the data directly from a data source managed by Domino's `DataSourceClient`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e9c04968-a681-4715-88ce-b9b68cc927e3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from io import StringIO\n",
"import pandas as pd\n",
"\n",
"s = str(object_store.get(\"credit_card_default.csv\"), 'utf-8')\n",
"data = StringIO(s)\n",
"\n",
"# Load only the specified columns\n",
"columns_to_load = ['ID', 'PAY_0', 'PAY_2', 'PAY_4', 'LIMIT_BAL', 'PAY_3', 'BILL_AMT1', 'default payment next month']\n",
"df = pd.read_csv(data, usecols=columns_to_load)\n",
"\n",
"# Rename the column after loading the data\n",
"df.rename({'default payment next month': 'DEFAULT'}, axis=1, inplace=True)\n",
"\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"id": "5832f796-419b-4a8e-87a2-66cf212a276c",
"metadata": {},
"source": [
"## Visualizing Data Correlations in JupyterLab\n",
"\n",
"This code snippet uses the Python libraries `seaborn` and `matplotlib` to visualize correlations between the numeric features of the dataset within a JupyterLab environment. The operations performed include:\n",
"\n",
"1. **Imports**: Imports `seaborn` and `matplotlib.pyplot`.\n",
"2. **Figure Setup**: Sets up a figure with a specified size (10x10 inches) using `matplotlib`.\n",
"3. **Heatmap Generation**: Generates a heatmap of the correlation matrix of numeric-only columns in `df` using `seaborn`. Correlation values are annotated and formatted to one significant digit (`fmt='.1g'`).\n",
"\n",
"This visualization is helpful for identifying relationships between different numeric features, especially in contexts like feature selection or initial data analysis."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fb7bbd22-532e-4c71-b3c6-c3b0ad5ecd4d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"fig = plt.figure(figsize=(10,10))\n",
"sns.heatmap(df.corr(numeric_only=True), annot = True, fmt='.1g')"
]
},
{
"cell_type": "markdown",
"id": "c43a0bd9-b79c-4b47-8db2-826421a4990f",
"metadata": {},
"source": [
"## Feature Importance Visualization\n",
"\n",
"This code snippet demonstrates how to identify and visualize important features related to the `DEFAULT` variable of the dataset using the Python libraries `seaborn` and `matplotlib`. The snippet performs the following steps:\n",
"\n",
"1. **Correlation Calculation**: Computes the Pearson correlation coefficients between all numeric features and the `DEFAULT` column of the DataFrame `df`.\n",
"2. **Sorting and Filtering**: Sorts these coefficients by value and selects features with an absolute correlation greater than 0.08, treating these as important features. The `DEFAULT` column itself is not filtered out, so its self-correlation of 1 is included.\n",
"3. **Visualization Setup**: Sets the theme for the plot using `seaborn` and initializes a figure with a size of 16x5 inches.\n",
"4. **Bar Plot Creation**: Creates a bar plot of the Pearson correlation values of the selected features. The plot has a title and a y-axis label, and each bar is then recolored from the 'seismic_r' palette to differentiate the values.\n",
"\n",
"This approach is useful for quickly identifying which features correlate with the target variable `DEFAULT`, aiding in feature selection and preliminary data analysis."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "17a5dd2e-e372-4f24-a9dd-73e32754a0e3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Calculate the correlation and sort by 'DEFAULT'\n",
"corr_values = df.corr(numeric_only=True).sort_values(by='DEFAULT')['DEFAULT']\n",
"important_feats = corr_values[abs(corr_values) > 0.08]\n",
"print(important_feats)\n",
"\n",
"# Set the theme\n",
"sns.set_theme(style=\"darkgrid\")\n",
"\n",
"# Prepare the figure\n",
"plt.figure(figsize=(16, 5))\n",
"plt.title('Feature Importance for Credit Scoring')\n",
"plt.ylabel('Pearson Correlation')\n",
"\n",
"# Create a barplot without a palette argument, using a default color temporarily\n",
"ax = sns.barplot(x=important_feats.keys(), y=important_feats.values, color='gray')\n",
"\n",
"# Get colors from the 'seismic_r' palette based on the number of entries\n",
"palette = sns.color_palette(\"seismic_r\", len(important_feats))\n",
"\n",
"# Set the colors for each bar individually\n",
"for bar, color in zip(ax.patches, palette):\n",
"    bar.set_color(color)\n",
"\n",
"# Show the plot\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "23e96d14-249c-4010-a346-7609a91be0c7",
"metadata": {},
"source": [
"## Histogram Visualization of Important Features\n",
"\n",
"This code snippet visualizes the distribution of the features identified as correlated with `DEFAULT`, along with the distribution of `DEFAULT` itself, using the Python libraries `seaborn` and `matplotlib`. The snippet executes the following steps:\n",
"\n",
"1. **Loop Through Features**: Iterates over the index labels of the `important_feats` Series (features whose absolute correlation with `DEFAULT` exceeds 0.08) and appends the `DEFAULT` column.\n",
"2. **Histogram Plotting**:\n",
"    - For each feature in the loop, it initializes a new figure with a predefined size (8x5 inches).\n",
"    - Sets a title specific to the feature being plotted.\n",
"    - Uses `seaborn.histplot` to create a histogram with a kernel density estimate (KDE) overlay for each feature, after dropping missing values. This helps in visualizing the distribution and density of the data points.\n",
"\n",
"This method provides a detailed look at the distribution characteristics of each key feature, assisting in understanding the variability and distribution trends within the data set."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fc88b823-03c7-413b-b194-4b60b95f4258",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"for i in list(important_feats.keys()) + ['DEFAULT']:\n",
"    plt.figure(figsize=(8, 5))\n",
"    plt.title(f'Histogram of {i}')\n",
"    sns.histplot(df[i].dropna(), kde=True)"
]
},
{
"cell_type": "markdown",
"id": "e84b43d0-5fa3-410d-b919-3250b675a97c",
"metadata": {},
"source": [
"## Saving DataFrame to CSV in a Project-Specific Path\n",
"\n",
"This code snippet demonstrates how to save a pandas DataFrame to a CSV file in a project-specific directory within the JupyterLab environment. The snippet carries out the following operations:\n",
"\n",
"1. **Path Construction**: Uses the fixed path `/mnt/data/mlops-best-practices/credit_card_default.csv`. The path is hard-coded rather than built from the `DOMINO_PROJECT_NAME` environment variable, so it assumes the dataset directory is named `mlops-best-practices`.\n",
"2. **Save DataFrame**: Utilizes the `to_csv` method of the pandas DataFrame `df` to write the DataFrame to the constructed path without including the index column in the output file.\n",
"\n",
"This approach ensures that the output CSV file is easily accessible within the specific context of the current Domino project, promoting better organization and data management practices."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f729616-c760-4d59-94bd-a83c35645d88",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import os\n",
"path = str('/mnt/data/mlops-best-practices/credit_card_default.csv')\n",
"df.to_csv(path, index = False)"
]
}
],
"metadata": {
"dca-init": "true",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
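
The notebook's feature-selection cell keeps features whose Pearson correlation with the target exceeds 0.08 in absolute value. The sketch below illustrates that thresholding idea without pandas, so the arithmetic is visible. It is an illustration under stated assumptions, not the notebook's code: the helper names and the toy values are made up, and unlike the notebook cell it excludes the target column from the result.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def important_features(columns, target, threshold=0.08):
    """Return {feature: r} for features with |r| > threshold, sorted by r ascending."""
    scores = {
        name: pearson(values, columns[target])
        for name, values in columns.items()
        if name != target  # skip the target's trivial self-correlation of 1.0
    }
    kept = {name: r for name, r in scores.items() if abs(r) > threshold}
    return dict(sorted(kept.items(), key=lambda item: item[1]))


# Made-up toy values for illustration; not taken from credit_card_default.csv.
toy = {
    "PAY_0": [0, 1, 2, 0, 3, 0],
    "LIMIT_BAL": [50, 20, 10, 80, 10, 60],
    "NOISE": [1, 1, 2, 2, 3, 3],
    "DEFAULT": [0, 1, 1, 0, 1, 0],
}

selected = important_features(toy, "DEFAULT")
for name, r in selected.items():
    print(f"{name}: {r:+.3f}")
# NOISE is uncorrelated with DEFAULT in this toy data, so it is dropped.
```

Sorting by the signed coefficient, as the notebook does, places negatively correlated features first and positively correlated ones last, which suits a diverging palette such as `seismic_r`.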
15 changes: 15 additions & 0 deletions Readme.md
@@ -0,0 +1,15 @@
# Domino Hands-On Workshop: Predictions

#### In this workshop, you will work through an end-to-end workflow, broken into several labs, to:

* Read in data from a live datasource
* Prepare your data in an IDE of your choice, with an option to leverage distributed computing clusters
* Train several models in various frameworks
* Compare model performance across different frameworks and select the best-performing model
* Deploy the model to a containerized endpoint and a web-app front end for consumption
* Leverage collaboration and documentation capabilities to make all work reproducible and shareable

For version 6.x, the full workshop walkthrough is here: [VERSION 6.x WORKSHOP LINK](https://docs.google.com/document/u/4/d/11eA3ney10KzX7GF9G7f5n72f4p7k7CHSpLxoUfbAGE8/pub)

For version 5.x, the full workshop walkthrough is here: [VERSION 5.x WORKSHOP LINK](https://docs.google.com/document/d/e/2PACX-1vS9LKbBYYOrsDmshmKvEIUkDMYVMAivoodg1CTEgjZRPW_IJFV2Un4l5uaE2jI1BsbN3-tQ8IMSkGoL/pub)

12 changes: 12 additions & 0 deletions admin/.ipynb_checkpoints/Admin-Guide-checkpoint.md
@@ -0,0 +1,12 @@
# Dockerfile Instructions
## Additional Dockerfile Instructions
```
RUN pip install --upgrade pandas seaborn
RUN pip install h2o
```

# Goals
Add one *Goal* called *Explore Data* and set it to *Data Acquisition and Exploration*

# Apps
In the apps directory you will find three apps: wine, defense, and lead_gen. Simply change the app.sh file to point to the most appropriate one, and the R Shiny app will reflect that industry.
12 changes: 12 additions & 0 deletions admin/Admin-Guide.md
@@ -0,0 +1,12 @@
# Dockerfile Instructions
## Additional Dockerfile Instructions
```
RUN pip install --upgrade pandas seaborn
RUN pip install h2o
```

# Goals
Add one *Goal* called *Explore Data* and set it to *Data Acquisition and Exploration*

# Apps
In the apps directory you will find three apps: wine, defense, and lead_gen. Simply change the app.sh file to point to the most appropriate one, and the R Shiny app will reflect that industry.
15 changes: 15 additions & 0 deletions app.sh
@@ -0,0 +1,15 @@
mkdir -p ~/.streamlit
echo "[browser]" > ~/.streamlit/config.toml
echo "gatherUsageStats = true" >> ~/.streamlit/config.toml
echo "serverAddress = \"0.0.0.0\"" >> ~/.streamlit/config.toml
echo "serverPort = 8888" >> ~/.streamlit/config.toml
echo "[server]" >> ~/.streamlit/config.toml
echo "port = 8888" >> ~/.streamlit/config.toml
echo "enableCORS = false" >> ~/.streamlit/config.toml
echo "enableXsrfProtection = false" >> ~/.streamlit/config.toml

streamlit run apps/streamlit_app.py

# python app/rai.py

# R -e 'shiny::runApp("/mnt/scripts/shiny_app.R", port=8888, host="0.0.0.0")'
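
The script above builds `~/.streamlit/config.toml` one `echo` at a time. For comparison, a minimal Python sketch that writes the same settings in one step is shown here. The function name `write_streamlit_config` is hypothetical, and the example writes into a temporary directory rather than the real home directory; the setting names and values are copied from the script.

```python
import tempfile
from pathlib import Path

# Same sections and values that app.sh echoes into config.toml.
STREAMLIT_CONFIG = """\
[browser]
gatherUsageStats = true
serverAddress = "0.0.0.0"
serverPort = 8888
[server]
port = 8888
enableCORS = false
enableXsrfProtection = false
"""


def write_streamlit_config(home: Path) -> Path:
    """Write config.toml under <home>/.streamlit, creating the directory if needed."""
    config_dir = home / ".streamlit"
    config_dir.mkdir(parents=True, exist_ok=True)  # does not fail if it already exists
    config_path = config_dir / "config.toml"
    config_path.write_text(STREAMLIT_CONFIG, encoding="utf-8")
    return config_path


# Demonstrate against a throwaway directory instead of the real home directory.
with tempfile.TemporaryDirectory() as tmp:
    path = write_streamlit_config(Path(tmp))
    write_streamlit_config(Path(tmp))  # a second call overwrites rather than failing
    written = path.read_text(encoding="utf-8")

print(written)
```

Writing the file in a single call avoids a half-written configuration if the script is interrupted, and `exist_ok=True` makes reruns safe when the directory is already present.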