# dgen docker and packer support #54

**Open** — wants to merge 18 commits into base: `master`

## Commits
- `5cd981c` setting up Dockerfiles and Docker compose for dgen (thedavidwhiteside, Jan 18, 2025)
- `a9bb8fe` setting up Packer build for dgen for AWS AMI publish (thedavidwhiteside, Jan 18, 2025)
- `75c39bb` added variables to allow customization (thedavidwhiteside, Jan 24, 2025)
- `e17b1bc` Fix docker-compose and instructions (thedavidwhiteside, Jan 24, 2025)
- `4219b3a` added complete usage instructions for packer (thedavidwhiteside, Jan 29, 2025)
- `db74d98` add packer usage guide to main README.md (thedavidwhiteside, Jan 29, 2025)
- `c0f7b73` added more details for packer readme (thedavidwhiteside, Jan 29, 2025)
- `85e9631` fix bug with packer auto dgen start at login (thedavidwhiteside, Jan 29, 2025)
- `3772474` fix bug with packer auto dgen start at login (thedavidwhiteside, Jan 29, 2025)
- `1622180` limit postgres to localhost only (thedavidwhiteside, Jan 29, 2025)
- `b6df9cd` update packer/ec2 instructions and disable dgen from auto starting on… (thedavidwhiteside, Jan 29, 2025)
- `20f8eff` fix packer ami date tag (thedavidwhiteside, Jan 29, 2025)
- `05b1c6d` small documentation improvement on formatting (thedavidwhiteside, Jan 29, 2025)
- `3018c09` small documentation improvement on formatting (thedavidwhiteside, Jan 29, 2025)
- `d2386ef` fixed link in documentation for packer and re-organized list of instr… (thedavidwhiteside, Jan 29, 2025)
- `f3399f1` fix small issue with packer dgen_start.sh not setting ownership corre… (thedavidwhiteside, Jan 29, 2025)
- `ac26e8f` fix bug with packer startup of postgres, the issue was permission rel… (thedavidwhiteside, Jan 30, 2025)
- `bc6e07a` add dgen usage as part of the login message (thedavidwhiteside, Jan 30, 2025)
2 changes: 2 additions & 0 deletions README.md
@@ -16,6 +16,8 @@ The Distributed Generation Market Demand (dGen) Model
- [Webinar and Setup Tutorial](https://youtu.be/-Te5_KKZR8o)
- [Official dGen Documentation](https://nrel.github.io/dgen/)
- [Wiki](https://github.com/NREL/dgen/wiki)
- [dGen Docker Usage Guide](./docker/README.md)
- [dGen Packer Usage Guide](./packer/README.md)


## Get Your Tools
144 changes: 144 additions & 0 deletions docker/README.md
@@ -0,0 +1,144 @@
# dGen Docker Usage Guide

By default, the dgen container uses the [Delaware residential agent file and SQL database](https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql).

See the [Customizing the Dataset](#customizing-the-dataset) section below to learn how to use a different agent file and SQL database and update the model scenario.


### Mac and Linux quick start

This quickstart uses docker-compose to run dgen. The default path for dgen data files and Excel configurations is `~/dgen_data/`, which is shared with your running containers. You can change this path, but you will need to edit `docker-compose.yml` to reflect the data directory of your choice.
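If you do change the path, the edit is a small sketch like the following (the `/custom/dgen_data/` host path is hypothetical; the service names match the `docker-compose.yml` in this directory, and both services must share the same host path):

```yaml
services:
  dgen:
    volumes:
      - /custom/dgen_data/:/data   # host path of your choice, mounted at /data
  postgis:
    volumes:
      - /custom/dgen_data/:/data   # must match the dgen service's host path
```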

These prerequisites assume you are on a Mac with [Docker Desktop](https://docs.docker.com/desktop/setup/install/mac-install/) already installed; on Linux, install Docker Engine and docker-compose instead.

##### Create the `dgen_data` directory in your home directory

```bash
$ mkdir -p ~/dgen_data/
$ chmod 755 ~/dgen_data/
$ ls -l ~/dgen_data/ # Expected to be empty; after starting dgen you will see data files in this location.
```

##### Start up the dGen containers

```bash
$ cd dgen/docker/
$ docker-compose up --build -d
[+] Running 2/2
✔ Container dgen_1 Started 0.1s
✔ Container postgis_1 Started 0.0s
```

##### Connect to the running containers
```bash
$ docker attach dgen_1 # Attach to the dGen container
$ docker attach $(sudo docker ps --filter "name=dgen" --format "{{.ID}}") # If dgen_1 is not found
(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ python dgen_model.py # Run scenario
(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ exit # to exit
$ docker-compose up -d # After exiting, re-up the container if you want to re-attach
```

### Troubleshooting common issues

#### psycopg2.OperationalError: connection to server

Wait 5-10 minutes for the postgres database to finish starting.
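Rather than waiting blindly, you can poll for readiness. The loop below is a generic sketch; a suitable probe command, using the container and user names from this guide, would be `docker exec postgis_1 pg_isready -U postgres`.

```bash
# Generic wait loop: pass the number of attempts and the probe command.
# Example probe (names from this guide): docker exec postgis_1 pg_isready -U postgres
wait_for() {
  local tries=$1; shift
  local i
  for ((i = 1; i <= tries; i++)); do
    if "$@" > /dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep 1
  done
  echo "timed out"
  return 1
}
```

For example, `wait_for 600 docker exec postgis_1 pg_isready -U postgres` retries for about ten minutes before giving up.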

#### General errors and issues

Try removing any datasets from your `~/dgen_data` directory and starting over. Make sure to allow time for the datasets to fully download on the re-attempt.


```bash
$ docker-compose down
$ rm -f ~/dgen_data/*
$ docker system prune -a
$ docker volume prune -f
```

### Disabling auto-start for the dgen virtual environment

By default, starting the `dgen` container automatically activates the `dg3n` Conda environment. For ease of use, it's recommended to leave this as the default.


To disable this behavior, edit the `docker-compose.yml` file in this directory and set the following environment variable:

```yaml
services:
  dgen:
    environment:
      DGEN_DISABLE_AUTO_START: 1
```
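The effect of this variable can be sketched as a plain-bash guard mirroring the logic the image appends to `~dgen/.bashrc` (the `echo` lines stand in for activating the environment):

```bash
# Run auto-start when the variable is unset or 0; skip otherwise.
auto_start() {
  if [[ -z ${DGEN_DISABLE_AUTO_START} ]] || [[ ${DGEN_DISABLE_AUTO_START:-0} -eq 0 ]]; then
    echo "activating dg3n"     # the real .bashrc runs: conda activate dg3n
  else
    echo "auto-start disabled"
  fi
}
```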

### Customizing the dataset

By default, the dGen container uses the [Delaware residential agent file and SQL database](https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql).


You can find additional agent files and their matching SQL database files for the contiguous United States using the links below:

- [dGen Agent Files and SQL Databases on OpenEI](https://data.openei.org/submissions/1931)

- [dGen Dataset S3 Viewer](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=dgen%2F)


You can customize the dataset by overriding the `DGEN_DATAFILE_URL` and `DGEN_AGENTFILE_URL` variables in `docker-compose.yml` and then editing `~/dgen_data/input_sheet_final.xlsm` using Excel.

The following walks through updating the `docker-compose.yml` file to use the [Colorado residential agent file and SQL database](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=dgen%2Fco_final_db%2F).


First, copy the link addresses of the desired agent file and the matching SQL database file. Then update `docker-compose.yml` to download and use those files by inserting the SQL database file link next to `DGEN_DATAFILE_URL:` and the agent file link next to `DGEN_AGENTFILE_URL:`. If this is not your first time setting up the containers, set the `DGEN_FORCE_DELETE_DATABASE` variable to `1` to force removal of any previously created database. This will result in data loss from previous runs; if this is a concern, please make backups before proceeding with the following steps.

```yaml
services:
  postgis:
    environment:
      DGEN_DATAFILE_URL: https://oedi-data-lake.s3.amazonaws.com/dgen/co_final_db/dgen_db.sql
      DGEN_AGENTFILE_URL: https://oedi-data-lake.s3.amazonaws.com/dgen/co_final_db/agent_df_base_res_co_revised.pkl
      DGEN_FORCE_DELETE_DATABASE: 1 # Clears all data in the database to reload the Colorado dataset. Warning: this removes your existing data.
```

Edit the Excel document `~/dgen_data/input_sheet_final.xlsm` using Excel (enable macros), set 'Region to Analyze' to `Colorado` and 'Markets' to `Only Residential`, then click Save Scenario.


Next, restart your containers using the above options. This will remove all your existing data and set up your containers to use the Colorado files instead.


```bash
$ docker-compose down
[+] Running 3/3
✔ Container dgen_1 Removed 9.2s
✔ Container postgis_1 Removed 0.1s

$ docker-compose up -d
[+] Running 2/2
✔ Container dgen_1 Started 0.1s
✔ Container postgis_1 Started 0.2s
```

After you load the new Colorado dataset, reset the `DGEN_FORCE_DELETE_DATABASE` option to `0` to prevent future accidental data loss.


```yaml
services:
  postgis:
    environment:
      DGEN_FORCE_DELETE_DATABASE: 0
```

You can now attach to the dGen container and monitor the data download. This may take 5-10 minutes depending on your internet speed. If the file size is increasing, the data is still downloading.


```bash
$ docker attach dgen_1
(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ ls -lh /data/dgen_db.sql
-rw-r--r-- 1 dgen dgen 705M Jan 29 2025 /data/dgen_db.sql
(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ python dgen_model.py # Run scenario
(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ exit # to exit
```
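The "file size is increasing" check can also be scripted. This is a generic sketch that takes any path (inside the container, the SQL file lives at `/data/dgen_db.sql`):

```bash
# Succeeds (exit 0) only while the file is still growing.
still_downloading() {
  local f=$1 before after
  before=$(wc -c < "$f")
  sleep 2
  after=$(wc -c < "$f")
  (( after > before ))   # true only if the file grew during the pause
}
```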

### Stop running containers
```bash
$ docker ps -a
CONTAINER ID   IMAGE            COMMAND                  CREATED          STATUS          PORTS                    NAMES
259c30e6b518   docker-postgis   "docker-entrypoint.s…"   12 minutes ago   Up 12 minutes   0.0.0.0:5432->5432/tcp   postgis_1
a775696276eb   docker-dgen      "bash --login"           12 minutes ago   Up 4 seconds                             dgen_1

$ docker-compose down
[+] Running 3/3
✔ Container dgen_1 Removed 10.1s
✔ Container postgis_1 Removed 0.1s
```

### Warning: This will remove old containers and data volumes. This may be required if you need to free up space.

```bash
$ docker system prune -a
$ docker volume prune -f
```
39 changes: 39 additions & 0 deletions docker/dgen/Dockerfile
@@ -0,0 +1,39 @@
FROM continuumio/miniconda3

# Setup dgen user
RUN groupadd --gid 999 dgen && useradd --uid 999 --gid dgen --create-home dgen

# Setup Data directory
RUN mkdir -p /data && chmod 755 /data

# Copy dgen files and setup permissions
COPY ./dgen_os/ /opt/dgen_os/
RUN chown -R dgen: /opt/dgen_os /data

# Install dgen
RUN conda env create -f /opt/dgen_os/python/dg3n.yml

# Setup Init script
COPY docker/dgen/init.sh /docker-entrypoint-initdb.d/init-dgen.sh
RUN chmod +x /docker-entrypoint-initdb.d/init-dgen.sh

# Initialize Conda in the Docker environment
RUN cat <<EOF >> ~dgen/.bashrc
if [[ -z \${DGEN_DISABLE_AUTO_START} ]] || [[ \${DGEN_DISABLE_AUTO_START:-0} -eq 0 ]]; then
    conda activate dg3n
    cd /opt/dgen_os/python/
    /docker-entrypoint-initdb.d/init-dgen.sh
fi
EOF

# Change ownership of the bashrc file
RUN chown dgen: ~dgen/.bashrc

# Setup default input_sheet_final.xlsm (Delaware residential)
COPY docker/dgen/input_sheet_final.xlsm /opt/dgen_os/excel/input_sheet_final.xlsm
RUN chmod 755 /opt/dgen_os/excel/input_sheet_final.xlsm && chown dgen: /opt/dgen_os/excel/input_sheet_final.xlsm

# Switch to non-root user
USER dgen

CMD ["bash", "--login"]
23 changes: 23 additions & 0 deletions docker/dgen/init.sh
@@ -0,0 +1,23 @@
#!/bin/bash
set -e

DB_AGENT_FILE="${DGEN_AGENTFILE:-/data/agent_df_base_revised.pkl}"
FORCE_DELETE_DATABASE=${DGEN_FORCE_DELETE_DATABASE:-0}

# Update the database connection parameters if using a different database name
if [ ! -z "${DATABASE_HOSTNAME}" ]; then
    sed -i "s/127.0.0.1/${DATABASE_HOSTNAME}/g" /opt/dgen_os/python/pg_params_connect.json
fi

# Setup Default Input Scenarios
if [[ ! -f /data/input_sheet_final.xlsm ]]; then
    cp /opt/dgen_os/excel/input_sheet_final.xlsm /data/input_sheet_final.xlsm
fi

# Setup Input Scenarios
rm -f /opt/dgen_os/input_scenarios/*
ln -s /data/input_sheet_final.xlsm /opt/dgen_os/input_scenarios/input_sheet_final.xlsm

# Setup Input Agent
rm -f /opt/dgen_os/input_agents/*
ln -s "${DB_AGENT_FILE}" /opt/dgen_os/input_agents/$(basename "${DB_AGENT_FILE}")
Binary file added docker/dgen/input_sheet_final.xlsm
Binary file not shown.
31 changes: 31 additions & 0 deletions docker/docker-compose.yml
@@ -0,0 +1,31 @@
services:
  dgen:
    build:
      context: ../
      dockerfile: docker/dgen/Dockerfile
    stdin_open: true
    tty: true
    container_name: dgen_1
    volumes:
      - ~/dgen_data/:/data
    environment:
      DATABASE_HOSTNAME: postgis
      DGEN_DB_USER: postgres
      DGEN_DB_NAME: dgen_db
      DGEN_DISABLE_AUTO_START: 0 # Set to 1 to disable dropping into a dgen shell
  postgis:
    build:
      context: ../
      dockerfile: docker/postgis/Dockerfile
    container_name: postgis_1
    ports:
      - "127.0.0.1:5432:5432"
    volumes:
      - ~/dgen_data/:/data
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      DGEN_DATAFILE_URL: https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql
      DGEN_AGENTFILE_URL: https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/agent_df_base_res_de_revised.pkl
      DGEN_FORCE_DELETE_DATABASE: 0 # Set to 1 to drop the database and clear all the data for a fresh load
    restart: unless-stopped
20 changes: 20 additions & 0 deletions docker/postgis/Dockerfile
@@ -0,0 +1,20 @@
FROM postgis/postgis:11-3.3

# Setup Data
RUN mkdir -p /data && chmod 755 /data && chown -R postgres: /data/

# Install curl for downloading the data file
RUN apt-get update && apt-get install curl -y

# Setup Init script
COPY docker/postgis/init.sh /docker-entrypoint-initdb.d/init-dgen-pg.sh
RUN chmod +x /docker-entrypoint-initdb.d/init-dgen-pg.sh && chown postgres: /docker-entrypoint-initdb.d/init-dgen-pg.sh

# Switch to Postgres user
USER postgres

# Expose PostgreSQL port
EXPOSE 5432

# Use the default PostgreSQL entrypoint
CMD ["postgres"]
48 changes: 48 additions & 0 deletions docker/postgis/init.sh
@@ -0,0 +1,48 @@
#!/bin/bash
set -e

DB_USER="${DGEN_DB_USER:-postgres}"
DB_NAME="${DGEN_DB_NAME:-dgen_db}"
DB_SQL_FILE="${DGEN_DATAFILE:-/data/dgen_db.sql}"
DB_SQL_FILE_URL="${DGEN_DATAFILE_URL:-https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql}"
DB_AGENT_FILE="${DGEN_AGENTFILE:-/data/agent_df_base_revised.pkl}"
DB_AGENT_FILE_URL="${DGEN_AGENTFILE_URL:-https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/agent_df_base_res_de_revised.pkl}"
FORCE_DELETE_DATABASE=${DGEN_FORCE_DELETE_DATABASE:-0}

# Clear database if FORCE_DELETE_DATABASE is enabled
if [[ ${FORCE_DELETE_DATABASE} -eq 1 ]]; then
    echo "DGEN_FORCE_DELETE_DATABASE is set to 1. Dropping database '${DB_NAME}' if it exists..."
    psql -U "${DB_USER}" -tc "SELECT 1 FROM pg_database WHERE datname = '${DB_NAME}';" | grep -q 1 && \
        psql -U "${DB_USER}" -c "DROP DATABASE ${DB_NAME};"
    echo "Database '${DB_NAME}' dropped."
    rm -f "${DB_SQL_FILE}"
    echo "Datafile '${DB_SQL_FILE}' removed."
    rm -f "${DB_AGENT_FILE}"
    echo "Datafile '${DB_AGENT_FILE}' removed."
fi

# Check if the agent file already exists, download if not
if [[ ! -f ${DB_AGENT_FILE} ]]; then
    echo "Downloading agent file..."
    curl -o "${DB_AGENT_FILE}" "${DB_AGENT_FILE_URL}"
fi

# Check if the SQL data file already exists, download if not
if [[ ! -f ${DB_SQL_FILE} ]]; then
    echo "Downloading SQL data file..."
    curl -o "${DB_SQL_FILE}" "${DB_SQL_FILE_URL}"
fi

# Check if the database already exists
if psql -U "${DB_USER}" -tc "SELECT 1 FROM pg_database WHERE datname = '${DB_NAME}';" | grep -q 1; then
    echo "Database '${DB_NAME}' already exists, skipping initialization..."
else
    # Create the database
    echo "Creating database ${DB_NAME}..."
    psql -U "${DB_USER}" -c "CREATE DATABASE ${DB_NAME};"

    # Load the dataset into the database
    echo "Loading data into ${DB_NAME}..."
    psql -U "${DB_USER}" -d "${DB_NAME}" -f "${DB_SQL_FILE}"
    echo "Database initialization complete!"
fi
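Both init scripts rely on the same download-if-missing idiom. Generalized, it looks like the sketch below, where `cp` stands in for the scripts' `curl -o "$dest" "$url"` so the sketch runs offline:

```bash
# Download-if-missing: skip the (expensive) fetch when the destination exists.
fetch_if_missing() {
  local dest=$1 src=$2
  if [[ -f $dest ]]; then
    echo "cached"
  else
    cp "$src" "$dest"   # the real scripts use: curl -o "$dest" "$url"
    echo "downloaded"
  fi
}
```

Because the check keys off the destination file, re-running the container start is cheap once the data files are in place.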
71 changes: 71 additions & 0 deletions packer/README.md
@@ -0,0 +1,71 @@
# dGen Packer AMI Usage Guide

This guide provides instructions on how to use the dGen AWS AMI as well as how to use Packer to build your own AWS AMI.

## dGen AMI Usage

#### Getting Started

Launch an EC2 instance in AWS using the AMI built by Packer. You can then SSH to the instance; by default you will be dropped into a dgen shell.

```bash
$ ssh -i <your_ssh_key> ubuntu@<your_server_ip>
ubuntu@ip-1-2-3-4:~/dgen/docker$ source ~ubuntu/dgen_start.sh
(dg3n) dgen@0b702cabc2ce:/opt/dgen_os/python$ python dgen_model.py
```

The first time you run `~ubuntu/dgen_start.sh`, dgen will build the Docker images and download the default dataset. This may take 10-15 minutes depending on your network connection.

#### Using a new dataset

Edit the docker-compose file `/home/ubuntu/dgen/docker/docker-compose.yml`. See the [Customizing the Dataset](../docker/README.md#customizing-the-dataset) section of the dGen Docker Usage Guide.

One challenge when using an EC2 instance: if `/data/input_sheet_final.xlsm` needs to be edited, you must copy the file to a system with Excel, edit the document there, and then copy it back to the instance.

#### Warning: This will remove old containers and data volumes. This may be required if you need to free up space.

You can completely remove all the data with the script below. This will delete your dgen data and give you a fresh start.

```bash
$ ~/dgen_prune_all_data.sh
```

## Building an AWS AMI with Packer

#### Prerequisites

- [Packer](https://www.packer.io/downloads) installed
- AWS account with appropriate permissions to create AMIs
- AWS credentials configured (e.g., using `aws configure`)

#### Packer Init

```bash
$ cd dgen/packer
$ packer init .
```

#### Customize variables and build the AWS AMI

Use Packer to build the AMI. This will create an instance, provision it, and create an AMI from it.

Override the variables in `example-vars.pkrvars.hcl` that are specific to your environment.

```bash
$ cp example-vars.pkrvars.hcl /tmp/dgdo-vars.pkrvars.hcl
$ packer validate -var-file=/tmp/dgdo-vars.pkrvars.hcl dgdo-ami.pkr.hcl
$ packer build -var-file=/tmp/dgdo-vars.pkrvars.hcl dgdo-ami.pkr.hcl
```

## Troubleshooting

If you encounter any issues, refer to the [Packer documentation](https://www.packer.io/docs) or check the error messages for guidance.

## Tests

You can run automated tests on the Packer config using the test script below. It should be run from the `packer` directory.

```bash
$ cd packer
$ ./tests/test_packer.sh
```