diff --git a/README.md b/README.md index d650349..9b2afef 100644 --- a/README.md +++ b/README.md @@ -16,6 +16,7 @@ The Distributed Generation Market Demand (dGen) Model - [Webinar and Setup Tutorial](https://youtu.be/-Te5_KKZR8o) - [Official dGen Documentation](https://nrel.github.io/dgen/) - [Wiki](https://github.com/NREL/dgen/wiki) +- [dGen Docker Usage Guide](./docker/README.md) ## Get Your Tools diff --git a/docker/README.md b/docker/README.md index 3bef20b..78f55a7 100644 --- a/docker/README.md +++ b/docker/README.md @@ -1,29 +1,144 @@ -# Download dataset -https://data.openei.org/submissions/1931 -https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=dgen%2F +# dGen Docker Usage Guide -It defaults to using the Deleware Residential: https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql +By default, the dgen container uses the [Delaware residential dataset](https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql). -Warning: This will remove old running containers and data, this may be required to free up space and getting started. +To customize the dataset, see the [Customizing the Dataset](#customizing-the-dataset) section below. -```bash -docker system prune -a -docker volume prune -f -``` +### Mac and Linux quick start -Use docker-compose to run dgen. +This quick start uses docker-compose to run dgen. The default path for dgen data files and Excel configurations is ~/dgen_data/. This path is shared with your running containers. You can change it, but you will then need to edit `docker-compose.yml` to reflect the data directory of your choice. -~/dgen_data/ is the path to your home directory. This path is shared with your running containers. 
+The prerequisites assume you are using a Mac and have already installed [Docker Desktop](https://docs.docker.com/desktop/setup/install/mac-install/) +##### Create the data directory ```bash -mkdir ~/dgen_data/ && chmod 755 ~/dgen_data/ +$ mkdir -p ~/dgen_data/ +$ chmod 755 ~/dgen_data/ +$ ls -l ~/dgen_data/ # It is expected to be empty; after starting dgen you will see data files in this location. ``` +##### Start up the dgen containers ``` bash -docker-compose up --build -d +$ cd dgen/docker/ +$ docker-compose up --build -d +[+] Running 2/2 + ✔ Container dgen_1 Started 0.1s + ✔ Container postgis_1 Started 0.0s ``` +##### Connect to the running containers ``` bash -docker attach dgen_1 -(dg3n) root@62bf0bdd2aff:/opt/dgen_os/python# python dgen_model.py +$ docker attach dgen_1 # Attach to dgen environment container +$ docker attach $(sudo docker ps --filter "name=dgen" --format "{{.ID}}") # If dgen_1 is not found +(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ python dgen_model.py # Run scenario +(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ exit # to exit +$ docker-compose up -d # If you exit, you have to re-up the container if you want to re-attach +``` + +### Troubleshooting common issues + +#### psycopg2.OperationalError: connection to server + +Wait 5-10 minutes for the postgres database to finish starting. + +#### General errors and issues + +Try clearing your `~/dgen_data` and starting over. Allow time for the datasets to fully download on the re-attempt. + +```bash +$ docker-compose down +$ rm -f ~/dgen_data/* +$ docker system prune -a +$ docker volume prune -f +``` + +### Disabling auto-start for the dgen virtual environment + +By default, logging into the `dgen` container automatically activates the `dgen` virtual environment. For ease of use, it is recommended to leave this at the default. 
+ +To disable this behavior, edit the `docker-compose.yml` file in this directory and set the following environment variable: + +```yaml +services: + dgen: + environment: + DGEN_DISABLE_AUTO_START: 1 +``` + +### Customizing the dataset + +By default, the dgen container uses the [Delaware residential dataset](https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql). + +You can find more datasets using the links below: +- [DGen Dataset Submissions on OpenEI](https://data.openei.org/submissions/1931) +- [DGen Dataset S3 Viewer](https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=dgen%2F) + +You can customize the dataset used by overriding the DGEN_DATAFILE_URL variable in `docker-compose.yml` and then editing `~/dgen_data/input_sheet_final.xlsm` using Excel. + +The steps below walk through the process of using the [Colorado residential dataset](https://oedi-data-lake.s3.amazonaws.com/dgen/co_final_db/dgen_db.sql). + +Update the `docker-compose.yml` to use the co_final_db sql download and set the variable to force remove the database. This will result in data loss from previous runs; if this is a concern, please make backups before proceeding with the steps below. + +```yaml +services: + postgis: + environment: + DGEN_DATAFILE_URL: https://oedi-data-lake.s3.amazonaws.com/dgen/co_final_db/dgen_db.sql + DGEN_AGENTFILE_URL: https://oedi-data-lake.s3.amazonaws.com/dgen/co_final_db/agent_df_base_res_co_revised.pkl + DGEN_FORCE_DELETE_DATABASE: 1 # Clear all the data in the database to reload the Colorado dataset. Warning: this will remove your existing data. +``` + +Edit the Excel document `~/dgen_data/input_sheet_final.xlsm` using Excel (enable macros): set the Region to Analyze to `Colorado` and Markets to `Only Residential`, then click Save Scenario. + +Restart your containers with the above options. This will remove all your existing data and download the new Colorado dataset. 
+ +```bash +$ docker-compose down +[+] Running 3/3 + ✔ Container dgen_1 Removed 9.2s + ✔ Container postgis_1 Removed 0.1s + + $ docker-compose up -d + [+] Running 2/2 + ✔ Container dgen_1 Started 0.1s + ✔ Container postgis_1 Started 0.2s + ``` + +After you load the new Colorado dataset, remove the DGEN_FORCE_DELETE_DATABASE option to prevent future accidental data loss. + +```yaml +services: + postgis: + environment: + DGEN_FORCE_DELETE_DATABASE: 0 +``` + +You can now attach to the dgen container and monitor the data download. This may take 5-10 minutes depending on your internet speed; if the file size is increasing, it is still downloading. + +```bash +$ docker attach dgen_1 +(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ ls -lh /data/dgen_db.sql +-rw-r--r-- 1 dgen dgen 705M Jan 29 2025 /data/dgen_db.sql +(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ python dgen_model.py # Run scenario +(dg3n) dgen@cc6e2e5f70b5:/opt/dgen_os/python$ exit # to exit +``` + +### Stop running containers +```bash +$ docker ps -a +CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES +259c30e6b518 docker-postgis "docker-entrypoint.s…" 12 minutes ago Up 12 minutes 0.0.0.0:5432->5432/tcp postgis_1 +a775696276eb docker-dgen "bash --login" 12 minutes ago Up 4 seconds dgen_1 + +$ docker-compose down +[+] Running 3/3 + ✔ Container dgen_1 Removed 10.1s + ✔ Container postgis_1 Removed 0.1s +``` + +### Warning: This will remove old running containers and data volumes. This may be required if you need space. 
+ +```bash +$ docker system prune -a +$ docker volume prune -f ``` \ No newline at end of file diff --git a/docker/dgen/Dockerfile b/docker/dgen/Dockerfile index e3ca24d..731bf4b 100644 --- a/docker/dgen/Dockerfile +++ b/docker/dgen/Dockerfile @@ -1,11 +1,14 @@ FROM continuumio/miniconda3 -# Setup Data -RUN mkdir -p /data -RUN chmod 755 /data +# Setup dgen user +RUN groupadd --gid 1000 dgen && useradd --uid 1000 --gid dgen --create-home dgen -# Copy dgen files +# Setup Data directory +RUN mkdir -p /data && chmod 755 /data + +# Copy dgen files and setup permissions COPY ./dgen_os/ /opt/dgen_os/ +RUN chown -R dgen: /opt/dgen_os /data # Install dgen RUN conda env create -f /opt/dgen_os/python/dg3n.yml @@ -15,8 +18,22 @@ COPY docker/dgen/init.sh /docker-entrypoint-initdb.d/init-dgen.sh RUN chmod +x /docker-entrypoint-initdb.d/init-dgen.sh # Initialize Conda in the Docker environment -RUN echo "conda activate dg3n" >> ~/.bashrc -RUN echo "cd /opt/dgen_os/python/" >> ~/.bashrc -RUN echo "/docker-entrypoint-initdb.d/init-dgen.sh" >> ~/.bashrc +RUN cat <<EOF >> ~dgen/.bashrc +if [[ -z \${DGEN_DISABLE_AUTO_START} ]] || [[ ${DGEN_DISABLE_AUTO_START:-0} -eq 0 ]]; then + conda activate dg3n + cd /opt/dgen_os/python/ + /docker-entrypoint-initdb.d/init-dgen.sh +fi +EOF + +# Change ownership of the bashrc file +RUN chown dgen: ~dgen/.bashrc + +# Setup default input_sheet_final.xlsm (Delaware residential) +COPY docker/dgen/input_sheet_final.xlsm /opt/dgen_os/excel/input_sheet_final.xlsm +RUN chmod 755 /opt/dgen_os/excel/input_sheet_final.xlsm && chown dgen: /opt/dgen_os/excel/input_sheet_final.xlsm + +# Switch to non-root user +USER dgen CMD ["bash", "--login"] \ No newline at end of file diff --git a/docker/dgen/init.sh b/docker/dgen/init.sh index 7a195b1..1139de4 100644 --- a/docker/dgen/init.sh +++ b/docker/dgen/init.sh @@ -1,16 +1,23 @@ #!/bin/bash set -e +DB_AGENT_FILE="${DGEN_AGENTFILE:-/data/agent_df_base_revised.pkl}" +FORCE_DELETE_DATABASE=${DGEN_FORCE_DELETE_DATABASE:-0} + # 
Update the database connection parameters if using a different database name if [ ! -z "${DATABASE_HOSTNAME}" ]; then sed -i "s/127.0.0.1/${DATABASE_HOSTNAME}/g" /opt/dgen_os/python/pg_params_connect.json fi +# Setup Default Input Scenarios +if [[ ! -f /data/input_sheet_final.xlsm ]]; then + cp /opt/dgen_os/excel/input_sheet_final.xlsm /data/input_sheet_final.xlsm +fi + # Setup Input Scenarios rm -f /opt/dgen_os/input_scenarios/* -cp /opt/dgen_os/excel/input_sheet_final.xlsm /data/input_sheet_final.xlsm ln -s /data/input_sheet_final.xlsm /opt/dgen_os/input_scenarios/input_sheet_final.xlsm # Setup Input Agent -#rm -f /opt/dgen_os/input_agents/* -#ln -s /data/agent_df_base_res_de_revised.pkl /opt/dgen_os/input_agents/agent_df_base_res_de_revised.pkl \ No newline at end of file +rm -f /opt/dgen_os/input_agents/* +ln -s ${DB_AGENT_FILE} /opt/dgen_os/input_agents/$(basename "${DB_AGENT_FILE}") \ No newline at end of file diff --git a/docker/dgen/input_sheet_final.xlsm b/docker/dgen/input_sheet_final.xlsm new file mode 100644 index 0000000..a90af00 Binary files /dev/null and b/docker/dgen/input_sheet_final.xlsm differ diff --git a/docker/docker-compose.yml b/docker/docker-compose.yml index 355b0d8..4990a3b 100644 --- a/docker/docker-compose.yml +++ b/docker/docker-compose.yml @@ -9,7 +9,10 @@ services: volumes: - ~/dgen_data/:/data environment: - DATABASE_HOSTNAME: postgis_1 + DATABASE_HOSTNAME: postgis + DGEN_DB_USER: postgres + DGEN_DB_NAME: dgen_db + DGEN_DISABLE_AUTO_START: 0 # Set to 1 to Disable dropping into a dgen shell postgis: build: context: ../ @@ -22,4 +25,7 @@ services: environment: POSTGRES_USER: postgres POSTGRES_PASSWORD: postgres + DGEN_DATAFILE_URL: https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql + DGEN_AGENTFILE_URL: https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/agent_df_base_res_de_revised.pkl + DGEN_FORCE_DELETE_DATABASE: 0 # Set to 1 to drop the database and clear all the data for a fresh load restart: unless-stopped 
\ No newline at end of file diff --git a/docker/postgis/Dockerfile b/docker/postgis/Dockerfile index c1f983a..03d4ddd 100644 --- a/docker/postgis/Dockerfile +++ b/docker/postgis/Dockerfile @@ -1,15 +1,17 @@ FROM postgis/postgis:11-3.3 # Setup Data -RUN mkdir -p /data -RUN chmod 755 /data +RUN mkdir -p /data && chmod 755 /data && chown -R postgres: /data/ # Install curl for downloading the data file RUN apt-get update && apt-get install curl -y # Setup Init script COPY docker/postgis/init.sh /docker-entrypoint-initdb.d/init-dgen-pg.sh -RUN chmod +x /docker-entrypoint-initdb.d/init-dgen-pg.sh +RUN chmod +x /docker-entrypoint-initdb.d/init-dgen-pg.sh && chown postgres: /docker-entrypoint-initdb.d/init-dgen-pg.sh + +# Switch to Postgres user +USER postgres # Expose PostgreSQL port EXPOSE 5432 diff --git a/docker/postgis/init.sh b/docker/postgis/init.sh index 9bc1369..f90f46b 100644 --- a/docker/postgis/init.sh +++ b/docker/postgis/init.sh @@ -1,12 +1,33 @@ #!/bin/bash set -e -DB_USER="postgres" -DB_NAME="dgen_db" -DB_SQL_FILE="/data/dgen_db.sql" -DB_SQL_FILE_URL="https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql" +DB_USER="${DGEN_DB_USER:-postgres}" +DB_NAME="${DGEN_DB_NAME:-dgen_db}" +DB_SQL_FILE="${DGEN_DATAFILE:-/data/dgen_db.sql}" +DB_SQL_FILE_URL="${DGEN_DATAFILE_URL:-https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/dgen_db.sql}" +DB_AGENT_FILE="${DGEN_AGENTFILE:-/data/agent_df_base_revised.pkl}" +DB_AGENT_FILE_URL="${DGEN_AGENTFILE_URL:-https://oedi-data-lake.s3.amazonaws.com/dgen/de_final_db/agent_df_base_res_de_revised.pkl}" +FORCE_DELETE_DATABASE=${DGEN_FORCE_DELETE_DATABASE:-0} -# Check if the data file already exists +# Clear database if FORCE_DELETE_DATABASE is enabled +if [[ ${FORCE_DELETE_DATABASE} -eq 1 ]]; then + echo "DGEN_FORCE_DELETE_DATABASE is set to 1. Dropping database '${DB_NAME}' if it exists..." 
+ psql -U "${DB_USER}" -tc "SELECT 1 FROM pg_database WHERE datname = '${DB_NAME}';" | grep -q 1 && \ + psql -U "${DB_USER}" -c "DROP DATABASE ${DB_NAME};" + echo "Database '${DB_NAME}' dropped." + rm -f ${DB_SQL_FILE} + echo "Datafile '${DB_SQL_FILE}' removed." + rm -f ${DB_AGENT_FILE} + echo "Datafile '${DB_AGENT_FILE}' removed." +fi + +# Check if the data file already exists, download if not +if [[ ! -f ${DB_AGENT_FILE} ]]; then + echo "Downloading data file..." + curl -o ${DB_AGENT_FILE} ${DB_AGENT_FILE_URL} +fi + +# Check if the data file already exists, download if not if [[ ! -f ${DB_SQL_FILE} ]]; then echo "Downloading data file..." curl -o ${DB_SQL_FILE} ${DB_SQL_FILE_URL} diff --git a/packer/README.md b/packer/README.md index f359afc..7246fdc 100644 --- a/packer/README.md +++ b/packer/README.md @@ -10,32 +10,39 @@ This guide provides instructions on how to use Packer to build an AMI with the p 1. **Clone the repository and Packer init** - ```sh + ```bash git clone https://github.com/your-repo/dgen.git cd dgen/packer packer init . ``` -2. **Validate the Packer template** - - ```sh - packer validate dgdo-ami.pkr.hcl - ``` - -3. **Build the AMI** +2. **Customize Variables and Build the AMI** Use Packer to build the AMI. This will create an instance, provision it, and create an AMI from it. - ```sh - packer build -var-file=example-vars.pkrvars.hcl dgdo-ami.pkr.hcl + Override the variables in example-vars.pkrvars.hcl that are specific to your environment. + + ```bash + cp example-vars.pkrvars.hcl ~/dgdo-vars.pkrvars.hcl + packer validate -var-file=~/dgdo-vars.pkrvars.hcl dgdo-ami.pkr.hcl + packer build -var-file=~/dgdo-vars.pkrvars.hcl dgdo-ami.pkr.hcl ``` -## Configuration +## Tests + +You can run automated tests on the Packer config using the test script below. It should be run from the packer directory. + +```bash +cd packer +./tests/test_packer.sh +``` -The Packer template is configured to use the latest Ubuntu AMI and install Apache2. 
You can modify the template as needed. +## Usage -- **Region**: The AWS region where the AMI will be created. Default is `us-west-2`. -- **Instance Type**: The instance type used for building the AMI. Default is `t3.micro`. +```bash +ssh -i ubuntu@ +(dg3n) root@0b702babc2ce:/opt/dgen_os/python# python dgen_model.py +``` ## Troubleshooting diff --git a/packer/dgdo-ami.pkr.hcl b/packer/dgdo-ami.pkr.hcl index c027fc4..e815acb 100644 --- a/packer/dgdo-ami.pkr.hcl +++ b/packer/dgdo-ami.pkr.hcl @@ -28,7 +28,8 @@ variable "instance_type" { } variable "ami_users" { - default = "" + type = list(string) + default = [] } variable "subnet_id" { @@ -61,7 +62,7 @@ variable "run_tags" { variable "tags" { type = map(string) default = { - Name = var.ami_name + Name = "dgdo-server" } } @@ -72,13 +73,14 @@ source "amazon-ebs" "dgdo_ami" { name = "ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-*" virtualization-type = "hvm" architecture = "x86_64" + root-device-type = "ebs" } owners = ["099720109477"] # Canonical most_recent = true } instance_type = var.instance_type ssh_username = var.ssh_username - ami_name = var.ami_name + ami_name = "${var.ami_name}-${formatdate("20060102150405", timestamp())}" ami_description = var.ami_description tags = var.tags run_tags = var.run_tags @@ -88,6 +90,7 @@ source "amazon-ebs" "dgdo_ami" { security_group_id = var.security_group_id ami_users = var.ami_users ssh_interface = var.ssh_interface + ssh_pty = "true" } build { @@ -98,11 +101,16 @@ build { destination = "/home/ubuntu/dgen" } + provisioner "file" { + source = "install_dgen.sh" + destination = "/home/ubuntu/install_dgen.sh" + } + provisioner "shell" { inline = [ "sudo apt-get update -y", "sudo apt-get upgrade -y", - "sudo apt-get install -y docker-ce docker-ce-cli containerd.io" + "sudo chmod 755 /home/ubuntu/install_dgen.sh && /home/ubuntu/install_dgen.sh" ] } } diff --git a/packer/example-vars.pkrvars.hcl b/packer/example-vars.pkrvars.hcl new file mode 100644 index 0000000..4bbf705 
--- /dev/null +++ b/packer/example-vars.pkrvars.hcl @@ -0,0 +1,17 @@ +ssh_username = "ubuntu" +ami_name = "dgdo-server" +ami_description = "AMI for DGDO server" +aws_region = "us-west-2" +instance_type = "t3.micro" +ami_users = [] # ["all"] would publish to all +subnet_id = "" +vpc_id = "" +associate_public_ip_address = false +ssh_interface = "private_ip" +security_group_id = "" +run_tags = { + Name = "dgdo-server" +} +tags = { + Name = "dgdo-server" +} diff --git a/packer/install_dgen.sh b/packer/install_dgen.sh new file mode 100644 index 0000000..44cb640 --- /dev/null +++ b/packer/install_dgen.sh @@ -0,0 +1,37 @@ +#!/bin/bash + +# Add Docker's official GPG key: +sudo apt-get update +sudo apt-get install ca-certificates curl +sudo install -m 0755 -d /etc/apt/keyrings +sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc +sudo chmod a+r /etc/apt/keyrings/docker.asc + +# Add the repository to Apt sources: +echo \ + "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \ + $(. 
/etc/os-release && echo "$VERSION_CODENAME") stable" | \ + sudo tee /etc/apt/sources.list.d/docker.list > /dev/null +sudo apt-get update -y + +# Install Docker Engine +sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin docker-compose -y + +# Install dGEN as part of the user login for ubuntu +sudo usermod -aG docker ubuntu +cat <<EOF >> ~ubuntu/dgen_prune_all_data.sh +docker system prune -a +docker volume prune -f +EOF +chmod 755 ~ubuntu/dgen_prune_all_data.sh + +cat <<EOF >> ~ubuntu/dgen_start.sh +mkdir -p ~/dgen_data/ && chmod 755 ~/dgen_data/ +cd ~/dgen/docker/ +docker-compose up --build -d +docker attach $(sudo docker ps --filter "name=dgen" --format "{{.ID}}") +EOF +chmod 755 ~ubuntu/dgen_start.sh + +echo "source ~ubuntu/dgen_start.sh" >> ~ubuntu/.bashrc +chmod 755 ~ubuntu/.bashrc \ No newline at end of file diff --git a/packer/tests/test_packer.sh b/packer/tests/test_packer.sh new file mode 100755 index 0000000..b86040d --- /dev/null +++ b/packer/tests/test_packer.sh @@ -0,0 +1,16 @@ +#!/bin/bash + +# Get the directory of the current script +SCRIPT_DIR=$(dirname "$0") + +# Run packer validate using the relative path to the Packer template +packer validate -var-file=${SCRIPT_DIR}/../example-vars.pkrvars.hcl "${SCRIPT_DIR}/../dgdo-ami.pkr.hcl" + +# Check the exit code of packer validate +if [ $? -ne 0 ]; then + echo "Packer template validation failed." + exit $? +else + echo "Packer template validation succeeded." + exit 0 +fi \ No newline at end of file
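
The exit-status check in `packer/tests/test_packer.sh` relies on `$?`, which every command overwrites. By the time `exit $?` runs in the failure branch, `$?` holds the status of the preceding `echo`, so the script would exit 0 even when validation fails. A minimal sketch of capturing the status first follows. The helper name `validate_and_report` and the stand-in commands `true` and `sh -c 'exit 3'` are hypothetical, used only so the snippet runs without Packer installed.

```bash
#!/bin/bash

# Hypothetical helper (not part of the diff): run a command, report the
# result, and hand back the command's own exit status.
validate_and_report() {
    local status=0
    # Capture the status right away; any later command (even echo) resets $?.
    "$@" || status=$?
    if [ "${status}" -ne 0 ]; then
        echo "Packer template validation failed."
    else
        echo "Packer template validation succeeded."
    fi
    return "${status}"
}

# Stand-in commands so the sketch runs without Packer installed.
validate_and_report true
validate_and_report sh -c 'exit 3' || echo "propagated exit status: $?"
```

In the test script itself, the helper would presumably wrap the existing `packer validate -var-file=... ` invocation, with `exit $?` placed immediately after the call so that a failed validation also fails the test run.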