added initial files for qc #39

Open: wants to merge 13 commits into main

shanice-skylight
Collaborator

@shanice-skylight shanice-skylight commented Jan 30, 2025

This PR includes:

  • A docker-compose file to deploy the following containers: database, application, and Portainer.

  • The vs_dump.sql file, which initializes all value sets, concepts, and foreign key mappings extracted from a fresh eRSD pull and processed through our creation functions.

  • Environment variable files for both the application and database containers.

  • A provisioning script (qc-provision.sh) specific to query connector.

  • A Packer file tailored for query connector.

In the future, the separate provisioning and Packer files can be consolidated into a single file, leveraging GitHub workflows to determine which application to build the virtual image for.

@alismx alismx added this to the DIBBs-In-A-Box! milestone Jan 31, 2025
@alismx alismx added the Cloud Enablement Cloud Enablement DevOps label Jan 31, 2025
@shanice-skylight
Collaborator Author

work in progress on PR

@shanice-skylight shanice-skylight marked this pull request as ready for review February 20, 2025 04:20
Collaborator

@alismx alismx left a comment

I plan to run a build from this branch shortly, but I wanted to post my initial questions and comments!

Collaborator

How long does the db-creation script take?

Is there another place this dump is stored, like in the query connector repo? If so, we could reference the raw file in that repo.

Collaborator Author

It doesn't take long. The dump is also stored in the dibbs-query-connector repo.

Collaborator

@rin-skylight @shanice-skylight We won't need the dump in this repo: the provision.sh script clones the dibbs-query-connector repo, and the dump already lives there, as linked. We just need to make sure the Docker file mount source is correct.

Collaborator Author

@alismx Can you point to where in the script it clones dibbs-query-connector? I can't find it.
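
For reference, a minimal sketch of what the mount could look like if the dump is read from a cloned copy of dibbs-query-connector rather than from this repo. It assumes the db service uses the official postgres image, which runs any .sql files found in /docker-entrypoint-initdb.d the first time it initializes an empty data directory. The image tag and the host path are placeholders, not values taken from this PR.

```yaml
services:
  db:
    # Assumed image and tag; the PR's actual db image is not shown in this thread.
    image: postgres:16
    volumes:
      # Hypothetical host path: wherever provisioning places the cloned repo.
      # Mounted read-only so the init step cannot modify the dump.
      - /path/to/dibbs-query-connector/vs_dump.sql:/docker-entrypoint-initdb.d/vs_dump.sql:ro
```

With the short volume syntax, a missing host path is typically created as an empty directory rather than causing a failure, so a wrong mount source can go unnoticed until the value sets turn out to be missing.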

@@ -0,0 +1,49 @@
services:
  # PostgreSQL DB for custom query and value set storage
  db:
Collaborator

Is this required for the build to succeed? Is the database going to be bundled into the VM for partners to launch that way? My understanding was that we expect a partner to launch a database somewhere (e.g., RDS) and configure the app to connect to it.

Collaborator Author

The VM includes a database preloaded with all value sets, concepts, and foreign key mappings extracted from a fresh eRSD pull and processed through our creation functions. We anticipate that partners will eventually connect their own database.

Collaborator

I think it's worth discussing whether the database is fully necessary for this VM. We're trying to keep the appliance as lightweight as possible, and it has a tiny disk size of only 8 GB. If we expect a partner to connect their own DB, we run the risk of our bundled postgres instance becoming a maintenance and security issue by just hanging out. There's a chance it will be reintroduced by new compose files on a version update...would a partner want that if they've already cut it out of their build?
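
One way to keep the bundled database optional is sketched below using Compose profiles. This is only an illustration: the profile name bundled-db is hypothetical, and the image tag is assumed.

```yaml
services:
  db:
    # Assumed image and tag, for illustration only.
    image: postgres:16
    # Services with a profile start only when that profile is enabled.
    # "bundled-db" is a hypothetical profile name.
    profiles: ["bundled-db"]

  query-connector:
    # No profile, so this service always starts.
    image: ghcr.io/cdcgov/dibbs-query-connector/query-connector:latest
```

Under this layout, `docker compose up -d` would start only the app, while `docker compose --profile bundled-db up -d` would also start Postgres. A partner who has cut the database out would not get it back from an updated compose file unless they enable the profile. Any depends_on reference from the app to db would need revisiting under this approach.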

Member

I think for now, we should bundle it in for partners, but it's certainly possible they will want to use their own.

Collaborator

@alismx alismx Feb 28, 2025

I looked at the dump file, and it's a little over 8M, but I don't know how much storage the database will need or how large it will grow. I've included a screenshot of the VM as it started. We'll need to consider storage for this and any other data/logging that is held in the VM.

[Image: screenshot of the VM as it started]

Collaborator Author

@shanice-skylight shanice-skylight Mar 5, 2025

I'm not sure we can determine the final size of this database at the moment. Regarding the ERSD and UMLS codes, they are updated once or twice a year. @nickclyde I suggest treating this as a research or brainstorming topic, but for now, we could leave it as is. Here's the ticket for it.
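
On the logging side of the storage question, container log growth can be capped in the compose file regardless of how large the database ends up. A minimal sketch, assuming the default json-file logging driver; the size and file-count values are illustrative, not measured recommendations.

```yaml
services:
  query-connector:
    image: ghcr.io/cdcgov/dibbs-query-connector/query-connector:latest
    logging:
      driver: json-file
      options:
        # Rotate each log file at roughly 10 MB (illustrative value).
        max-size: "10m"
        # Keep at most three rotated files per container (illustrative value).
        max-file: "3"
```

With these values, each container's logs would stay at around 30 MB or less, which matters on an 8 GB disk. The same block could be repeated for the db and Portainer services.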

provisioner "shell" {
  only = ["qemu.iso"]
  scripts = [
    "scripts/provision.sh"
Collaborator

When I ran the build, it ran the provision script set up for the ecr-viewer.

Collaborator Author

Good catch, thanks.

Collaborator

@EmmanuelNwa247 EmmanuelNwa247 left a comment

You might need to change line 90 of your qc-ubuntu.pkr.hcl file from "scripts/provision.sh" to "scripts/qc-provision.sh" so that it references your new script.

Collaborator

@EmmanuelNwa247 EmmanuelNwa247 left a comment

LGTM!!!

Collaborator

@rin-skylight rin-skylight left a comment

This is a great start, but I would like to make sure we don't introduce duplicate code. I think this PR should integrate the variable-based changes that Alis demonstrated for the team in their most recent PR. If this one needs to wait for a bit, let's do that. Getting it done right is more important than speed.


  # Next.js app
  query-connector:
    image: ghcr.io/cdcgov/dibbs-query-connector/query-connector:latest
Collaborator

Rather than using latest, we will need to pin this to a specific version. This prevents the stack from auto-updating, and allows us to build specific versions.

Collaborator Author

Currently, we only tag the packages with the latest and main tags, so I added a ticket to update the workflow to publish incremental versions.
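
Once versioned tags are published, the pin could be expressed through Compose variable interpolation so that one compose file can build specific versions. A minimal sketch; QC_VERSION is a hypothetical variable name, and no versioned tags exist yet per the comment above.

```yaml
services:
  query-connector:
    # QC_VERSION is a hypothetical variable. The ":?" form makes Compose
    # stop with the given message when the variable is unset or empty,
    # so the stack never falls back to "latest" silently.
    image: ghcr.io/cdcgov/dibbs-query-connector/query-connector:${QC_VERSION:?set QC_VERSION to a published tag}
```

Until the workflow publishes versions, pinning by image digest (the `image@sha256:...` form) would be another way to stop the stack from auto-updating.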

@@ -0,0 +1,2 @@
NODE_ENV=production
DATABASE_URL=postgres://postgres:pw@db:5432/qc_db
Collaborator

We should revisit the DATABASE_URL value if we expect customers to attach their own DB.

Collaborator Author

Currently, it's an option for customers to attach their own DB, but it's not a requirement.
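
Since attaching an external DB is optional, one possible shape is to let the bundled connection string act as a default that a partner can override. A minimal sketch, assuming the value is set through the compose file's environment section rather than only through the env file:

```yaml
services:
  query-connector:
    environment:
      # Uses DATABASE_URL from the shell or project .env file when it is set;
      # otherwise falls back to the bundled db service shown in this PR.
      DATABASE_URL: "${DATABASE_URL:-postgres://postgres:pw@db:5432/qc_db}"
```

A partner pointing at their own database would then only need to set DATABASE_URL before running `docker compose up`, without editing the shipped files. Values under environment take precedence over those loaded through env_file, so the two would need to be kept consistent.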

Member

@nickclyde nickclyde left a comment

Thanks Shanice!!


@shanice-skylight shanice-skylight force-pushed the shanice/query-connector branch from 9ef386a to 8109c6a Compare March 5, 2025 00:58
5 participants