Add staging environment #3666
Conversation
Nobody wants to accidentally deploy to prod.
@@ -4,7 +4,8 @@ set -eux
shopt -s nullglob

find /data/ -name '*.sqlite' -delete
ls
This is helpful for debugging and I figured I'd leave it in.
mv all_dbs.tar.zst /data
zstd -f -d /data/all_dbs.tar.zst -o /data/all_dbs.tar
tar -xf /data/all_dbs.tar --directory /data
datasette serve --host 0.0.0.0 /data/pudl.sqlite /data/ferc*.sqlite /data/censusdp1tract.sqlite --cors --inspect-file inspect-data.json --metadata metadata.yml --setting sql_time_limit_ms 5000 --port $PORT
datasette serve --host 0.0.0.0 ${DATABASES} --cors --inspect-file inspect-data.json --metadata metadata.yml --setting sql_time_limit_ms 5000 --port $PORT
This lets us configure the databases & their ordering via env var in Python. Sort of janky, but 🤷
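For illustration, a minimal sketch of what "configure the databases & their ordering via env var in Python" might look like on the deploy side. The function name, database names, and data directory here are assumptions, not the PR's actual code:

```python
import os

# Hypothetical helper: build the space-separated path list that the
# ${DATABASES} interpolation in the serve command above would consume.
# Order matters, since Datasette lists databases in argument order.
def build_databases_var(db_names: list[str], data_dir: str = "/data") -> str:
    """Join database file names into one shell-ready string of paths."""
    return " ".join(f"{data_dir}/{name}" for name in db_names)

databases = build_databases_var(["pudl.sqlite", "ferc1_xbrl.sqlite"])
os.environ["DATABASES"] = databases
print(databases)  # /data/pudl.sqlite /data/ferc1_xbrl.sqlite
```

Janky or not, keeping the list in Python means one source of truth for both the Dockerfile and the serve command.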
@@ -27,7 +27,8 @@

import click

from pudl.helpers import check_tables_have_metadata, create_datasette_metadata_yaml
from pudl.helpers import check_tables_have_metadata
from pudl.metadata.classes import DatasetteMetadata
@e-belfer Turns out the create_datasette_metadata_yaml function that was using this import was:
- only used in one place
- 3 lines long

So I decided to inline it in the one place it's used. One fewer circular import to deal with!
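A sketch of the pattern being described (the class here is a stand-in, not the actual pudl code): a tiny helper with exactly one caller can live at that call site, removing the import edge that fed the cycle.

```python
# Stand-in for pudl.metadata.classes.DatasetteMetadata -- the real
# class renders full Datasette metadata; this stub just shows shape.
class DatasetteMetadata:
    def to_yaml(self) -> str:
        return "databases: {}\n"

# Before (hypothetical): a helper in pudl.helpers, imported by the CLI,
# which itself imported the metadata classes -- an import cycle risk.
#
#   def create_datasette_metadata_yaml() -> str:
#       return DatasetteMetadata().to_yaml()
#
# After: the body is inlined at the single call site in the CLI.
metadata_yaml = DatasetteMetadata().to_yaml()
```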
There's a reference to it in test/integration/datasette_metadata_test.py (about how we're not using it there), which should get updated/removed.
Ah, you mean in the comment? I can remove that :)
@@ -85,11 +90,17 @@ def inspect_data(datasets: list[str], pudl_output: Path) -> str:

@click.command(context_settings={"help_option_names": ["-h", "--help"]})
@click.option(
    "--fly",
    "-f",
    "--production",
Changed --fly to --production now that we have multiple fly environments.
"deploy", | ||
flag_value="fly", | ||
help="Deploy Datasette to fly.io.", | ||
default=True, |
Got rid of the default here to make it harder to accidentally deploy to production.
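A minimal sketch of the resulting shape (option names and messages are assumed, not copied from the PR): two options sharing one `deploy` parameter via click's `flag_value`, with no default, so forgetting the flag fails loudly instead of silently deploying to production.

```python
import click

@click.command()
@click.option("--production", "deploy", flag_value="production",
              help="Deploy Datasette to the production fly.io app.")
@click.option("--staging", "deploy", flag_value="staging",
              help="Deploy Datasette to the staging fly.io app.")
def deploy_datasette(deploy):
    # With no default= on either flag, click leaves deploy as None
    # when the user passes neither option.
    if deploy is None:
        raise click.UsageError("Pick --production or --staging explicitly.")
    click.echo(f"Deploying to {deploy}")
```

The nice property is that the failure mode of "I forgot which environment I'm pointed at" becomes a usage error rather than a production deploy.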
processes = ["app"]

[[vm]]
memory = "1gb"
I guess 1GB is enough since it's not going to load the whole SQLite DBs into memory, but each of the DBF- and XBRL-derived Form 1 DBs is close to 1GB (which is kind of wild, since one of them covers 27 years and the other covers 2 years).
It's easy enough to change if necessary, but I figured most of the staging stuff is gonna be with small subsets of the data anyways.
Slash, it totally worked for my weird little nginx journey today.
Overview
Relates to catalyst-cooperative/pudl-usage-metrics#128
I wanted to set up nginx to get actual IPs from people.
But that seemed sort of risky to do on the live production app, so I wanted to get a staging environment set up so that we can actually test changes to the application setup before having them go live on production.
And also, deploying the whole damn database is quite slow, so I wanted a nice way to deploy only a small SQLite file while I'm futzing with the infrastructure. I had been commenting things out but decided it was time to make a proper CLI option.
So basically it's a yak-shave. But not a super long one.
Testing
Deployed just the ferc1/ferc2 XBRLs to staging.
Also ran with the full database list, but passed --build-only since the staging environment isn't sized to handle all the DBs, and saw that the Dockerfile included the right databases.
So that seems fine!
To-do list