
Add staging environment #3666

Merged
merged 3 commits into from
Jun 12, 2024
Conversation

jdangerx
Member

Overview

Relates to catalyst-cooperative/pudl-usage-metrics#128

I wanted to set up nginx to get actual IPs from people.

But that seemed risky to do on the live production app, so I wanted a staging environment where we can actually test changes to the application setup before they go live on production.

And also, deploying the whole damn database is quite slow, so I wanted a nice way to deploy only a small SQLite file while I'm futzing with the infrastructure. I had been commenting things out but decided it was time to make a proper CLI option.

So basically it's a yak-shave. But not a super long one.

Testing

Deployed just ferc1/ferc2 xbrls to staging:

$ python publish.py --staging --only ferc1_xbrl.sqlite --only ferc2_xbrl.sqlite 

Also ran with the full database list, but passed --build-only since the staging environment isn't sized to handle all the DBs:

$ python publish.py --staging --build-only

And saw that the Dockerfile included the right databases:

ENV DATABASES '/data/pudl.sqlite /data/ferc1_dbf.sqlite /data/ferc1_xbrl.sqlite /data/ferc2_dbf.sqlite /data/ferc2_xbrl.sqlite /data/ferc60_dbf.sqlite /data/ferc60_xbrl.sqlite /data/ferc6_dbf.sqlite /data/ferc6_xbrl.sqlite /data/ferc714_xbrl.sqlite /data/censusdp1tract.sqlite'

So that seems fine!
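For context, a minimal sketch of how a repeatable `--only` option could narrow the database list before deploy (the function and variable names here are illustrative, not the PR's actual code):

```python
def select_databases(all_databases: list[str], only: tuple[str, ...]) -> list[str]:
    """Return the full list unless --only was passed at least once.

    `only` mirrors a repeatable click option (multiple=True); an empty
    tuple means "deploy everything".
    """
    if not only:
        return list(all_databases)
    # Preserve the original ordering while dropping unselected DBs.
    return [db for db in all_databases if db in only]
```

So `select_databases(dbs, ("ferc1_xbrl.sqlite", "ferc2_xbrl.sqlite"))` would keep just the two XBRL databases deployed above.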

To-do list


Nobody wants to accidentally deploy to prod.
@@ -4,7 +4,8 @@ set -eux
 shopt -s nullglob

 find /data/ -name '*.sqlite' -delete
+ls
Member Author

This is helpful for debugging and I figured I'd leave it in.

mv all_dbs.tar.zst /data
zstd -f -d /data/all_dbs.tar.zst -o /data/all_dbs.tar
tar -xf /data/all_dbs.tar --directory /data
-datasette serve --host 0.0.0.0 /data/pudl.sqlite /data/ferc*.sqlite /data/censusdp1tract.sqlite --cors --inspect-file inspect-data.json --metadata metadata.yml --setting sql_time_limit_ms 5000 --port $PORT
+datasette serve --host 0.0.0.0 ${DATABASES} --cors --inspect-file inspect-data.json --metadata metadata.yml --setting sql_time_limit_ms 5000 --port $PORT
Member Author

This lets us configure the databases & their ordering via env var in Python. Sort of janky, but 🤷
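A sketch of how that ENV line could be rendered from Python at build time (the function name and the `/data` prefix are assumptions, inferred from the Dockerfile output shown earlier):

```python
def databases_env_line(databases: list[str], data_dir: str = "/data") -> str:
    """Render a Dockerfile ENV line that run scripts expand as ${DATABASES}.

    The list order is preserved, which is what controls the order in
    which Datasette lists the databases.
    """
    paths = " ".join(f"{data_dir}/{db}" for db in databases)
    return f"ENV DATABASES '{paths}'"
```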

@@ -27,7 +27,8 @@

import click

-from pudl.helpers import check_tables_have_metadata, create_datasette_metadata_yaml
+from pudl.helpers import check_tables_have_metadata
 from pudl.metadata.classes import DatasetteMetadata
Member Author

@e-belfer Turns out the create_datasette_metadata_yaml function that was using this import was:

  • only used in one place
  • 3 lines long

So I decided to inline it in the one place it's used. One fewer circular import to deal with!

Member

There's a reference to it in test/integration/datasette_metadata_test.py (about how we're not using it there) which should get updated/removed.

Member Author

Ah, you mean in the comment? I can remove that :)

@@ -85,11 +90,17 @@ def inspect_data(datasets: list[str], pudl_output: Path) -> str:

 @click.command(context_settings={"help_option_names": ["-h", "--help"]})
 @click.option(
-    "--fly",
     "-f",
+    "--production",
Member Author

Changed --fly to --production now that we have multiple fly environments.

     "deploy",
     flag_value="fly",
     help="Deploy Datasette to fly.io.",
-    default=True,
Member Author

Got rid of the default here to make it harder to accidentally deploy to production.


processes = ["app"]

[[vm]]
memory = "1gb"
Member

I guess 1GB is enough since it's not going to load the whole SQLite DBs into memory, but each of the DBF- and XBRL-derived Form 1 DBs is close to 1GB (which is kind of wild, since one of them covers 27 years and the other covers 2 years).

Member Author

It's easy enough to change if necessary, but I figured most of the staging stuff is gonna be with small subsets of the data anyways.

Member Author

Slash, it totally worked for my weird little nginx journey today.

@jdangerx jdangerx enabled auto-merge June 12, 2024 20:49
@jdangerx jdangerx added this pull request to the merge queue Jun 12, 2024
Merged via the queue into main with commit 5d04952 Jun 12, 2024
12 checks passed
@jdangerx jdangerx deleted the 3664-fly-staging-environment branch June 12, 2024 22:14