PUDL v2024.5.0
What's Changed
New Data
- Update EIA Bulk Electricity archive DOI by @zaneselvans in #3353
- 3313 Q4 2023 eia860 update by @aesharpe in #3367
- Add Q4 CEMS data by @e-belfer in #3379
- Extract raw 923 Schedule 8 A-D by @e-belfer in #3373
- Integrate monthly EIA923 data through November 2023 by @zaneselvans in #3422
- Add EIA Thermoelectric Cooling Water dataset DOI to datastore. by @zaneselvans in #3457
- Transform EIA860 and EIA923 Cooling System Tables by @aesharpe in #3405
- Add manual GridPath RA Toolkit renewable profile data source. by @zaneselvans in #3489
- eia860 solar: extract by @cmgosnell in #3482
- Extract EIA860 Energy Storage tables by @aesharpe in #3488
- NREL ATB extraction by @cmgosnell in #3498
- Extract EIA 930 data, refactor extractors to handle different date partitions by @e-belfer in #3497
- Extract EIA923 energy storage table by @aesharpe in #3516
- Transform EIA860 Wind by @cmgosnell in #3522
- Transform and harvest eia860 solar table by @cmgosnell in #3524
- WIP: GridPath RA Toolkit wind and solar generation profiles by @zaneselvans in #3514
- Transform and harvest the eia860 Energy Storage table by @aesharpe in #3526
- EIA 923 energy storage transform by @aesharpe in #3546
- Extract AEO Table 54, with bonus 13/15/20. by @jdangerx in #3538
- Transform NREL ATB by @cmgosnell in #3570
- EIA-930 initial transform by @zaneselvans in #3584
- Extract Net Summer Electricity Generating Capacity from AEO Table 54 by @jdangerx in #3582
- Update EIA Bulk Electricity archive/DOI. by @zaneselvans in #3615
- Add electric sales transformation. by @jdangerx in #3613
- Add EPA CEMS 2024Q1 by @cmgosnell in #3624
- Q1 2024 eia860m eia923 by @aesharpe in #3625
Other Changes
- Fix (more) v2024.02.03 release issues by @zaneselvans in #3346
- Output Parquet files as well as SQLite in PUDL ETL by @zschira in #3296
- Split monolithic ferc_to_sqlite ops into per-dataset pieces by @rousik in #3098
- Add a simple test coverage check. by @zaneselvans in #3352
- Add a simple pytest coverage check on workflow_dispatch or merge queue by @zaneselvans in #3371
- Provide CodeCov token in pytest workflow. by @zaneselvans in #3374
- Update docs + add release template by @jdangerx in #3361
- Stop using live DB in unit tests!! by @jdangerx in #3377
- Add sec10k metadata to sources by @zschira in #3378
- Force --no-cov in nightly build by @jdangerx in #3382
- Use context managers for opening zipfiles by @bendnorman in #3369
- Update expected row count for EIA tables post 860m quarterly update by @aesharpe in #3380
- Skip batch job if build was skipped as a whole. by @jdangerx in #3390
- Update nightly build script to distribute parquet by @zaneselvans in #3399
- Make an EIA860m Changelog table by @cmgosnell in #3331
- Parameterize adding a column in the FERC1 transform & ensure `_correction` records end up in the calculation component table by @cmgosnell in #3409
- Simplify pytest-cov configuration. by @jdangerx in #3391
- Prototype dagster-pandera integration by @jdangerx in #3282
- Fix small plants input table to FERC all plants table by @katie-lamb in #3415
- Standardize process for merging tagged commits into persistent branches automatically by @zaneselvans in #3347
- Restore individual FERC 1 plant output tables. by @zaneselvans in #3417
- Experiment tracking by @zschira in #3289
- Address loose ends in versioned release mechanics by @zaneselvans in #3421
- Close out release notes for PUDL v2024.2.6 by @zaneselvans in #3427
- Fix minor issues that arose in v2024.2.6 release by @zaneselvans in #3432
- Harvest generator operating dates when they're within a year of one another by @e-belfer in #3419
- Add RMI beta access to parquet.catalyst.coop by @jdangerx in #3434
- Add new citations of Catalyst / PUDL by @zaneselvans in #3435
- Add BA codes and EIA sector IDs to EIA-860M changelog table by @zaneselvans in #3442
- Very minor but widespread formatting changes from ruff 0.3.0 by @zaneselvans in #3445
- Get multiple years of EIA 176/191/757A CSV data by @davidmudrauskas in #3402
- Delete unused try/except Excel read-in method in `pudl.extract.excel` by @e-belfer in #3454
- Update pull_request_template.md to improve full ETL instructions by @e-belfer in #3446
- Fix broken links and rendering failure in PR template by @e-belfer in #3458
- Add metadata for ATB, EIA 930 and AEO data by @e-belfer in #3474
- Add PUDL citation for Grid Strategies load growth report. by @zaneselvans in #3483
- Clean EIA 860 and 923 FGD operation and maintenance data by @e-belfer in #3403
- Fix nightly build FK failure by @e-belfer in #3491
- Add logline that tells us more about BadZipFile. by @jdangerx in #3493
- Add total -> subtotal calculation correction & fix hard-coded plant-in-service table name by @cmgosnell in #3450
- Fix indent error in nightly builds by @e-belfer in #3521
- Add two new correction records into plant_in_service table by @cmgosnell in #3525
- Ferc1 rate base tag updates by @cmgosnell in #3517
- Schema cleanup by @zaneselvans in #3529
- Refactor `etl/__init__.py` to make adding new modules easier. by @jdangerx in #3539
- Attempt to limit `_out_ferc714__hourly_demand_matrix` concurrency by @bendnorman in #3541
- Manage concurrency of high-memory processes by @zaneselvans in #3543
- Tag additional assets as high memory usage by @zaneselvans in #3548
- Rename BA & Utility service territory tables to use conventions by @zaneselvans in #3552
- Pin ferc-xbrl-extractor<1.4 to facilitate frictionless v5 update by @zaneselvans in #3566
- Draft of package-level field encoding, applied to EIA by @zaneselvans in #3558
- Get last non-null value instead of latest XBRL filing. by @jdangerx in #3545
- Update expected row counts for FERC 1 tables by @zaneselvans in #3574
- Create beta access SA's for gridpath and zerolab. by @jdangerx in #3577
- Allow beta service accounts to access Parquet bucket by @jdangerx in #3586
- Speed up nb-output-clear step in pre-commit by @jdangerx in #3591
- Enumerate all AEO table 54 schemas. by @jdangerx in #3588
- Fix quoting in hourly parquet deployment command by @zaneselvans in #3602
- Remove unused resource keys from asset definitions by @zaneselvans in #3603
- Stop ignoring test directory passed to pytest. by @jdangerx in #3610
- Refactor EIA AEO totals checks. by @jdangerx in #3606
- Clean up a couple warnings and remove obsolete materialize script. by @jdangerx in #3608
- End use sectors generation by fuel type. by @jdangerx in #3598
- Always clobber existing outputs in FERC to SQLite conversions by @zaneselvans in #3622
- Update EIA AEO table description units to be consistent with columns. by @zaneselvans in #3626
- NREL ATB - Stop dropping duplicate values before unstacking by @cmgosnell in #3630
- Map new EIA plants and utilities with PUDL IDs for 2024Q1 update by @cmgosnell in #3636
- Break down total `utility_type` and partial `in_rate_base` in rate base table by @cmgosnell in #3532
- Update expected MCOE row counts by @zaneselvans in #3638
- Add template that includes overview/success criteria/tasks by @jdangerx in #3640
- Publish FERC1 Rate Base Table by @cmgosnell in #3641
- Rate base category tweaks by @aesharpe in #3647
- Organize the large new data section of release notes by @zaneselvans in #3652
Full Changelog: v2024.02.04...v2024.5.0