Releases: catalyst-cooperative/pudl
PUDL 0.5.0
Update to include 2020 annual data
See the more extensive release notes in our documentation.
Merged Pull Requests
- make generation allocation output mirror the standard generation table. by @cmgosnell in #1134
- Dependency and data release script updates by @zaneselvans in #1135
- Dependencies by @zaneselvans in #1150
- End of sprint merge of dev by @zaneselvans in #1158
- Epic template. by @bendnorman in #1164
- Hourly state demand by @zaneselvans in #1175
- EIA860 2001-2003 by @bendnorman in #1122
- Redesign metadata and harvest process by @ezwelty in #806
- Basic epa cems output by @TrentonBush in #1227
- Map small gen pudl ids by @aesharpe in #1231
- Dev PR for sprint ending 2021-09-24 by @zaneselvans in #1228
- Build all generated documentation dynamically by @zaneselvans in #1235
- Update dependencies, mostly related to testing, plus sklearn 1.0. by @zaneselvans in #1236
- Eliminate null values in generation_eia923 primary key fields by @bendnorman in #1248
- Drop rows with null generator_id in ownership_eia860 by @zaneselvans in #1258
- Add FERC1 output table that combines key FERC1 subtables by @aesharpe in #1209
- Deduplicate and re-organize metadata from constants.py by @zaneselvans in #1230
- Fix utility_id_eia issues in ownership & plants tables by @zaneselvans in #1268
- remove the data package cruft by @cmgosnell in #1267
- Updated xlsx_maps for eia860 2020 data by @bendnorman in #1273
- 2020 ferc1 by @aesharpe in #1274
- Defer validation of PudlTabl datastore to eia861/ferc714 ETL methods by @zaneselvans in #1275
- 2020 Harvest and load by @bendnorman in #1277
- Crosswalk analysis by @TrentonBush in #1256
- Beginnings of a PUDL bibliography by @zaneselvans in #1294
- add plant_id_pudl to small generators field by @aesharpe in #1293
- Deduplicate natural key fields of generation_fuel_eia923 by @bendnorman in #1296
- Integrate 2020 data for ferc1, eia860, eia923 by @zaneselvans in #1297
- Respond to CG's PR comments. Mostly docs. by @zaneselvans in #1308
- EIA-861 FERC-714 2020 by @zaneselvans in #1309
- Boiler fuel duplicate aggregation by @TrentonBush in #1306
- Fix errors with EIA861 output tables by @aesharpe in #1312
- Add missing output tables to EIA861 by @aesharpe in #1313
- 2020 Data Integration by @cmgosnell in #1255
- Update to flake8 v4.0; always install pudl for Tox by @zaneselvans in #1322
- Use pydantic for ETL settings validation by @bendnorman in #1292
- Update generation_fuel_eia923 documentation with nuclear unit change. by @bendnorman in #1323
- fix pandas API deprecation (issue #1173) by @TrentonBush in #1332
- Static metadata tables and automatic recoding by @zaneselvans in #1272
- Validate v0.5.0 by @zaneselvans in #1345
- PUDL v0.5.0 release candidate by @zaneselvans in #1334
New Contributors
- @bendnorman made their first contribution in #1164 🎉
Full Changelog: v0.4.0...v0.5.0
PUDL 0.4.0
This is our first release in more than a year and a half, and it contains lots of new data and analyses (and breaking changes...) but it doesn't yet include 2020 datasets for FERC and EIA.
See the complete v0.4.0 release notes for details.
Merged Pull Requests
- Unified logic for excel extraction by @rousik in #566
- fuel cost output to ref 860 generators. by @cmgosnell in #574
- Ferc714 by @yashkumar1803 in #594
- Ei mcoe by @aesharpe in #592
- Transform function for distribution systems and other edits by @aesharpe in #643
- Add manually compiled balancing authority id fixes by @zaneselvans in #646
- Transform function for AMI EIA861 by @aesharpe in #647
- Transform function for EIA 861 Dynamic Pricing Table by @aesharpe in #649
- Normalize the Balancing Authority Table and add a BA Association Table by @zaneselvans in #651
- Transform func for Eia861 Green Pricing table by @aesharpe in #653
- Net metering table eia861 by @aesharpe in #671
- Service territories by @zaneselvans in #670
- Non net metering function eia861 by @aesharpe in #680
- Categorize eia codes with either Util or BA priority by @zaneselvans in #687
- Add a new FERC 714 Output Module by @zaneselvans in #699
- 635: Datastore passes travis tests by @ptvirgo in #701
- Operational data table eia861 by @aesharpe in #691
- Add limit_by_state option to utility territory generation by @zaneselvans in #707
- Simplify datapkg_to_sqlite script by @zaneselvans in #712
- Clobber datapackage bundles not single datapackages by @zaneselvans in #714
- Reliability and utility data eia861 by @aesharpe in #710
- Datastore improvements by @ptvirgo in #715
- Distributed generation eia861 by @aesharpe in #724
- Set up GitHub Actions to run Tox/PyTest by @zaneselvans in #727
- Restore utility_assn() and other code wiped out by PR 724 by @zaneselvans in #730
- Energy efficiency eia861 by @aesharpe in #731
- Demand mapping by @yashkumar1803 in #717
- Ferc714 by @ptvirgo in #733
- Demand side management eia861 by @aesharpe in #732
- Some tweaks to table columns and data types by @aesharpe in #743
- get_census2010_gdf uses datastore by @zaneselvans in #764
- Datastore data package validation and updated DOIs by @zaneselvans in #761
- More robust flake8 linting by @zaneselvans in #768
- Validate new dois by @cmgosnell in #773
- Draft of ferc1 + eia860 + eia923 data integration for 2019 by @zaneselvans in #788
- Merge Sprint25 into dev branch by @zaneselvans in #800
- Add DOIs for production archives on Zenodo by @zaneselvans in #804
- Zipcode fix by @aesharpe in #820
- Better help messages and default to verbose logging by @zaneselvans in #825
- Add docker build scripts by @rousik in #826
- Fix few issues surfaced in the previous PR by @rousik in #827
- Automate docker image builds by @rousik in #829
- Bump build-push-action to @v2 and fix arguments. by @rousik in #831
- Draft documentation framework for data sources by @aesharpe in #821
- Eia epa crosswalk by @aesharpe in #822
- Integrate EIA-860 2008 data by @aesharpe in #838
- Integrate EIA 860 M into ETL by @cmgosnell in #824
- Add basic Datasette metadata and deployment script by @zaneselvans in #841
- Notebook land: intro notebooks for CEMS and output tables by @aesharpe in #823
- add output and access notebooks by @cmgosnell in #844
- Allocate generation_fuel_eia923 table data to generators by @cmgosnell in #785
- Notebook land by @zaneselvans in #853
- Ensure deterministic checksums on csv.gz outputs by @rousik in #856
- Add output methods for all remaining EIA 861 tables. by @zaneselvans in #862
- EIA860 old years (through 2004) by @aesharpe in #849
- Add high-performance timeseries anomaly detection and imputation module by @ezwelty in #871
- Speed up FERC 714 hourly demand transform by @ezwelty in #873
- Always run interim ETL tests b/c they're fast now. by @zaneselvans in #874
- Alaska is a thing by @rousik in #876
- Specify min/max versions for all dependencies in setup.py by @zaneselvans in #875
- Fix broken links in README by @kyleries in #864
- Datastore refactoring by @rousik in #880
- Add unit test environment that runs quick tests under src/pudl by @rousik in #867
- Regex future warning by @rousik in #883
- Adjust FERC 714 service territories by using modified versions of EIA 861 tables by @ezwelty in #881
- Timeseries unittest by @zaneselvans in #885
- Bugfixes for states=[ALL] and SQLite DB clobber check by @zaneselvans in #890
- Consolidate interim ETL / output tests by @zaneselvans in #892
- Jupyterhub beta by @zaneselvans in #894
- Clean up PyTest config, coverage generation, unit tests by @zaneselvans in #896
- Implementation of DataFrameCollection by @rousik in #887
- Sprint29 by @zaneselvans in #897
- Dev by @zaneselvans in #898
- Pyarrow v3 by @zaneselvans in #912
- Eia860 validation by @aesharpe in #911
- Improvements to the DataFrameCollection by @rousik in #916
- pudl_datastore --list-partitions by @rousik in #925
- Pudl rmi by @zaneselvans in #926
- Pytest scripts by @zaneselvans in #913
- Sprint30 by @zaneselvans in #933
- Integrate EIA-860m through Nov. 2020 + fixed PUDL Plant IDs by @zaneselvans in #934
- Update PUDL Development Docs by @zaneselvans in #940
- Metadata docs by @aesharpe in #907
- Update transform documentation by @aesharpe in #939
- Convert Census DP1 to SQLite by @zaneselvans in #948
- Sprint31 by @zaneselvans in #951
- Databeta by @zaneselvans in #956
- Dev by @zaneselvans in #957
- Dev docs setup updates by @cmgosnell in https://github.com/catalyst-cooperative/pudl...
v0.3.2: Integration of EIA 860 data for 2009-2010
The primary changes in this release:
- The 2009-2010 data for EIA 860 have been integrated, including updates to the data validation test cases.
- Output tables are more uniform and less restrictive in what they include, no longer requiring PUDL Plant & Utility IDs in some tables.
- This release was used to compile v1.1.0 of the PUDL Data Release, which is archived at Zenodo under this DOI: https://doi.org/10.5281/zenodo.3672068
With this release, the EIA 860 & 923 data now (finally!) cover the same span of time. We do not anticipate integrating any older EIA 860 or 923 data at this time.
v0.3.1: Bug fixes required for PUDL data release
A couple of minor bugs were found in the preparation of the first PUDL data release:
- No maximum version of Python was being specified in setup.py. PUDL currently only works on Python 3.7, not 3.8.
- epacems_to_parquet conversion script was erroneously attempting to verify the availability of raw input data files, despite the fact that it now relies on the packaged post-ETL epacems data. Didn't catch this before since it was always being run in a context where the original data was lying around... but that's not the case when someone just downloads the released data packages and tries to load them.
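The missing upper bound on the Python version can be illustrated with a minimal sketch. It assumes the bound is declared through setuptools' `python_requires` argument. The specifier string, the `SETUP_KWARGS` dict, and the `python_is_supported` helper are illustrative assumptions, not taken from PUDL's actual setup.py.

```python
import sys

# Hypothetical fragment of the keyword arguments passed to setuptools.setup().
# The specifier adds an upper bound so that only Python 3.7.x is accepted.
SETUP_KWARGS = {
    "python_requires": ">=3.7,<3.8",
}


def python_is_supported(version_info, minimum=(3, 7), maximum=(3, 8)):
    """Return True if (major, minor) lies in the half-open range [minimum, maximum)."""
    return minimum <= tuple(version_info[:2]) < maximum


if __name__ == "__main__":
    # Report whether the running interpreter falls inside the assumed range.
    print(python_is_supported(sys.version_info))
```

With a specifier like this in place, pip would refuse to install the package on an interpreter outside the range instead of installing it and failing later.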
v0.3.0: 2020 Q1 PUDL release in support of data archiving
This release is mostly about getting the infrastructure in place to do regular data releases via Zenodo, and updating ETL with 2018 data.
Added lots of data validation / quality assurance test cases in anticipation of archiving data. See the pudl.validate module for more details.
New data since v0.2.0 of PUDL:
- EIA Form 860 for 2018
- EIA Form 923 for 2018
- FERC Form 1 for 1994-2003 and 2018 (select tables)
We removed the FERC Form 1 accumulated depreciation table from PUDL because it requires detailed row-mapping in order to be accurate across all the years. It and many other FERC tables will be integrated soon, using new row-mapping methods.
Lots of new plants and utilities integrated into the PUDL ID mapping process, for the earlier years (1994-2003). All years of FERC 1 data should be integrated for all future ferc1 tables.
Command line interfaces of some of the ETL scripts have changed, see their help messages for details.
v0.2.0: Data package based output without PostgreSQL
This is the first release of PUDL to generate data packages as the canonical output, rather than loading data into a local PostgreSQL database. The data packages can then be used to generate a local SQLite database, without relying on any software being installed outside of the Python requirements specified for the catalyst.coop package. This change will enable easier installation of PUDL, as well as archiving and bulk distribution of the data products in a platform independent format.
v0.1.0: Reference release of PUDL using PostgreSQL
This is the only release of PUDL that will be made that makes use of PostgreSQL. It is provided for reference, in case there are users relying on this setup who need access to a well defined release.
v0.1.0rc1: RC1 for PostgreSQL based legacy PUDL release
v0.1 of PUDL will be the only release we make of the PostgreSQL based system, mainly for archival / reference purposes. It is intended for users who need to install this version to support their existing systems while transitioning to the datapackage / SQLite version, which will be released within the next couple of days as v0.2.
v0.1.0a4: Fixed Windows console UTF-8 encoding issue.
Output from the pudl_setup script was generating an unprintable character on the Windows console, causing a UnicodeEncodeError, revealed by the conda-forge tests. This release fixes that output.
v0.1.0a3: Updated/simplified specification of dependencies.
The previous release used extras_require in setup.py to specify a couple of packages required for dealing with parquet files. This prevented the parquet functionality from being available after a "vanilla" pip install of the package, which was confusing for test users. The parquet packages have now been added to install_requires, and compilation issues were worked around by making their installation conditional in the readthedocs build.
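The difference between the two settings can be sketched as follows. This is a simplified illustration, not PUDL's actual setup.py: the package names are placeholders (the release note does not name the parquet packages), and `resolved_requirements` is a hypothetical helper that only approximates how pip combines `install_requires` with any requested extras.

```python
# Placeholder package names; the real parquet dependencies are not named above.
SETUP_KWARGS_BEFORE = {
    "install_requires": ["core-package"],
    "extras_require": {"parquet": ["parquet-package-a", "parquet-package-b"]},
}
SETUP_KWARGS_AFTER = {
    "install_requires": ["core-package", "parquet-package-a", "parquet-package-b"],
    "extras_require": {},
}


def resolved_requirements(setup_kwargs, extras=()):
    """Approximate the requirement list pip is given for the requested extras."""
    requirements = list(setup_kwargs.get("install_requires", []))
    for extra in extras:
        # Extras contribute their packages only when explicitly requested.
        requirements.extend(setup_kwargs.get("extras_require", {}).get(extra, []))
    return requirements
```

Under the earlier layout, a plain install resolves to the core package only, and the parquet packages appear only when the extra is requested. Under the later layout they are always included.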