Python package implementing a data pipeline to process high-resolution power meter data.

The `demand_acep` package implements a data pipeline that performs three tasks: Extraction, Transformation, and Loading (ETL).
- Extract: The high-resolution (~7 Hz) power meter data for each meter and each channel is read from the NetCDF files into a pandas dataframe.
- Transform: The data is down-sampled to a lower resolution (1 minute by default), missing data is filled, and the individual channel data is combined with the other channels to create one down-sampled, filled dataframe per day per meter, which is then exported to a CSV file. So, for each day of data, we have one CSV file per meter containing the data for all channels at the lower resolution.
- Load: All the down-sampled data is loaded (copied, not inserted, for speed) into the time-series database TimescaleDB. The data was then copied back from the database to perform data imputation for the missing days and re-copied to create the complete dataset. The ETL process is summarised in the poster shown below.
All or some of the steps can be re-used or repeated as desired. Further analysis using the complete data was performed and the results are presented in the documentation.
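The sketch below illustrates the Extract, Transform, and Load ideas end to end. It is a minimal illustration under assumptions, not the package's actual API: the use of `xarray` and `psycopg2`, the function names, file names, channel names, and the table name are all hypothetical placeholders; the real implementation lives in the `demand_acep` package.

```python
# Sketch of the ETL steps; all names and paths here are hypothetical.
import pandas as pd
import xarray as xr

def extract_channel(netcdf_path):
    """Extract: read one channel's high-resolution (~7 Hz) data into a dataframe."""
    ds = xr.open_dataset(netcdf_path)
    # Assumes the time coordinate becomes the dataframe index.
    return ds.to_dataframe()

def transform(channel_frames, rule="1min"):
    """Transform: down-sample each channel, fill gaps, and combine channels."""
    resampled = {
        name: frame.resample(rule).mean().interpolate()  # fill missing intervals
        for name, frame in channel_frames.items()
    }
    # One combined dataframe per meter per day.
    return pd.concat(resampled, axis=1)

# Hypothetical usage for one meter and one day:
frames = {
    "PhaseA_W": extract_channel("meter1_PhaseA_W_2019-06-01.nc"),
    "PhaseB_W": extract_channel("meter1_PhaseB_W_2019-06-01.nc"),
}
daily = transform(frames)
daily.to_csv("meter1_2019-06-01.csv")  # one CSV per meter per day

# Load: COPY the CSV into TimescaleDB (copied, not inserted, for speed).
# Connection string and table name are placeholders.
import psycopg2

conn = psycopg2.connect("dbname=acep user=postgres")
with conn, conn.cursor() as cur, open("meter1_2019-06-01.csv") as f:
    cur.copy_expert("COPY meter1 FROM STDIN WITH CSV HEADER", f)
```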
`pip install demand-acep`
This package has only been tested on Linux.
Usage examples and further analysis can be seen in the `scripts` folder.
- Extract data to csv: This file shows how to extract data for a day to CSV. It reads the data for a day, performs the transformation, and creates CSVs for each meter as described before.
- Extract data for multiple days in parallel: This file shows how to use the `multiprocessing` library in Python to extract data for multiple days in parallel (a minimal sketch follows this list). The more cores the system has, the faster the total data can be extracted.
- Copy data in parallel to TimescaleDB database: This Jupyter notebook shows how to copy the CSV files to the database in parallel.
- Perform data imputation for long timescales (days to months): This Jupyter notebook shows how to perform data imputation over long timescales, essentially when the data was not downloaded for a particular day or month.
- Read from database to pandas dataframe: This Jupyter notebook shows how to read the data from a Postgres (TimescaleDB) database into a dataframe.
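As referenced in the list above, here is a minimal sketch of the parallel-extraction idea, assuming a hypothetical per-day worker `extract_and_transform_day`; the actual script in the `scripts` folder is the authoritative version.

```python
# Sketch of parallel per-day extraction; the worker below is a placeholder.
from multiprocessing import Pool

import pandas as pd

def extract_and_transform_day(day):
    """Hypothetical worker: extract, transform, and write per-meter CSVs for one day."""
    # ... read the NetCDF files for `day`, down-sample, fill, and write CSVs ...
    return day

if __name__ == "__main__":
    days = list(pd.date_range("2019-06-01", "2019-06-30", freq="D"))
    # Each day is independent, so extraction scales with the number of cores;
    # Pool() defaults to one worker process per core.
    with Pool() as pool:
        pool.map(extract_and_transform_day, days)
```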
The module supports TDD and includes a setup for an automatic test runner. To begin development, install Python 3.6+ (e.g. using Anaconda) and Node.js for your platform, and then do the following:
- Clone the repository on your machine using `git clone https://github.com/demand-consults/demand_acep`. This will create a copy of this repository on your machine.
- Go to the repository folder using `cd demand_acep`.
- Get the Python dependencies using `pip install -r requirements.txt`.
- Get the required Node modules using `npm install`. Install Grunt globally using `npm install -g grunt`. This step and Node.js are only required for automated test running.
- In a dedicated terminal window, run `grunt` on the command line. This will watch for changes to any of the `.py` files in the `demand_acep` folder and run the tests using `pytest`.
- Make tests for the functionality you plan to implement in the `tests` folder and add the data needed for tests to the `data` folder located in `demand_acep/data` (a sketch of such a test follows this list).
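As noted in the last step above, tests live in the `tests` folder and are run via `pytest`. Here is a minimal sketch of what such a test could look like, using a hypothetical `downsample` helper in place of the package's real functions:

```python
# Sketch of a pytest-style test; `downsample` is a hypothetical stand-in.
import pandas as pd

def downsample(df, rule="1min"):
    """Hypothetical stand-in for the package's down-sampling routine."""
    return df.resample(rule).mean()

def test_downsample_produces_one_minute_index():
    # ~7 Hz input: one sample roughly every 150 ms for a little over a minute.
    idx = pd.date_range("2019-06-01", periods=7 * 60, freq="150ms")
    df = pd.DataFrame({"PhaseA_W": 1.0}, index=idx)
    out = downsample(df)
    # Output rows should be one minute apart, and the mean of a constant is constant.
    assert (out.index[1] - out.index[0]) == pd.Timedelta(minutes=1)
    assert (out["PhaseA_W"] == 1.0).all()
```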
The `doc` folder contains the documentation related to the package. To make changes to the documentation, the following workflow is suggested:
- From the root directory of the package, i.e. here, run `grunt doc`. This command watches for changes in the `.rst` files in the `doc` folder and runs `make html`. This has the effect of building your documentation on each save.
- To view the changes, it is suggested to run a local webserver. This can be done by first installing a webserver with `pip install sauth`, and then running the webserver like so: `sauth <username> <password> localhost <port>` from the `doc` folder in a separate terminal window. Specify a username, a password, and a port number, for example 8000. Then navigate to http://localhost:8000 in your web browser and enter the username and password you set while running `sauth`. The live changes to the documentation can be viewed by navigating to the `html` folder in the `build` directory located at `doc/build/html`.
- As you make changes to the documentation in the `.rst` files and re-save them, `grunt doc` automatically updates the `html` folder, and the changes can be viewed in the browser by refreshing it.
To support this project, a companion R package creates plots of peak demand power consumption by day, weekday, month, and year for several meters. These plots feed into benefit-cost analyses and cost-saving plots. In addition, the package forecasts peak power demand using ARIMA on a daily and monthly basis. Correlation and a simple regression are also included.
To use this package, follow these steps:
- Install `devtools`: `install.packages("devtools")`
- Load the package: `library(devtools)`
- Install this package, `demand`: `install_github("reconjohn/demand")`
- Load the package: `library(demand)`
Now you are all set!
Brief description of demand charge using the R package `demand`:
Using the R package `demand`, peak demand, correlation, forecasts, and demand charge were plotted. Refer to the following for more details and a demonstration of code from the `demand` package and its results.
- 0.0.1
  - Released to ACEP on 06/21/2019.
Chintan Pathak, Yohan Min, Atinuke Ademola-Idowu - [email protected], [email protected], [email protected].
Distributed under the MIT license. See `LICENSE` for more information.
- Fork it (https://github.com/demand-consults/demand_acep/fork)
- Create your feature branch (`git checkout -b feature/fooBar`)
- Commit your changes (`git commit -am 'Add some fooBar'`)
- Push to the branch (`git push origin feature/fooBar`)
- Create a new Pull Request