diff --git a/paper.md b/paper.md
index 945caf5..4218e68 100644
--- a/paper.md
+++ b/paper.md
@@ -43,13 +43,14 @@ The purpose of `pvOps` is to support empirical evaluations of data collected in
 # Statement of Need
-Continued interest in PV deployment across the world has resulted in increased awareness of needs associated with managing reliability and performance of these systems during operation. Current open-source packages for PV analysis (see [openpvtools](https://openpvtools.readthedocs.io/en/latest/) for a list) focus on theoretical evaluations of solar power simulations (e.g., `pvlib`; [@holmgren2018pvlib]), data cleaning and feature development for production data (e.g., `pvanalytics`; [@perry2022pvanalytics]), specific use cases of empirical evaluations (e.g., `RdTools`; [@deceglie2018rdtools] and `Pecos`; [@klise2016performance] for degradation analysis), or analysis of electroluminescene images (e.g., `PVimage`; [@pierce2020identifying]). However, a general package that can support data-driven, exploratory evaluations of diverse field collected information is currently lacking. For example, a maintenance log that describes an inverter failure may be temporally correlated to a dip in production levels. Identifying these relationships improve understanding of the impacts of certain types of failures on a PV plant. To address this gap, we present `pvOps`, an open-source, Python package that can be used by researchers and industry analysts alike to evaluate and extract insights from different types of data routinely collected during PV field operations.
+Continued interest in PV deployment across the world has resulted in increased awareness of needs associated with managing reliability and performance of these systems during operation. Current open-source packages for PV analysis (see [openpvtools](https://openpvtools.readthedocs.io/en/latest/) for a list) focus on theoretical evaluations of solar power simulations (e.g., `pvlib`; [@holmgren2018pvlib]), data cleaning and feature development for production data (e.g., `pvanalytics`; [@perry2022pvanalytics]), specific use cases of empirical evaluations (e.g., `RdTools`; [@deceglie2018rdtools] and `Pecos`; [@klise2016performance] for degradation analysis), or analysis of electroluminescence images (e.g., `PVimage`; [@pierce2020identifying]); see XX for a list of currently available open-source PV packages. However, a general package that can support data-driven, exploratory evaluations of diverse field-collected information is currently lacking. For example, a maintenance log that describes an inverter failure may be temporally correlated with a dip in production levels. Identifying such relationships across different types of field data can improve understanding of the impacts of certain types of failures on a PV plant. To address this gap, we present `pvOps`, an open-source Python package that can be used by researchers and industry analysts alike to evaluate and extract insights from different types of data routinely collected during PV field operations.
 PV data collected in the field varies greatly in structure (i.e., timeseries and text records) and quality (i.e., completeness and consistency). The data available for analysis is frequently semi-structured. Furthermore, the level of detail collected between different owners/operators might vary. For example, some may capture a general start and end time for an associated event whereas others might include additional time details for different resolution activities. This diversity in data types and structures often leads to data being under-utilized due to the amount of manual processing required.
To address these issues, `pvOps` provides a suite of data processing, cleaning, and visualization methods to leverage insights across a broad range of data types, including operations and maintenance records, production timeseries, and IV curves. The functions within `pvOps` enable users to better parse available data to understand patterns in outages and production losses.
 # Package Overview
 The following table summarizes the four modules within `pvOps` by presenting: the type of data they analyze, example data features, and highlights of relevant functions.
+\textbf{Table 1. Summary of modules and functions within `pvOps`}
 Module | Type of data | Example data features | Highlights of functions
 ------- | ------ | --------- | -----------
 text | O&M records | *timestamps*, *issue description*, *issue classification* | fill data gaps in dates and categorical records, visualize word clusters and patterns over time
@@ -71,7 +72,7 @@ The `pvOps` functionality and documentation continues to be improved and updated
-KLB: Writing - Original Draft, Software - Software Development; TG: Conceptualization, Writing - Original Draft; MWH: Writing - Review & Editing, Software - Software Development; HM: Writing - Review & Editing, Software - Software Development; NDJ: Conceptualization, Funding Acquisition, Project Administration, Supervision, Writing - Review & Editing.
+KLB: Writing - Original Draft, Software - Software Development, Software - Testing; TG: Conceptualization, Writing - Original Draft, Software - Design; MWH: Writing - Review & Editing, Software - Software Development; HM: Writing - Review & Editing, Software - Software Development; NDJ: Conceptualization, Funding Acquisition, Project Administration, Supervision, Writing - Review & Editing.
 # Acknowledgements
 This material is supported by the U.S. Department of Energy, Office of Energy Efficiency and Renewable Energy - Solar Energy Technologies Office.
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.