Optimize wind turbulence calculation #13

Merged
merged 13 commits into from
Sep 12, 2024


TannazH
Collaborator

@TannazH TannazH commented Aug 30, 2024

Description

This pull request optimizes the wind turbulence calculation in the Meteorology class, speeding it up by a factor of 100. The gain matters most when the meteorology dataset is large.

The function calculate_wind_turbulence_horizontal in the Meteorology class no longer calls rolling.apply(scipy.stats.circstd) over the chosen window. Instead, it evaluates the circstd equation directly, which only requires rolling.mean of the sine and cosine of the wind direction and is 100 times faster. rolling.apply behaves like a loop over the data, whereas built-in rolling methods such as rolling.mean are optimized for speed.
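
For reference, the quantity computed by the optimized approach can be written out as follows. This is a sketch of the equation as it appears in the benchmark code of this pull request; the symbols are notation introduced here for illustration, with \theta_i the wind directions in degrees inside a window of N samples.

```latex
\bar{S} = \frac{1}{N}\sum_{i=1}^{N} \sin\!\left(\frac{\pi\,\theta_i}{180}\right), \qquad
\bar{C} = \frac{1}{N}\sum_{i=1}^{N} \cos\!\left(\frac{\pi\,\theta_i}{180}\right), \qquad
R = \sqrt{\bar{S}^{2} + \bar{C}^{2}}, \qquad
\sigma = \frac{180}{\pi}\sqrt{-2\ln R}
```

Because \bar{S} and \bar{C} are plain window means, each can be obtained with a single rolling.mean call.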

The following code illustrates the time efficiency of the proposed method by comparing how long it takes to calculate the horizontal wind turbulence with circstd and with the optimized method.

import timeit

import numpy as np
import pandas as pd
from scipy.stats import circstd

n = 10000
window = "10s"
# Synthetic wind directions in degrees, sampled once per second
wind_direction = pd.Series(np.random.uniform(0, 360, n))
wind_direction.index = pd.date_range('2018-01-01', periods=n, freq='1s')

# Calculate the wind turbulence using the circstd function
start_time_circstd = timeit.default_timer()
wind_turbulence_circstd = wind_direction.rolling(window=window, center=True, min_periods=3).apply(
    circstd, kwargs={"low": 0, "high": 360})
time_elapsed_circstd = timeit.default_timer() - start_time_circstd

# Calculate the wind turbulence using the proposed method:
# rolling means of sine and cosine, then the circular standard deviation equation
start_time_optimized = timeit.default_timer()
sin_rolling = (np.sin(wind_direction * np.pi / 180)).rolling(window=window, center=True, min_periods=3).mean()
cos_rolling = (np.cos(wind_direction * np.pi / 180)).rolling(window=window, center=True, min_periods=3).mean()
wind_turbulence_optimized = np.sqrt(-2 * np.log((sin_rolling**2 + cos_rolling**2) ** 0.5)) * 180 / np.pi
time_elapsed_optimized = timeit.default_timer() - start_time_optimized

print("Time taken to calculate wind turbulence using circstd function is {} seconds.".format(time_elapsed_circstd))
print("Time taken to calculate wind turbulence using the optimized approach is {} seconds.".format(time_elapsed_optimized))

# Check that the optimized approach is at least 100 times faster than the circstd function
print("Calculating wind turbulence using the optimized approach is 100 times faster than circstd function: {}".format(time_elapsed_optimized < time_elapsed_circstd / 100))

# Check that the mean wind turbulence from the optimized approach matches the mean from the circstd function
mean_turbulence = np.mean(wind_turbulence_optimized)
mean_turbulence_circstd = np.mean(wind_turbulence_circstd)
print("Optimized approach for calculating the wind turbulence gives identical results to those from the circstd: {}".format(np.isclose(mean_turbulence, mean_turbulence_circstd)))

Running the above code gives the following results:

Time taken to calculate wind turbulence using circstd function is 1.6943280000014056 seconds.
Time taken to calculate wind turbulence using the optimized approach is 0.0017694999987725168 seconds.
Calculating wind turbulence using the optimized approach is 100 times faster than circstd function: True
Optimized approach for calculating the wind turbulence gives identical results to those from the circstd: True

Fixes # (issue)
The wind turbulence calculation is slow when the meteorology dataset is large.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

The code above was used to verify that the optimized approach for calculating the wind turbulence gives the same results as the circstd function (scipy.stats.circstd).
It was also used to verify that the proposed change speeds up the calculation of the wind turbulence by a factor of 100.
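
A stricter check than comparing the two means is to compare the two series element by element. The sketch below does this under stated assumptions: the helper name rolling_circstd_degrees is hypothetical and is not the pyelq implementation, the reference computation passes raw=True to rolling.apply, and capping the resultant length at 1 is an added guard against floating-point round-off rather than something taken from this pull request.

```python
import numpy as np
import pandas as pd
from scipy.stats import circstd


def rolling_circstd_degrees(direction, window, min_periods=3):
    """Rolling circular standard deviation (degrees) from rolling means.

    Hypothetical helper for illustration; not the pyelq implementation.
    """
    radians = np.deg2rad(direction)
    sin_mean = np.sin(radians).rolling(window=window, center=True, min_periods=min_periods).mean()
    cos_mean = np.cos(radians).rolling(window=window, center=True, min_periods=min_periods).mean()
    # Mean resultant length, capped at 1 (assumption: round-off could push it
    # slightly above 1, which would make the square root below return NaN).
    resultant = np.minimum(np.sqrt(sin_mean**2 + cos_mean**2), 1.0)
    return np.rad2deg(np.sqrt(-2 * np.log(resultant)))


# Synthetic wind directions in degrees, sampled once per second
rng = np.random.default_rng(0)
n = 500
wind_direction = pd.Series(
    rng.uniform(0, 360, n),
    index=pd.date_range("2018-01-01", periods=n, freq="1s"),
)

optimized = rolling_circstd_degrees(wind_direction, window="10s")
reference = wind_direction.rolling(window="10s", center=True, min_periods=3).apply(
    circstd, raw=True, kwargs={"low": 0, "high": 360}
)

# Element-wise comparison rather than a comparison of the two means
all_close = np.allclose(optimized, reference, equal_nan=True)
print(all_close)
```

Comparing element by element catches discrepancies in individual windows that could cancel out when only the means are compared.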

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

TannazH and others added 4 commits August 27, 2024 14:35
…ly efficient instead of using circstd and modify testing to check whether the results of both methods are the same.
…urblance based on the new and old approaches
Signed-off-by: code_reformat <>
@TannazH TannazH requested a review from bvandekerkhof August 30, 2024 11:54
Collaborator

@bvandekerkhof bvandekerkhof left a comment

Looks good, thanks for implementing.

@bvandekerkhof bvandekerkhof merged commit 5c1af2b into main Sep 12, 2024
10 checks passed
@bvandekerkhof bvandekerkhof deleted the optimize_circstd_implementation branch September 12, 2024 14:55
2 participants