How to efficiently extract the US-Wave time-series dataset for a specific spatiotemporal boundary #160

Open
danuriyanto opened this issue Jul 25, 2023 · 0 comments
Labels: bug (Something isn't working)

Bug Description
I want to extract time-series data of significant wave height (swh) for a specific area of the Atlantic region. What is the most efficient way to do this? I tried the rex WaveX extraction and got stuck partway through the loop with the traceback below. However, if I request an individual index on its own, the call passes and returns data.

Is there a method that achieves this? I am thinking of supplying a "box" spatial boundary together with a temporal boundary to extract all of the time series at once.
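
Here is a minimal sketch of what I have in mind, assuming the lat_lon coordinate array and get_gid_df call visible in the traceback can be combined this way (the box limits below are placeholders, and whether get_gid_df accepts a list of gids is an assumption):

# Hedged sketch: open one yearly file, select every grid point inside a
# lat/lon bounding box, and request all of those sites in a single call
# instead of one call per site.
import numpy as np
from rex import WaveX

wave_file_1979 = "/nrel/US_wave/Atlantic/Atlantic_wave_1979.h5"

# placeholder bounding box; replace with the actual spatial boundary
lat_min, lat_max = 41.0, 42.5
lon_min, lon_max = -71.5, -69.5

with WaveX(wave_file_1979, hsds=True) as f:
    coords = f.lat_lon  # (n_sites, 2) array of (lat, lon) pairs
    in_box = ((coords[:, 0] >= lat_min) & (coords[:, 0] <= lat_max)
              & (coords[:, 1] >= lon_min) & (coords[:, 1] <= lon_max))
    gids = list(np.where(in_box)[0])
    # assumption: get_gid_df also accepts a list of gids, not just a single gid
    swh_box = f.get_gid_df('significant_wave_height', gids)

That way each yearly file would be opened only once, which should also cut down the number of HSDS requests that seem to trigger the 503 responses.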

Full Traceback

ERROR:root:got <class 'requests.exceptions.RetryError'> exception: HTTPSConnectionPool(host='developer.nrel.gov', port=443): Max retries exceeded with url: /api/hsds/datasets/d-3e9a0496-75729c43-eaa5-750126-3f3071?domain=%2Fnrel%2FUS_wave%2FAtlantic%2FAtlantic_wave_1979.h5&api_key=rcufYNIUpUJPlUP5n637v0O03DKEljn5IfyabLFZ (Caused by ResponseError('too many 503 error responses'))
---------------------------------------------------------------------------
MaxRetryError                             Traceback (most recent call last)
File [c:\ProgramData\miniconda3\lib\site-packages\requests\adapters.py:489](file:///C:/ProgramData/miniconda3/lib/site-packages/requests/adapters.py:489), in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    488 if not chunked:
--> 489     resp = conn.urlopen(
    490         method=request.method,
    491         url=url,
    492         body=request.body,
    493         headers=request.headers,
    494         redirect=False,
    495         assert_same_host=False,
    496         preload_content=False,
    497         decode_content=False,
    498         retries=self.max_retries,
    499         timeout=timeout,
    500     )
    502 # Send the request.
    503 else:

File [c:\ProgramData\miniconda3\lib\site-packages\urllib3\connectionpool.py:878](file:///C:/ProgramData/miniconda3/lib/site-packages/urllib3/connectionpool.py:878), in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    877     log.debug("Retry: %s", url)
--> 878     return self.urlopen(
    879         method,
    880         url,
    881         body,
    882         headers,
    883         retries=retries,
    884         redirect=redirect,
    885         assert_same_host=assert_same_host,
    886         timeout=timeout,
    887         pool_timeout=pool_timeout,
    888         release_conn=release_conn,
    889         chunked=chunked,
    890         body_pos=body_pos,
    891         **response_kw
    892     )
    894 return response

File [c:\ProgramData\miniconda3\lib\site-packages\urllib3\connectionpool.py:878](file:///C:/ProgramData/miniconda3/lib/site-packages/urllib3/connectionpool.py:878), in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    877     log.debug("Retry: %s", url)
--> 878     return self.urlopen(
    879         method,
    880         url,
    881         body,
    882         headers,
    883         retries=retries,
    884         redirect=redirect,
    885         assert_same_host=assert_same_host,
    886         timeout=timeout,
    887         pool_timeout=pool_timeout,
    888         release_conn=release_conn,
    889         chunked=chunked,
    890         body_pos=body_pos,
    891         **response_kw
    892     )
    894 return response

    [... skipping similar frames: HTTPConnectionPool.urlopen at line 878 (7 times)]

File [c:\ProgramData\miniconda3\lib\site-packages\urllib3\connectionpool.py:878](file:///C:/ProgramData/miniconda3/lib/site-packages/urllib3/connectionpool.py:878), in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    877     log.debug("Retry: %s", url)
--> 878     return self.urlopen(
    879         method,
    880         url,
    881         body,
    882         headers,
    883         retries=retries,
    884         redirect=redirect,
    885         assert_same_host=assert_same_host,
    886         timeout=timeout,
    887         pool_timeout=pool_timeout,
    888         release_conn=release_conn,
    889         chunked=chunked,
    890         body_pos=body_pos,
    891         **response_kw
    892     )
    894 return response

File [c:\ProgramData\miniconda3\lib\site-packages\urllib3\connectionpool.py:868](file:///C:/ProgramData/miniconda3/lib/site-packages/urllib3/connectionpool.py:868), in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    867 try:
--> 868     retries = retries.increment(method, url, response=response, _pool=self)
    869 except MaxRetryError:

File [c:\ProgramData\miniconda3\lib\site-packages\urllib3\util\retry.py:592](file:///C:/ProgramData/miniconda3/lib/site-packages/urllib3/util/retry.py:592), in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
    591 if new_retry.is_exhausted():
--> 592     raise MaxRetryError(_pool, url, error or ResponseError(cause))
    594 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)

MaxRetryError: HTTPSConnectionPool(host='developer.nrel.gov', port=443): Max retries exceeded with url: /api/hsds/datasets/d-3e9a0496-75729c43-eaa5-750126-3f3071?domain=%2Fnrel%2FUS_wave%2FAtlantic%2FAtlantic_wave_1979.h5&api_key=rcufYNIUpUJPlUP5n637v0O03DKEljn5IfyabLFZ (Caused by ResponseError('too many 503 error responses'))

During handling of the above exception, another exception occurred:

RetryError                                Traceback (most recent call last)
File [c:\ProgramData\miniconda3\lib\site-packages\h5pyd\_hl\httpconn.py:438](file:///C:/ProgramData/miniconda3/lib/site-packages/h5pyd/_hl/httpconn.py:438), in HttpConn.GET(self, req, format, params, headers, use_cache)
    437 s = self.session
--> 438 rsp = s.get(
    439     self._endpoint + req,
    440     params=params,
    441     headers=headers,
    442     stream=True,
    443     timeout=self._timeout,
    444     verify=self.verifyCert(),
    445 )
    446 self.log.info("status: {}".format(rsp.status_code))

File [c:\ProgramData\miniconda3\lib\site-packages\requests\sessions.py:600](file:///C:/ProgramData/miniconda3/lib/site-packages/requests/sessions.py:600), in Session.get(self, url, **kwargs)
    599 kwargs.setdefault("allow_redirects", True)
--> 600 return self.request("GET", url, **kwargs)

File [c:\ProgramData\miniconda3\lib\site-packages\requests\sessions.py:587](file:///C:/ProgramData/miniconda3/lib/site-packages/requests/sessions.py:587), in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    586 send_kwargs.update(settings)
--> 587 resp = self.send(prep, **send_kwargs)
    589 return resp

File [c:\ProgramData\miniconda3\lib\site-packages\requests\sessions.py:701](file:///C:/ProgramData/miniconda3/lib/site-packages/requests/sessions.py:701), in Session.send(self, request, **kwargs)
    700 # Send the request
--> 701 r = adapter.send(request, **kwargs)
    703 # Total elapsed time of the request (approximately)

File [c:\ProgramData\miniconda3\lib\site-packages\requests\adapters.py:556](file:///C:/ProgramData/miniconda3/lib/site-packages/requests/adapters.py:556), in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    555 if isinstance(e.reason, ResponseError):
--> 556     raise RetryError(e, request=request)
    558 if isinstance(e.reason, _ProxyError):

RetryError: HTTPSConnectionPool(host='developer.nrel.gov', port=443): Max retries exceeded with url: /api/hsds/datasets/d-3e9a0496-75729c43-eaa5-750126-3f3071?domain=%2Fnrel%2FUS_wave%2FAtlantic%2FAtlantic_wave_1979.h5&api_key=rcufYNIUpUJPlUP5n637v0O03DKEljn5IfyabLFZ (Caused by ResponseError('too many 503 error responses'))

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
Cell In[3], line 5
      3 # for i in len(coord_tuples)
      4 for i in range(len(coord_MA)):
----> 5     with WaveX(wave_file, hsds=True) as f:
      6         lat_lon_swh = f.get_lat_lon_df('significant_wave_height', coord_MA[i])
      7     MA_swh[i] = lat_lon_swh 

Cell In[3], line 6
      4 for i in range(len(coord_MA)):
      5     with WaveX(wave_file, hsds=True) as f:
----> 6         lat_lon_swh = f.get_lat_lon_df('significant_wave_height', coord_MA[i])
      7     MA_swh[i] = lat_lon_swh 

File [c:\ProgramData\miniconda3\lib\site-packages\rex\resource_extraction\resource_extraction.py:788](file:///C:/ProgramData/miniconda3/lib/site-packages/rex/resource_extraction/resource_extraction.py:788), in ResourceX.get_lat_lon_df(self, ds_name, lat_lon, check_lat_lon)
    765 def get_lat_lon_df(self, ds_name, lat_lon, check_lat_lon=True):
    766     """
    767     Extract timeseries of site(s) nearest to given lat_lon(s) and return
    768     as a DataFrame
   (...)
    786         Time-series DataFrame for given site(s) and dataset
    787     """
--> 788     gid = self.lat_lon_gid(lat_lon, check_lat_lon=check_lat_lon)
    789     df = self.get_gid_df(ds_name, gid)
    791     return df

File [c:\ProgramData\miniconda3\lib\site-packages\rex\resource_extraction\resource_extraction.py:608](file:///C:/ProgramData/miniconda3/lib/site-packages/rex/resource_extraction/resource_extraction.py:608), in ResourceX.lat_lon_gid(self, lat_lon, check_lat_lon)
    605 dist, gids = self.tree.query(lat_lon)
    607 if check_lat_lon:
--> 608     self._check_lat_lon(lat_lon)
    609     dist_check = dist > self.distance_threshold
    610     if np.any(dist_check):

File [c:\ProgramData\miniconda3\lib\site-packages\rex\resource_extraction\resource_extraction.py:561](file:///C:/ProgramData/miniconda3/lib/site-packages/rex/resource_extraction/resource_extraction.py:561), in ResourceX._check_lat_lon(self, lat_lon)
    552 def _check_lat_lon(self, lat_lon):
    553     """
    554     Check lat lon coordinates against domain
    555 
   (...)
    559         Either a single (lat, lon) pair or series of (lat, lon) pairs
    560     """
--> 561     lat_min, lat_max = np.sort(self.lat_lon[:, 0])[[0, -1]]
    562     lon_min, lon_max = np.sort(self.lat_lon[:, 1])[[0, -1]]
    564     lat = lat_lon[:, 0]

Code Sample

# import
from rex import WaveX
import h5pyd
import pandas as pd
import numpy as np

# reading the coordinate tuple 
coord_MA = pd.read_csv('MA_grid_2km.csv')
coord_MA = [tuple(row) for row in coord_MA.to_records(index=False)]

# list of NREL Hindcast data by year
year = np.arange(1979, 2011, 1, dtype=int)
wave_file = []
for i in year:
    a = "/nrel/US_wave/Atlantic/Atlantic_wave_" + str(i) + ".h5"
    wave_file.append(a)

MA_swh = pd.DataFrame()

# loop over every (year, site) pair, re-opening the HSDS file for each request
for i in range(len(wave_file)):
    for j in range(len(coord_MA)):
        with WaveX(wave_file[i], hsds=True) as f:
            lat_lon_swh = f.get_lat_lon_df('significant_wave_height', coord_MA[j])
        MA_swh[i] = lat_lon_swh

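For comparison, here is a hedged restructuring of the loop above. The get_lat_lon_df docstring in the traceback says it extracts the "timeseries of site(s) nearest to given lat_lon(s)", so each yearly file might be opened once with all coordinates passed in a single call; how the per-year frames are combined afterwards is my own assumption:

# Hedged sketch reusing wave_file, coord_MA, np and pd from the sample above:
# one WaveX open and one get_lat_lon_df call per year, passing all
# (lat, lon) pairs at once as an (n, 2) array.
coords = np.asarray(coord_MA)

yearly_swh = []
for path in wave_file:
    with WaveX(path, hsds=True) as f:
        yearly_swh.append(f.get_lat_lon_df('significant_wave_height', coords))

# assumption: stacking the per-year DataFrames along the time axis
MA_swh_all = pd.concat(yearly_swh)
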
To Reproduce
Steps to reproduce the behavior:

  1. Run the original code sample above.

Expected behavior
Expected to obtain the time series for 547 locations from 01/01/1979 to 12/31/2010.

Screenshots
When i = 3, the error with the traceback shown above pops up. However, if I call this code on its own:

with WaveX(wave_file[1], hsds=True) as f:
    lat_lon_swh = f.get_lat_lon_df('significant_wave_height', coord_MA[3])

The system returns:
(screenshot of the returned significant_wave_height DataFrame)

System:

  • OS: Windows

Additional context
Here's the grid file read by the script.

I hope someone can help!
Thank you,
Danu

MA_grid_2km.csv
