SWC gaps in merged data #170

Open
jasonebox opened this issue Oct 11, 2024 · 2 comments

@jasonebox

I'm responding to a request from Manfred Stober asking for PDDs (positive degree days) from Swiss Camp.

Assuming the following is merged data
https://thredds.geus.dk/thredds/catalog/aws/l3sites/netcdf/hour/catalog.html?dataset=aws/l3sites/netcdf/hour/SWC_hour.nc

There is a mostly fillable gap...

It would be convenient to include SWC_O with SWC to have data continuity for air temperature. An intercomparison I made suggests the following offset between SWC and SWC_O (SWC_O minus SWC) for 4629 hours of overlap from 2022-08-02 to 2023-02-14:

mean difference -0.20
median difference -0.31 * see below

image

*here is the comparison code

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
from numpy.polynomial.polynomial import polyfit
import xarray as xr


def read_site(site):
    # fn='/Users/jason/0_dat/AWS/1-level/month/'+site+'_month_v04.csv'
    # fn='/Users/jason/0_dat/AWS/1-level/day/'+site+'_day_v04.csv'
    # df=pd.read_csv(fn)
    # df.index = pd.to_datetime(df.time)
    timeframe='hour'
    # alternative URL, previously assigned and then overwritten by the line below:
    # url=f'https://thredds.geus.dk/thredds/dodsC/aws/l2stations/netcdf/{timeframe}/{site}_{timeframe}.nc' # merged
    url=f'https://thredds.geus.dk/thredds/dodsC/aws_l3_station_netcdf/level_3/{site}/{site}_{timeframe}.nc'
    # opens lazily over OPeNDAP; nothing is downloaded until .load()
    ds_lazy=xr.open_dataset(url)

    # to load a single year only:
    # yy=2024
    # ds_loaded = ds_lazy.sel(time=str(yy)).load()

    # load the full record
    ds_loaded = ds_lazy.load()

    df = ds_loaded.to_dataframe()
    df['time']=df.index
    return df

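SWC_L=read_site('SWC')
SWC_U=read_site('SWC_O')

# inner join on the time index; shared column names get the suffixes _x (SWC) and _y (SWC_O)
df=pd.merge(SWC_L,SWC_U, how='inner', left_index=True, right_index=True)
#%%
fig, ax = plt.subplots(figsize=(10,10))
x=df.t_u_x ; y=df.t_u_y
# keep only hours where both stations report a finite air temperature
v=np.isfinite(x)&np.isfinite(y)
x=x[v]
y=y[v]
# x=df.gps_alt_x ; y=df.gps_alt_y
plt.scatter(x,y)
# linear fit y = b + m*x (this polyfit returns coefficients from lowest degree up)
b, m = polyfit(x, y, 1)
print(m,b)
# differences are SWC_O minus SWC
ME=np.nanmean(y)-np.nanmean(x)
print("mean difference %.2f" %ME)
# difference of the two medians (not the median of the hourly differences)
MED=np.nanmedian(y)-np.nanmedian(x)
print("median difference %.2f" %MED)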
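
The median printed above is the difference between the two station medians. A possible alternative is to take statistics of the hourly paired differences directly, which can give a different median. The sketch below uses small synthetic series in place of the station data; `paired_offset` and the sample values are illustrative assumptions, not part of the original comparison.

```python
import numpy as np
import pandas as pd

# Synthetic hourly air temperatures standing in for the two stations (illustrative values only)
time = pd.date_range("2022-08-02", periods=6, freq="h")
t_swc = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0, 6.0], index=time)
t_swc_o = pd.Series([0.5, 1.9, 2.5, np.nan, 4.6, 5.9], index=time)


def paired_offset(x, y):
    """Return (mean, median, n) of y - x over timestamps where both have data."""
    diff = (y - x).dropna()  # a NaN in either series makes the difference NaN
    return diff.mean(), diff.median(), len(diff)


mean_diff, median_diff, n = paired_offset(t_swc, t_swc_o)
print(f"mean difference {mean_diff:.2f}, median difference {median_diff:.2f}, n={n}")
```

The mean is the same either way, since the mean of the differences equals the difference of the means over the same valid hours. Only the median changes with the method.
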
@jasonebox
Author

gap fillable from 8 5 2020 onward
image

@BaptisteVandecrux
Member

Hi Jason,

Thanks for reporting this.
-> fillable gaps:
This is indeed a current limitation of our merging procedure. When merging, the most recent station always supersedes the older station. When a sensor fails on the new station (like SWC), we do not switch back to the old station. This is because we do not yet have a smart way of indicating to users of the merged data that, for a given timestamp, different variables might come from different stations. This is on our to-do list, and the first low-hanging fruit would be to allow switching back to the old station when the entire (new) station fails.
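
One way the provenance problem described above could be approached is to keep a second table, the same shape as the merged data, that records which station each value came from. The following is a minimal sketch under that assumption, not the actual merging code. `merge_with_fallback`, the station labels, and the sample columns and values are all hypothetical.

```python
import numpy as np
import pandas as pd


def merge_with_fallback(new, old, new_name="new", old_name="old"):
    """Prefer values from `new`; where they are missing, fall back to `old`.

    Returns the merged DataFrame and a same-shaped DataFrame naming the
    source station of every value (missing where neither station has data).
    """
    merged = new.combine_first(old)
    # Align both inputs to the merged shape so the masks line up
    new_ok = new.reindex_like(merged).notna().to_numpy()
    old_ok = old.reindex_like(merged).notna().to_numpy()
    labels = np.where(new_ok, new_name, np.where(old_ok, old_name, None))
    source = pd.DataFrame(labels, index=merged.index, columns=merged.columns)
    return merged, source


# Hypothetical example: the new station's temperature sensor fails part of the time
time = pd.date_range("2023-01-01", periods=4, freq="h")
old = pd.DataFrame({"t_u": [-10.0, -11.0, -12.0, np.nan],
                    "p_u": [880.0, 881.0, 882.0, 883.0]}, index=time)
new = pd.DataFrame({"t_u": [np.nan, -11.2, np.nan, np.nan],
                    "p_u": [np.nan, 881.5, 882.5, 883.5]}, index=time)

merged, source = merge_with_fallback(new, old)
print(merged)
print(source)
```

In a NetCDF product the `source` table might instead be stored as one integer flag variable per data variable, but that is a design choice this sketch does not settle.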

-> SWC and SWC_O:
Right now Swiss Camp (historical), SWC and SWC_O all go into the merged SWC_hour.nc file you use:
https://thredds.geus.dk/thredds/catalog/aws/l3sites/netcdf/hour/catalog.html?dataset=aws/l3sites/netcdf/hour/SWC_hour.nc

In the merged data file, we do not adjust the data as the station moves or when it is relocated. Users can see the station movement in the coordinates: they shift slowly throughout the historical period following your reconstruction, then switch to the observations from SWC, and eventually jump to SWC_O once it is installed.

I do not expect to add this offset to the standard product but I understand that it is important for your use of the data!
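
For users who want such an adjustment on their side, a rough sketch could look like the following. It estimates a constant offset from the overlap period and uses it to shift the second station before filling gaps in the first. The function name `fill_with_offset` and the synthetic values are assumptions for illustration, and a single constant offset is a simplification that may not hold across seasons.

```python
import numpy as np
import pandas as pd


def fill_with_offset(primary, secondary):
    """Fill gaps in `primary` with `secondary`, shifted by their mean difference.

    The offset is the mean of (secondary - primary) over timestamps where
    both series have data.
    """
    offset = (secondary - primary).mean()  # NaN pairs are skipped
    filled = primary.fillna(secondary - offset)
    return filled, offset


# Synthetic hourly temperatures (illustrative values only)
time = pd.date_range("2022-08-02", periods=5, freq="h")
primary = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0], index=time)
secondary = pd.Series([0.75, 1.75, 2.75, 3.75, np.nan], index=time)

filled, offset = fill_with_offset(primary, secondary)
print(f"offset {offset:.2f}")
print(filled)
```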

Does that answer your questions?
