Feature 342 tcdiag tcmpr plotter #446

Merged: 80 commits, May 29, 2024
Commits
115f803
Issue #383 modifications to support multiple plot types, list_stat_1 …
bikegeek Feb 17, 2024
e0a899c
Merge branch 'develop' of https://github.com/dtcenter/METplotpy into …
bikegeek Feb 17, 2024
9001eb9
issue #383 modifications to support generating more than one plot type
bikegeek Feb 17, 2024
992df46
issue #383 provide supporting plotting multiple plot types with singl…
bikegeek Feb 19, 2024
3371f46
issue #383 provide supporting plotting multiple plot types with singl…
bikegeek Feb 19, 2024
505fcbb
issue #383 provide supporting plotting multiple plot types with singl…
bikegeek Feb 19, 2024
b4268b3
clean up comments
bikegeek Feb 19, 2024
707e305
issue #383 provide supporting plotting multiple plot types with singl…
bikegeek Feb 19, 2024
87b980e
issue #383 provide supporting plotting multiple plot types with singl…
bikegeek Feb 19, 2024
532dcc0
issue #383 provide supporting plotting multiple plot types with singl…
bikegeek Feb 19, 2024
9cb0476
issue #383 provide supporting plotting multiple plot types with singl…
bikegeek Feb 19, 2024
375c429
fix colors and legends
bikegeek Feb 19, 2024
df9bd8b
issue #383 add logging to replace printing to stdout
bikegeek Mar 7, 2024
7a366d4
issue #383 add logging support
bikegeek Mar 7, 2024
14b15a9
issue #383 added logging support
bikegeek Mar 7, 2024
88029a6
issue #383 modifications to support plotting multiple plot types usin…
bikegeek Mar 7, 2024
b9e588c
issue #383 modifications to support multiple plot types in one config…
bikegeek Mar 7, 2024
4a704cd
issue #383 modifications to support multiple plot types in a single c…
bikegeek Mar 7, 2024
9ee97c3
issue #383 modifications to support multiple plot types in one config…
bikegeek Mar 7, 2024
3352da0
issue #383 replace print statements with logging
bikegeek Mar 7, 2024
2ebe3be
Issue #383 logging support
bikegeek Mar 7, 2024
433fd62
issue #383 modifications to support multiple plot types in a single c…
bikegeek Mar 7, 2024
4114269
issue #383 modifications to support plotting multiple plot types defi…
bikegeek Mar 7, 2024
0b68302
issue #383 modifications to support multiple plot types in a single c…
bikegeek Mar 7, 2024
26caa22
issue #383 add more checking and modifications to support multiple pl…
bikegeek Mar 7, 2024
54c07c6
issue #383 modifications to support plotting multiple plot types defi…
bikegeek Mar 7, 2024
d83921d
issue #383 update the plot name to include the plot type
bikegeek Mar 7, 2024
dc2be9a
issue #383 TCMPR plot documentation initial content
bikegeek Mar 12, 2024
55e23d2
changed file permissions
bikegeek Mar 12, 2024
0e53776
Added tcmpr_plots to the Table of Contents
bikegeek Mar 12, 2024
57a78cb
issue #383 plot images added for TCMPR plotter
bikegeek Mar 13, 2024
b1b4061
Updated plot to match updates to User's Guide
bikegeek Mar 14, 2024
0d7ca04
updates to match User's Guide
bikegeek Mar 14, 2024
b58ca46
Update the config files for box plot and relperf plots to create only…
bikegeek Mar 14, 2024
7e5c7da
Delete docs/Users_Guide/figure/RELPERF_SAMPLE_DATA_ABS(AMAX_WIND-BMAX…
bikegeek Mar 14, 2024
4ae307e
Delete docs/Users_Guide/figure/BOXPLOT_SAMPLE_DATA_ABS(AMAX_WIND-BMAX…
bikegeek Mar 14, 2024
f8ee0f0
sample data for TCMPR plotter
bikegeek Mar 14, 2024
3329bee
issue #383 config for all seven plot types
bikegeek Mar 14, 2024
a91bf1e
issue #383 removed unused figure, replaced with TK_ERR figure for box…
bikegeek Mar 14, 2024
7d6d6fc
issue #383 removed hard-coded paths
bikegeek Mar 14, 2024
053551a
issue #383 rearrange content for clarity
bikegeek Mar 14, 2024
78cd00a
issue #383 fix incomplete sentences
bikegeek Mar 14, 2024
17f6f5c
issue #383 added the baseline_file and column_info_file
bikegeek Mar 14, 2024
41815df
issue #383 added instructions for the baseline_file and column_info_f…
bikegeek Mar 14, 2024
4fb1ac5
issue #383 basic system tests for TCMPR plotting
bikegeek Mar 14, 2024
6725039
System tests for TCMPR plotting
bikegeek Mar 14, 2024
b2b4629
Issue #383 include the tcmpr plotting system tests
bikegeek Mar 14, 2024
70179ea
issue #383 explicitly set hfip_bsln to 'no'
bikegeek Mar 14, 2024
3eeef9b
explicitly set hfip_bsln to no in testing
bikegeek Mar 14, 2024
d4a1f4a
change comparison syntax for hfip_bsln check
bikegeek Mar 14, 2024
ce446cf
change file size testing
bikegeek Mar 14, 2024
e4cbe50
change file size testing with assert False for mismatch
bikegeek Mar 14, 2024
49ac176
change file size testing-check mean line plots
bikegeek Mar 14, 2024
012fb8a
comment out file size testing, they are not consistent when run insid…
bikegeek Mar 14, 2024
35fa486
issue #383 Explicitly state that the TCMPR data must have all columns…
bikegeek Mar 25, 2024
4959795
Merge branch 'develop' of https://github.com/dtcenter/METplotpy into …
bikegeek Mar 25, 2024
2f7118e
issue #342 added two more settings to accommodate plotting for TCDiag …
bikegeek Mar 27, 2024
945cdb8
Add support for creating line plot
bikegeek Mar 31, 2024
29e95e8
Support for reading in the tcst reformatted file
bikegeek Mar 31, 2024
7b3b2c6
replace printing to stdout with logging
bikegeek Mar 31, 2024
18429ab
create log directory if one doesn't already exist
bikegeek Mar 31, 2024
6ef3f65
Added two more settings for the point plot to support generating line…
bikegeek Mar 31, 2024
39337ac
plot_list replaced with plot_type_list
bikegeek Mar 31, 2024
d128618
Merge branch 'feature_383_tcmpr_doc_test' of https://github.com/dtcen…
bikegeek Mar 31, 2024
37d2dda
Added two settings to support line plot for TCDiag data
bikegeek Mar 31, 2024
e263364
modify formatting
bikegeek Mar 31, 2024
14d4a6d
Merge branch 'feature_342_tcdiag_tcmpr_plotter' of https://github.com…
bikegeek Mar 31, 2024
a489578
Refactor to allow user to create either a scatter or line plot. Decr…
bikegeek Apr 1, 2024
4e75e4b
reformatted TCDiag from TC-Pairs output, to be used for testing
bikegeek Apr 1, 2024
321581c
Merge branch 'feature_342_tcdiag_tcmpr_plotter' of https://github.com…
bikegeek Apr 1, 2024
d6dba03
Config file for generating TCDiag simple time series for all forecast…
bikegeek Apr 1, 2024
d238306
Merge branch 'develop' of https://github.com/dtcenter/METplotpy into …
bikegeek Apr 16, 2024
9eaabe6
check for identical length of x- and y-values for line plot
bikegeek Apr 20, 2024
85c1049
pull out plot generating code from the main and into its own functio…
bikegeek Apr 20, 2024
55acf74
Merge branch 'develop' of https://github.com/dtcenter/METplotpy into …
bikegeek Apr 20, 2024
b2eebcf
Clean up comments and formatting of the create_plot method
bikegeek Apr 21, 2024
3b6b2a9
replace existing logic with code that behaves more like unix mkdir -p…
bikegeek Apr 23, 2024
fc63783
Delete plots/config directory
bikegeek May 22, 2024
96d3631
Merge branch 'develop' into feature_342_tcdiag_tcmpr_plotter
hankenstein2 May 29, 2024
7854a76
Update tcmpr_config.py
hankenstein2 May 29, 2024
2 changes: 2 additions & 0 deletions metplotpy/plots/config/tcmpr_defaults.yaml
@@ -145,3 +145,5 @@ subtitle: ''
prefix:
baseline_file: ./hfip_baseline.dat
column_info_file: ./plot_tcmpr_hdr.dat
is_tcdiag_linetype: False
connect_points: False
128 changes: 93 additions & 35 deletions metplotpy/plots/tcmpr_plots/box/tcmpr_point.py
@@ -1,37 +1,45 @@
import os
from datetime import datetime

import plotly.graph_objects as go

from metplotpy.plots import util
from metplotpy.plots.tcmpr_plots.box.tcmpr_box_point import TcmprBoxPoint
from metplotpy.plots.tcmpr_plots.tcmpr_series import TcmprSeries


class TcmprPoint(TcmprBoxPoint):
def __init__(self, config_obj, column_info, col, case_data, input_df, baseline_data):
super().__init__(config_obj, column_info, col, case_data, input_df, baseline_data)
print("--------------------------------------------------------")
print(f"Plotting POINT time series by {self.config_obj.series_val_names[0]}")
def __init__(self, config_obj, column_info, col, case_data, input_df, baseline_data, stat_name):
super().__init__(config_obj, column_info, col, case_data, input_df, baseline_data, stat_name)
# Set up Logging
self.point_logger = util.get_common_logger(self.config_obj.log_level, self.config_obj.log_filename)

self._adjust_titles()
self.series_list = self._create_series(self.input_df)
self.point_logger.info("--------------------------------------------------------")
self.point_logger.info(f"Plotting POINT time series by {self.config_obj.series_val_names[0]}")
start = datetime.now()

self._adjust_titles(stat_name)
self.series_list = self._create_series(self.input_df, stat_name)
self.case_data = None
self.cur_baseline = baseline_data['cur_baseline']
self.cur_baseline_data = baseline_data['cur_baseline_data']
self._init_hfip_baseline_for_plot()

if self.config_obj.prefix is None or len(self.config_obj.prefix) == 0:
self.plot_filename = f"{self.config_obj.plot_dir}{os.path.sep}{self.config_obj.list_stat_1[0]}_pointplot.png"
self.plot_filename = f"{self.config_obj.plot_dir}{os.path.sep}{stat_name}_pointplot.png"
else:
self.plot_filename = f"{self.config_obj.plot_dir}{os.path.sep}{self.config_obj.prefix}_pointplot.png"
self.plot_filename = f"{self.config_obj.plot_dir}{os.path.sep}{self.config_obj.prefix}_{stat_name}_pointplot.png"
# remove the old file if it exists

# remove the old file if it exist
if os.path.exists(self.plot_filename):
os.remove(self.plot_filename)
self._create_figure()

def _adjust_titles(self):
self.point_logger.info(f"Finished generating the TCMPR points in {datetime.now() - start}")

def _adjust_titles(self, stat_name):
if self.yaxis_1 is None or len(self.yaxis_1) == 0:
self.yaxis_1 = self.config_obj.list_stat_1[0] + '(' + self.col['units'] + ')'
self.yaxis_1 = stat_name + '(' + self.col['units'] + ')'

if self.title is None or len(self.title) == 0:
self.title = 'Point Plots of ' + self.col['desc'] + ' by ' \
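The filename change in this hunk can be sketched as a small helper. This is a minimal sketch, not the code in the PR: `build_point_plot_filename` is a hypothetical name, and it uses `os.path.join` where the diff uses an f-string with `os.path.sep`.

```python
import os


def build_point_plot_filename(plot_dir, stat_name, prefix=None):
    """Return the output path for a point plot (hypothetical helper).

    Mirrors the naming shown in the diff: the statistic name is always part
    of the filename, and a non-empty prefix is placed in front of it.
    """
    if prefix:
        basename = f"{prefix}_{stat_name}_pointplot.png"
    else:
        basename = f"{stat_name}_pointplot.png"
    return os.path.join(plot_dir, basename)


# Example paths for one statistic, with and without a prefix.
no_prefix = build_point_plot_filename("output", "TK_ERR")
with_prefix = build_point_plot_filename("output", "TK_ERR", prefix="gfso")
```

Because the statistic name now appears in both branches, two statistics plotted in the same run with the same prefix should no longer map to the same file.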
@@ -57,28 +65,78 @@ def _draw_series(self, series: TcmprSeries) -> None:
boxpoints = 'all'

# create a trace
self.figure.add_trace(
go.Box(x=series.series_data['LEAD_HR'],
y=series.series_data['PLOT'],
mean=series.series_points['mean'],
notched=self.config_obj.box_notch,
line=line_color,
fillcolor=fillcolor,
name=series.user_legends,
showlegend=True,
# quartilemethod='linear', #"exclusive", "inclusive", or "linear"
boxmean=self.config_obj.box_avg,
boxpoints=boxpoints, # outliers, all, False
pointpos=0,
marker=dict(size=4,
color=marker_color,

# point (scatter) plot; becomes a line plot when connect_points is True in the config file
if 'point' in self.config_obj.plot_type_list:
if self.config_obj.connect_points:
# line plot
mode = 'lines+markers'
else:
# points only
mode = 'markers'
# Create a point plot

# Ensure that the size of the list of x and y values
# are the same, or the resulting plot will be incorrect.
# This mismatch occurs when the x_list represents the
# available lead hours in the series data and the
# series_points has None where there isn't data corresponding
# to lead hours in the series_points dataframe.
#
y_list = series.series_points['mean']
x_list = series.series_data['LEAD_HR']
if len(x_list) != len(y_list):
# Clean up None values in the series.series_points['mean'] list
# The None values are assigned by the _create_series_points() method.
y_list = [y_values for y_values in y_list if y_values is not None]

self.figure.add_trace(
go.Scatter(x=x_list,
y=y_list,
showlegend=True,
mode=mode,
name=self.config_obj.user_legends[series.idx],
marker=dict(
color=marker_line_color,
size=8,
opacity=0.7,
line=dict(
width=1,
color=marker_line_color
),
symbol=marker_symbol,
),
jitter=0
),
secondary_y=series.y_axis != 1
)
color=self.config_obj.colors_list[series.idx],
width=1
)
),
),
secondary_y=series.y_axis != 1
)

# When a line plot is requested, connect any gaps
if self.config_obj.connect_points:
self.figure.update_traces(connectgaps=True)

else:
# Boxplot
self.figure.add_trace(
go.Box(x=series.series_data['LEAD_HR'],
y=series.series_data['PLOT'],
mean=series.series_points['mean'],
notched=self.config_obj.box_notch,
line=line_color,
fillcolor=fillcolor,
name=series.user_legends,
showlegend=True,
boxmean=self.config_obj.box_avg,
boxpoints=boxpoints, # outliers, all, False
pointpos=0,
marker=dict(size=4,
color=marker_color,
line=dict(
width=1,
color=marker_line_color
),
symbol=marker_symbol,
),
jitter=0
),
secondary_y=series.y_axis != 1
)
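The two decisions made in `_draw_series` for the point plot (which Plotly mode to use, and how to reconcile x and y lists of different lengths) can be sketched without Plotly. The helper names `scatter_mode` and `align_xy` are hypothetical, and the sample values are made up for illustration.

```python
def scatter_mode(connect_points):
    """Return the Plotly scatter mode implied by the connect_points setting."""
    return "lines+markers" if connect_points else "markers"


def align_xy(x_list, y_list):
    """Drop None entries from y when the x and y lengths differ.

    Follows the check in the diff: y is filtered only when the lengths
    disagree, so inputs that already match are returned unchanged.
    """
    if len(x_list) != len(y_list):
        y_list = [y for y in y_list if y is not None]
    return list(x_list), list(y_list)


lead_hours = [0, 12, 24]           # lead hours that have data
means = [10.5, None, 22.0, 31.25]  # one None placeholder for a missing lead hour
x, y = align_xy(lead_hours, means)
```

Note that this only restores equal lengths when the number of `None` entries equals the length difference; pairing each mean with its lead hour before filtering would be a more defensive alternative.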

54 changes: 41 additions & 13 deletions metplotpy/plots/tcmpr_plots/tcmpr.py
@@ -420,8 +420,11 @@ def save_to_file(self):

# Create the directory for the output plot if it doesn't already exist
dirname = os.path.dirname(os.path.abspath(self.plot_filename))
if not os.path.exists(dirname):
os.mkdir(dirname)
try:
os.makedirs(dirname, exist_ok=True)
except FileExistsError:
pass

self.logger.info(f'Saving the image file: {self.plot_filename}')
if self.figure:
try:
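The switch from `os.mkdir` to `os.makedirs(..., exist_ok=True)` gives the `mkdir -p` behavior mentioned in the commit message: missing parent directories are created and an existing directory is not an error. A minimal sketch, where `ensure_parent_dir` is a hypothetical helper and the paths are placeholders:

```python
import os
import tempfile


def ensure_parent_dir(filename):
    """Create the directory that will hold `filename`, like `mkdir -p`."""
    dirname = os.path.dirname(os.path.abspath(filename))
    os.makedirs(dirname, exist_ok=True)
    return dirname


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "plots", "tcmpr", "TK_ERR_pointplot.png")
    first = ensure_parent_dir(target)   # creates plots/ and plots/tcmpr/
    second = ensure_parent_dir(target)  # directory exists; no error raised
    created = os.path.isdir(first)
```

With `exist_ok=True`, `os.makedirs` can still raise `FileExistsError` when the target path exists but is a regular file, which is the one case the `except FileExistsError` clause in the diff would silence.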
@@ -511,8 +514,6 @@ def perform_event_equalization(input_df:pd.DataFrame, is_skill:bool, config_obj:
return output_data




def main(config_filename=None):
"""
Generates a sample, default, TCMPR plot using a combination of
@@ -553,6 +554,24 @@ def main(config_filename=None):

config_obj = TcmprConfig(docs)

# Create the requested plot(s)
create_plot(config_obj)


def create_plot(config_obj: dict) -> None:
"""
Generates one or more TCMPR plots. Event equalization is performed if
it was requested by a setting in the yaml configuration file.

Args:
@param config_obj: The config object containing all the necessary information obtained
from the yaml configuration file.

Returns: None, creates one or more plots as specified in the yaml config file
"""

# Find input files, they must have the .tcst extension and filename must have
# the prefix "tc_pairs" (e.g. tc_pairs_gfso_20220401.tcst)
tcst_files = []
# list all .tcst files in tcst_dir
if config_obj.tcst_dir is not None and len(config_obj.tcst_dir) > 0 and os.path.exists(config_obj.tcst_dir):
@@ -566,7 +585,9 @@ def main(config_filename=None):
input_df = orig_input_df.copy(deep=True)

# Define a demo and retro column
# TODO these values never get used comment out for now

# Note: Currently not supported, leave commented out for now.

# input_df = orig_input_df.copy(deep=True)
# if config_obj.demo_yr is not None and config_obj.demo_yr != 'NA':
# demo_yr_obj = datetime.strptime(str(config_obj.demo_yr), '%Y')
@@ -579,18 +600,19 @@ def main(config_filename=None):
quotechar='"', skipinitialspace=True, encoding='utf-8')

logger = util.get_common_logger(config_obj.log_level, config_obj.log_filename)
\
for plot_type in config_obj.plot_type_list:

# Apply event equalization, if requested
# Event equalization is different for the skill_mn and skill_md
is_skill = False
if config_obj.use_ee:
if plot_type == 'skill_mn' or plot_type == 'skill_md':
is_skill = True
# perform event equalization on the skill_mn|skill_md plot type
logger.info(f"Perform event equalization for {plot_type}: {datetime.now()}")
output_result = perform_event_equalization(orig_input_df, is_skill, config_obj)
input_df = output_result
is_skill = True
# perform event equalization on the skill_mn|skill_md plot type
logger.info(f"Perform event equalization for {plot_type}: {datetime.now()}")
output_result = perform_event_equalization(orig_input_df, is_skill, config_obj)
input_df = output_result
else:
logger.info(f"Perform event equalization for {plot_type}: {datetime.now()}")
output_result = perform_event_equalization(orig_input_df, is_skill, config_obj)
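Both branches in this hunk call `perform_event_equalization` and differ only in the `is_skill` flag, so the selection can be expressed in one place. The following is a sketch under that reading of the diff: `equalize_for_plot` is a hypothetical name, and `equalize` is a stand-in callable rather than the real function (which also takes `config_obj`).

```python
SKILL_PLOT_TYPES = ("skill_mn", "skill_md")


def needs_skill_equalization(plot_type):
    """True for the plot types that use the skill variant of equalization."""
    return plot_type in SKILL_PLOT_TYPES


def equalize_for_plot(plot_type, use_ee, input_df, equalize):
    """Return the data to plot for one plot type.

    `equalize` stands in for perform_event_equalization; here it takes only
    the data and the is_skill flag.
    """
    if not use_ee:
        return input_df
    return equalize(input_df, needs_skill_equalization(plot_type))


# Stub that records which flag it was called with.
def fake_equalize(df, is_skill):
    return (df, is_skill)


skill_result = equalize_for_plot("skill_mn", True, "data", fake_equalize)
box_result = equalize_for_plot("boxplot", True, "data", fake_equalize)
untouched = equalize_for_plot("boxplot", False, "data", fake_equalize)
```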
@@ -641,7 +663,7 @@ def main(config_filename=None):
elif plot_type == 'skill_mn':
from metplotpy.plots.tcmpr_plots.skill.mean.tcmpr_skill_mean import TcmprSkillMean
plot = TcmprSkillMean(config_obj, column_info, col_to_plot, common_case_data, input_df,
cur_stat, baseline_data)
cur_stat, baseline_data)
elif plot_type == 'skill_md':
from metplotpy.plots.tcmpr_plots.skill.median.tcmpr_skill_median import TcmprSkillMedian
plot = TcmprSkillMedian(config_obj, column_info, col_to_plot, common_case_data, input_df, cur_stat)
@@ -683,7 +705,10 @@ def read_tcst_files(config_obj, tcst_files):
for file in tcst_files:
if os.path.exists(file):
print(f'Reading track data:{file}')
file_df = pd.read_csv(file, sep=r'\s+|;|:', header='infer', engine="python")
if config_obj.is_tcdiag:
file_df = pd.read_csv(file, sep='\t')
else:
file_df = pd.read_csv(file, sep=r'\s+|;|:', header='infer', engine="python")
file_df['LEAD_HR'] = file_df['LEAD'] / 10000
file_df['LEAD_HR'] = file_df['LEAD_HR'].astype('int')
all_filters = []
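The reader now picks a field separator based on `is_tcdiag`, then derives `LEAD_HR` from `LEAD`. The sketch below imitates both steps on single lines with the standard library instead of pandas. The helper names are hypothetical, the sample lines are invented, and it assumes `LEAD` is in HHMMSS form, which the division by 10000 implies.

```python
import re

# Same pattern the diff passes to pandas for regular .tcst files.
TCST_SEPARATOR = re.compile(r"\s+|;|:")


def split_tcst_line(line, is_tcdiag):
    """Split one line the way the chosen reader would, roughly.

    Reformatted TCDiag files are treated as tab-separated; regular .tcst
    files are split on whitespace, semicolons, and colons.
    """
    if is_tcdiag:
        return line.rstrip("\n").split("\t")
    return TCST_SEPARATOR.split(line.strip())


def lead_to_hours(lead):
    """Convert a LEAD value in HHMMSS form to whole hours."""
    return int(lead / 10000)


regular = split_tcst_line("GFSO AL012022 120000", is_tcdiag=False)
tab_line = "GFSO\t2022-04-01 06:00:00\t120000"
as_tcdiag = split_tcst_line(tab_line, is_tcdiag=True)
as_regular = split_tcst_line(tab_line, is_tcdiag=False)
```

The last two calls show why one separator may not suit both inputs: a tab-separated field containing spaces or colons stays whole under the tab split but is broken into several fields by the regex.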
@@ -704,7 +729,10 @@ def read_tcst_files(config_obj, tcst_files):
# use numpy to select the rows where any record evaluates to True
mask = np.array(all_filters).all(axis=0)

file_df['VALID_TIME'] = pd.to_datetime(file_df['VALID'], format='%Y%m%d_%H%M%S') # 20170417_060000
if config_obj.is_tcdiag:
file_df['VALID_TIME'] = file_df['VALID']
else:
file_df['VALID_TIME'] = pd.to_datetime(file_df['VALID'], format='%Y%m%d_%H%M%S') # 20170417_060000
# Define a case column
file_df['equalize'] = file_df.loc[:, 'BMODEL'].astype(str) \
+ ':' + file_df.loc[:, 'STORM_ID'].astype(str) \
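The `VALID_TIME` branch above can be illustrated on a single value. In this sketch `parse_valid` is a hypothetical helper; the format string and the sample value `20170417_060000` come from the diff, and the TCDiag value is passed through unchanged, as the diff does.

```python
from datetime import datetime

VALID_FORMAT = "%Y%m%d_%H%M%S"


def parse_valid(valid, is_tcdiag):
    """Return the valid time for one record.

    Regular .tcst values are parsed with VALID_FORMAT; TCDiag values are
    returned as-is, matching the branch in the diff.
    """
    if is_tcdiag:
        return valid
    return datetime.strptime(valid, VALID_FORMAT)


parsed = parse_valid("20170417_060000", is_tcdiag=False)
passthrough = parse_valid("20170417_060000", is_tcdiag=True)
```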
7 changes: 6 additions & 1 deletion metplotpy/plots/tcmpr_plots/tcmpr_config.py
@@ -28,6 +28,7 @@ class TcmprConfig(Config):
Prepares and organises Line plot parameters
"""
SUPPORTED_PLOT_TYPES = ['boxplot', 'point', 'mean', 'median', 'relperf', 'rank', 'skill_mn', 'skill_md']

def __init__(self, parameters: dict) -> None:
""" Reads in the plot settings from a box plot config file.

@@ -37,6 +38,9 @@ def __init__(self, parameters: dict) -> None:
"""
super().__init__(parameters)

self.is_tcdiag = self._get_bool('is_tcdiag_linetype')
self.connect_points = self._get_bool('connect_points')

# Logging
self.log_filename = self.get_config_value('log_filename')
self.log_level = self.get_config_value('log_level')
@@ -251,7 +255,8 @@ def _get_hfip_bsln(self) -> str:
"""

hfip_bsln = str(self.get_config_value('hfip_bsln'))
hfip_bsln = hfip_bsln.lower()
hfip_bsln_lower = hfip_bsln.lower()


# Validate that hfip_bsln is one of the following; (no, 0, 5, 10 year goal)
supported_bsln = ['no', '0', '5', '10']
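The rest of `_get_hfip_bsln` is not shown in this hunk, so the following is only a sketch of how a check against `supported_bsln` might look. The function name `normalize_hfip_bsln` and the choice to raise `ValueError` are assumptions, not the PR's behavior.

```python
SUPPORTED_BSLN = ["no", "0", "5", "10"]


def normalize_hfip_bsln(value):
    """Return the lowercased setting, or raise if it is not supported.

    Raising ValueError is an assumption made for this sketch.
    """
    hfip_bsln_lower = str(value).lower()
    if hfip_bsln_lower not in SUPPORTED_BSLN:
        raise ValueError(
            f"hfip_bsln must be one of {SUPPORTED_BSLN}, got {value!r}"
        )
    return hfip_bsln_lower


from_string = normalize_hfip_bsln("NO")
from_int = normalize_hfip_bsln(5)

try:
    normalize_hfip_bsln(False)
    rejected_bool = False
except ValueError:
    rejected_bool = True
```

The boolean case matters because PyYAML loads an unquoted `no` as `False`; quoting the value in the YAML file (`hfip_bsln: 'no'`) keeps it a string, which may be why the test configs in this PR set it explicitly.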
12 changes: 9 additions & 3 deletions metplotpy/plots/util.py
@@ -13,9 +13,8 @@
__author__ = 'Minna Win'

import argparse
from typing import Tuple
import sys
import getpass
import os
import logging
import gc
import re
@@ -316,6 +315,13 @@ def get_common_logger(log_level, log_filename):
currently in use by a plot type.
'''

# If directory for logfile doesn't exist, create it
log_dir = os.path.dirname(log_filename)
try:
os.makedirs(log_dir, exist_ok=True)
except OSError:
pass

# Supported log levels.
log_level = log_level.upper()
log_levels = {'DEBUG': logging.DEBUG, 'INFO': logging.INFO,
@@ -338,7 +344,7 @@ def get_common_logger(log_level, log_filename):
datefmt='%Y-%m-%d %H:%M:%S',
filename=log_filename,
filemode='w')
mpl_logger = logging.getLogger(name='matplotlib').setLevel(logging.CRITICAL)
logging.getLogger(name='matplotlib').setLevel(logging.CRITICAL)
common_logger = logging.getLogger(__name__)
f = cf()
common_logger.addFilter(f)
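One edge case in the log-directory step is a `log_filename` with no directory part: `os.path.dirname` then returns an empty string, and `os.makedirs('')` raises `FileNotFoundError`, which the `except OSError` in the diff swallows. A sketch of an explicit check instead, where `ensure_log_dir` is a hypothetical helper and not part of `util.py`:

```python
import os
import tempfile


def ensure_log_dir(log_filename):
    """Create the directory for `log_filename` if it has one.

    Returns the directory name, or '' when the log file is a bare filename
    that will be written to the current working directory.
    """
    log_dir = os.path.dirname(log_filename)
    if log_dir:
        os.makedirs(log_dir, exist_ok=True)
    return log_dir


bare = ensure_log_dir("tcmpr.log")  # nothing to create

with tempfile.TemporaryDirectory() as tmp:
    nested = ensure_log_dir(os.path.join(tmp, "logs", "tcmpr", "tcmpr.log"))
    nested_exists = os.path.isdir(nested)
```

Checking for the empty string keeps other `OSError` cases, such as a permissions problem, visible to the caller rather than silently ignored.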