Update error handling and redesign Streamlit UI for post-processing #281

Merged: 21 commits (Mar 25, 2024)

Changes from all commits
fb170b6
Fixed repeated series addition bug.
pineapple-cat Mar 11, 2024
eaf4e3f
Moved config information to sidebar.
pineapple-cat Mar 11, 2024
79261a7
Added simple streamlit exception message display.
pineapple-cat Mar 11, 2024
4daa43c
Changed config path arg to optional for streamlit and added empty con…
pineapple-cat Mar 13, 2024
3e4a340
Added plot placeholder, updated type hints + docstrings, and updated …
pineapple-cat Mar 15, 2024
35fc6b4
Updated type hints and docstrings.
pineapple-cat Mar 15, 2024
6deeb11
Updated Streamlit usage instructions.
pineapple-cat Mar 15, 2024
afe8957
Fixed type recast bug.
pineapple-cat Mar 15, 2024
66d89c9
Added toggle to show current config and fixed config validation bugs.
pineapple-cat Mar 18, 2024
35381b8
Added a convenience function for interpreting a value as a particular…
pineapple-cat Mar 18, 2024
31dc934
Fixed display filter value formatting and crash caused by uploading a…
pineapple-cat Mar 18, 2024
6593cea
Updated exception placement and plot reset upon error.
pineapple-cat Mar 22, 2024
2a51570
Changed filter addition to re-interpret types of all filters using cu…
pineapple-cat Mar 22, 2024
f4f30e7
Changed config handler filter values to be stored as strings for cons…
pineapple-cat Mar 22, 2024
ab17cb5
Changed axis options to update changes immediately.
pineapple-cat Mar 22, 2024
b354ce9
Updated config upload and made changes to allow smoother handling of …
pineapple-cat Mar 22, 2024
794fa6e
Merge branch 'main' into post-processing_ui-redesign
pineapple-cat Mar 22, 2024
7f2b6ee
Updated FIXME comments with issue numbers.
pineapple-cat Mar 25, 2024
6cff3f2
Fixed config upload bugs.
pineapple-cat Mar 25, 2024
acc5226
Merge branch 'post-processing_ui-redesign' of https://github.com/ukri…
pineapple-cat Mar 25, 2024
473f16e
Minor formatting adjustments for GH Pages docs.
pineapple-cat Mar 25, 2024
21 changes: 12 additions & 9 deletions post-processing/README.md
@@ -6,14 +6,14 @@ The post-processing scripts provided with the ExCALIBUR tests package are intend

There are four main post-processing components:

#### **`Perflog parsing`:**
#### **`Perflog parsing`**
- Data from benchmark performance logs are stored in a pandas DataFrame.
#### **`Data filtering`:**
#### **`Data filtering`**
- If more than one perflog is used for plotting, DataFrames from individual perflogs are concatenated together into one DataFrame.
- The DataFrame is then filtered, keeping only relevant rows and columns.
#### **`Data transformation`:**
#### **`Data transformation`**
- Axis value columns in the DataFrame are scaled according to user specifications.
#### **`Plotting`:**
#### **`Plotting`**
- A filtered and transformed DataFrame is passed to a plotting script, which produces a graph and embeds it in a simple HTML file.
- Users may run the plotting script to generate a generic bar chart. Graph settings should be specified in a configuration YAML file.

@@ -45,9 +45,9 @@ Run `post_processing.py -h` for more information (including debugging flags).

You may also run post-processing with Streamlit to interact with your plots:

```sh
streamlit run streamlit_post_processing.py log_path config_path [-p plot_type]
```
>```streamlit run streamlit_post_processing.py log_path -- [-c config_path]```

The config path is optional when running with Streamlit, as the UI allows you to create a new config on the fly. If you would still like to supply a config path, make sure to include `--` before any post-processing flags to indicate that the arguments belong to the post-processing script rather than Streamlit itself.

### Configuration Structure

@@ -147,10 +147,12 @@ y_axis:
x_value: "x_val_s"

filters:
and: [["filter_col_1", "<=", filter_val_1], ["filter_col_2", "!=", filter_val_2]]
and: [["filter_col_1", "<=", filter_val_1],
["filter_col_2", "!=", filter_val_2]]
or: []

series: [["series_col", "series_val_1"], ["series_col", "series_val_2"]]
series: [["series_col", "series_val_1"],
["series_col", "series_val_2"]]

column_types:
x_axis_col: "str"
@@ -296,5 +298,6 @@ All user-specified types are internally converted to their nullable incarnations
### Future Development

The post-processing capabilities are still a work in progress. Some upcoming developments:

- Embed graphs in GitHub Pages, instead of a bare HTML file.
- Add scaling and regression plots.
75 changes: 63 additions & 12 deletions post-processing/config_handler.py
@@ -1,12 +1,23 @@
from pathlib import Path

import yaml


class ConfigHandler:

def __init__(self, config: dict):
def __init__(self, config: dict, template=False):
"""
Initialise class.

Args:
config: dict, plot configuration information.
template: bool, flag to skip config validation (unsafe).
"""

if not template:
# validate dict structure
config = read_config(config)

# validate dict structure
config = read_config(config)
# extract config information
self.title = config.get("title")
self.x_axis = config.get("x_axis")
@@ -19,6 +30,7 @@ def __init__(self, config: dict):
self.and_filters = []
self.or_filters = []
self.series_filters = []
self.to_string_filter_vals()
self.parse_filters()

# parse scaling information
@@ -34,15 +46,53 @@ def __init__(self, config: dict):
self.parse_columns()

@classmethod
def from_path(cfg_hand, config_path):
return cfg_hand(open_config(config_path))
def from_path(self, config_path: Path, template=False):
"""
Initialise class from a path.
"""
return self(open_config(config_path), template)

@classmethod
def from_template(self):
"""
Initialise class from an empty template. Skips config validation.
"""

return self(dict({
"title": None,
"x_axis": {"value": None, "units": {"custom": None}},
"y_axis": {"value": None, "units": {"custom": None}},
"filters": {"and": [], "or": []},
"series": [],
"column_types": {}}), template=True)
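
For reference, the empty template above can be written out as a standalone dict (field values are placeholders; the real constructor receives it with `template=True` so validation is skipped):

```python
# Empty-template plot config, mirroring the structure passed to the
# constructor above; every value is a placeholder to be filled via the UI.
template = {
    "title": None,
    "x_axis": {"value": None, "units": {"custom": None}},
    "y_axis": {"value": None, "units": {"custom": None}},
    "filters": {"and": [], "or": []},
    "series": [],
    "column_types": {},
}
```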

def get_filters(self):
"""
Return and, or, and series filter lists.
"""
return self.and_filters, self.or_filters, self.series_filters

def get_y_scaling(self):
"""
Return column and custom scaling information.
"""
return self.scaling_column, self.scaling_custom

def to_string_filter_vals(self):
"""
Store filter values as their string representations for internal consistency.
"""

# filters
if self.filters:
self.filters["and"] = ([[f[0], f[1], str(f[2])] for f in self.filters["and"]]
if self.filters.get("and") else [])
self.filters["or"] = ([[f[0], f[1], str(f[2])] for f in self.filters["or"]]
if self.filters.get("or") else [])

# series
self.series = [[s[0], str(s[1])] for s in self.series] if self.series else []

def parse_filters(self):
"""
Store filtering information from filters and series.
@@ -74,8 +124,10 @@ def parse_columns(self):
"""

# axis columns
self.plot_columns = [self.x_axis.get("value"), self.x_axis["units"].get("column"),
self.y_axis.get("value"), self.y_axis["units"].get("column")]
self.plot_columns = [self.x_axis.get("value"),
self.x_axis["units"].get("column") if self.x_axis.get("units") else None,
self.y_axis.get("value"),
self.y_axis["units"].get("column") if self.y_axis.get("units") else None]

# FIXME (issue #255): allow all series values to be selected with *
# (or if only column name is supplied)
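
The guarded lookups in `plot_columns` amount to this None-safe pattern (standalone sketch with hypothetical axis dicts):

```python
# Only reach into the nested "units" dict when the key actually exists,
# so an axis without units yields None instead of a KeyError
# (hypothetical axis configuration).
x_axis = {"value": "tasks", "units": {"column": "tasks_units"}}
y_axis = {"value": "flops_value"}  # no units entry at all

plot_columns = [
    x_axis.get("value"),
    x_axis["units"].get("column") if x_axis.get("units") else None,
    y_axis.get("value"),
    y_axis["units"].get("column") if y_axis.get("units") else None,
]
```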
@@ -131,15 +183,14 @@ def to_yaml(self):
return yaml.dump(self.to_dict(), default_flow_style=None, sort_keys=False)


def open_config(path):
def open_config(path: Path):
"""
Return a dictionary containing configuration information for plotting
from the path to a yaml file.

Args:
path: path, path to yaml config file.
path: Path, path to yaml config file.
"""

with open(path, "r") as file:
return load_config(file)

@@ -154,13 +205,13 @@ def load_config(file):
return yaml.safe_load(file)


def read_config(config):
def read_config(config: dict):
"""
Check required configuration information. At least plot title, x-axis,
y-axis, and column types must be present.

Args:
config: dict, config information.
config: dict, plot configuration information.
"""

# check plot title information
25 changes: 18 additions & 7 deletions post-processing/perflog_handler.py
@@ -3,13 +3,21 @@
import os
import re
from itertools import chain
from pathlib import Path

import pandas as pd


class PerflogHandler:

def __init__(self, log_path, debug=False):
def __init__(self, log_path: Path, debug=False):
"""
Initialise class.

Args:
log_path: Path, path to performance log file or directory.
debug: bool, flag to print additional information to console.
"""

self.log_path = log_path
self.debug = debug
@@ -18,6 +26,9 @@ def __init__(self, log_path, debug=False):
self.read_all_perflogs()

def get_df(self):
"""
Return dataframe containing performance log information.
"""
return self.df

def get_log_files(self):
@@ -80,7 +91,7 @@ def read_all_perflogs(self):
errno.ENOENT, "Could not find a valid perflog in path", self.log_path)


def read_perflog(path):
def read_perflog(path: Path):
"""
Return a pandas dataframe from a reframe performance log. The dataframe will
have columns for all fields in a performance log record except display name,
@@ -92,7 +103,7 @@ def read_perflog(path):
in reframe's configuration. See code.

Args:
path: path, path to log file.
path: Path, path to log file.
"""

# read perflog into dataframe
@@ -127,7 +138,7 @@ def read_perflog(path):
return df


def get_display_name_info(display_name):
def get_display_name_info(display_name: str):
"""
Return a tuple containing the test name and a dictionary of parameter names
and their values from the given input string. The parameter dictionary may be empty
@@ -145,15 +156,15 @@ def get_display_name_info(display_name):
return test_name, dict(params)
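
A minimal standalone sketch of this parsing, assuming ReFrame's `%name=value` display-name convention (the example string is hypothetical, not from the repo):

```python
import re

def get_display_name_info(display_name):
    # test name is everything before the first " %param=value" token
    test_name = display_name.split(" %")[0]
    # collect all "%name=value" pairs into a dict (empty if none are present)
    params = re.findall(r"%(\w+)=(\S+)", display_name)
    return test_name, dict(params)

name, params = get_display_name_info("SombreroBenchmark %tasks=2 %cpus_per_task=1")
```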


def insert_key_cols(df: pd.DataFrame, index, results):
def insert_key_cols(df: pd.DataFrame, index: int, results: 'list[dict]'):
"""
Modify a dataframe to include new columns (extracted from results) inserted at
a given index.

Args:
df: dataframe, to be modified by this function.
df: pd.DataFrame, to be modified by this function.
index: int, index as which to insert new columns into the dataframe.
results: dict list, contains key-value mapping information for all rows.
results: list[dict], contains key-value mapping information for all rows.
"""

# get set of keys from all rows
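
The docstring above can be illustrated with a short pandas sketch (hypothetical data and key layout; a sketch of the idea, not the repo's exact implementation):

```python
import pandas as pd

# Insert per-row parameter columns (extracted from results) into a dataframe
# at a given index: one new column per key found across all result rows.
df = pd.DataFrame({"test_name": ["t1", "t2"], "perf": [1.0, 2.0]})
results = [{"tasks": "2"}, {"tasks": "4"}]
index = 1

# get set of keys from all rows, then insert a column per key;
# rows missing a key receive None via dict.get
keys = sorted({k for row in results for k in row})
for offset, key in enumerate(keys):
    df.insert(index + offset, key, [row.get(key) for row in results])
```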