Initial suite of tests #54

eschrom · 2024-11-15T16:58:00Z

Many tests are included here for data cleaning, uptake data manipulation, and model building. Writing tests inspired some moderate reorganizations/fixes in __init__.py. The tests for model building are a bit sparse, because this portion of the code will be significantly reworked (requiring new tests) when numpyro is introduced. Still, it would be good to get some legitimate testing infrastructure incorporated.

swo

A number of suggestions and some requests for documentation, but nothing critical or blocking. I think a lot of the things I'd like to see tweaked are things we'll develop a joint understanding of when we implement a second model.

swo · 2024-11-18T22:19:07Z

iup/__init__.py

@@ -225,6 +225,44 @@ def trim_outlier_intervals(

        return IncidentUptakeData(df)

+    def expand_implicit_columns(self, group_cols: tuple[str,] | None) -> Self:


I'm not sure what the right word is, but "expand" is tricky. It sounds to me like "explode" or "extend," which each have different meanings in the polars world

In R broom, they call things like this "augment"

Not super important at all

"Augment" is actually the word I found myself tending toward when writing documentation, so I will happily change it.

swo · 2024-11-18T22:23:11Z

iup/__init__.py

+        - daily-average uptake in the interval preceding the previous date
+        """
+        self = (
+            self.with_columns(


This function would be better as a class method, rather than an object method

I think that will make #42 easier to reason about

The self = whatever has a really bad code smell

I will add this to issue #42.
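
To make the concern concrete, here is a minimal sketch. `Uptake` is a hypothetical stand-in that holds a plain list rather than a polars data frame, and `doubled` is an invented operation; neither is the project's code. It contrasts rebinding `self` inside an object method with a class method that builds and returns a new object.

```python
from __future__ import annotations


class Uptake:
    """Hypothetical stand-in for a data-frame subclass; holds a list of estimates."""

    def __init__(self, estimates: list[float]) -> None:
        self.estimates = list(estimates)

    def doubled_rebinding(self) -> Uptake:
        # Rebinding the name `self` only changes a local variable. The
        # caller's object is untouched, which is why this reads as a smell.
        self = Uptake([2 * e for e in self.estimates])
        return self

    @classmethod
    def doubled(cls, data: Uptake) -> Uptake:
        # A class method makes it explicit that a new object is constructed.
        return cls([2 * e for e in data.estimates])


original = Uptake([1.0, 3.0])
result = Uptake.doubled(original)
```

Both versions return the same values and leave `original` unchanged; the class method just says so in its signature.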

swo · 2024-11-18T22:24:39Z

iup/__init__.py

@@ -318,7 +356,7 @@ def parse_nis(
        format of the dates in the NIS date column
    rollout: dt.date
        date of rollout
-    filters: dict
+    filters: dict | None


Some fun notes: dict | None is equivalent to Optional[dict]

And you could specify the key and value types with dict[str, Any] (or replace Any with whatever type you want)

Good to know! I will open a new issue #55 to remind me to make type hinting as precise as possible throughout.

swo · 2024-11-18T22:29:19Z

iup/__init__.py

@@ -406,7 +445,7 @@ def select_columns(
        name of the NIS column for the uptake estimate (population %)
    date_col: str
        name of the NIS column for the date
-    group_cols: dict
+    group_cols: dict | None


This is kind of confusing, that it's both selection and renaming

Like, {"age_group": "age_group"} means "keep age group," but {"age": "age_group"} means "keep the column 'age' and rename it", and {"raw_point_estimate": "estimate"} means "there are no group columns but please also rename the point estimate column."

I think it makes a lot more sense for group cols to be just a list (or set) of strings

While it might be a touch confusing when inspecting select_columns for the first time, I think it makes a really intuitive interface for the user in the config.yaml. You just specify a dictionary of: {"what_the_grouping_column_is_called" : "what_I_want_the_grouping_column_to_be_called"}. This makes sure the user enters the same number of actual vs. desired grouping column names, whereas a list for each could be different lengths (raising errors downstream).

The 3rd example you raise shouldn't be an issue, because this is only for grouping columns. While {"raw_point_estimate": "estimate"} would indeed rename a column, it would also cause errors downstream because the point estimate would be erroneously used as a grouping factor.
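
The mapping semantics being debated can be sketched as follows. This is an illustration only: `select_and_rename` is a hypothetical function operating on a list of row dictionaries, whereas the real `select_columns` works on the NIS data frame and also handles the estimate and date columns.

```python
from __future__ import annotations


def select_and_rename(
    rows: list[dict[str, object]],
    group_cols: dict[str, str] | None,
) -> list[dict[str, object]]:
    """Keep only the columns named in group_cols' keys, renamed to its values."""
    mapping = group_cols or {}  # None is treated as "no grouping columns"
    return [{new: row[old] for old, new in mapping.items()} for row in rows]


rows = [{"age": "18-49", "geography": "USA", "raw_point_estimate": 0.1}]

# "age" is kept and renamed; "geography" is kept under its existing name.
kept = select_and_rename(rows, {"age": "age_group", "geography": "geography"})
```

Because each key is paired with exactly one value, the number of original and desired names can never disagree, which is the property the dictionary interface is meant to guarantee.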

swo · 2024-11-18T22:30:49Z

iup/__init__.py

+        else:
+            return (
+                pl.when(x.drop_nulls().n_unique() == 1)
+                .then(x * 0.0)


This should have some documentation (that z-score of a single data point is defined to be zero, which seems sensible)

Suggested change

.then(x * 0.0)

.then(0.0)

Change made and documentation added!
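
As a rough illustration of the convention being documented, here is a pure-Python sketch rather than the polars expression from the diff. Keeping missing values as `None` and using the sample standard deviation are assumptions of this sketch, not confirmed behavior of the project's `standardize`.

```python
from __future__ import annotations

import statistics


def standardize(values: list[float | None]) -> list[float | None]:
    """Z-score the non-missing values; a single distinct value maps to 0.0."""
    present = [v for v in values if v is not None]
    # With at most one distinct value the standard deviation is zero or
    # undefined, so the z-score is defined to be zero instead of dividing by it.
    if len(set(present)) <= 1:
        return [None if v is None else 0.0 for v in values]
    mean = statistics.mean(present)
    sd = statistics.stdev(present)  # sample standard deviation (assumption)
    return [None if v is None else (v - mean) / sd for v in values]


constant = standardize([5.0, 5.0, None])
varied = standardize([1.0, 2.0, 3.0])
```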

swo · 2024-11-19T00:49:40Z

tests/test_model_building.py

+                "estimate": [0.0, 0.0, 1.0, 0.1, 3.0, 0.3, 4.0, 0.4],
+                "season": ["2019/2020"] * 8,
+                "elapsed": [0, 0, 7, 7, 14, 14, 21, 21],
+                "interval": [None, None, 7, 7, 7, 7, 7, 7],


[None] * 2 + [7] * 6 might be easier to read, but it's six of one, half a dozen of the other
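
For reference, the two spellings build the same list, as this small check shows. List multiplication repeats references to the same object, which is harmless here because `None` and integers are immutable.

```python
explicit = [None, None, 7, 7, 7, 7, 7, 7]
compact = [None] * 2 + [7] * 6

same_list = explicit == compact
```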

swo · 2024-11-19T14:08:15Z

tests/test_model_building.py

+        frame, ("estimate", "elapsed")
+    )
+
+    assert len(output) == 2


it's a minor thing, but I might be more specific here. Can you write out what you expect output to be in this situation, like the exact data frame? That makes it easier for me as the code reviewer to be 100% sure about what you want this function to be doing

Sure, done.

swo · 2024-11-19T14:08:50Z

tests/test_model_building.py

+    output = iup.LinearIncidentUptakeModel()
+    output = output.fit(frame, ("geography",))
+
+    assert output.model.score(output.x, output.y) == 1.0


would be nice to also check that the coefficient(s) in the model are precisely what you expect

Good idea. As we add more types of models, I think it would be good to choose a single fixture data frame for testing that has the right edge cases and known outputs to thoroughly test all models.

swo · 2024-11-19T14:09:48Z

tests/test_model_building.py

+        start, start_date, end_date, interval, group_cols
+    )
+
+    assert output.shape[0] == 5


you could write this as output.shape == (5, 7)

and this is another place that saying the exact data frame you expect would be good
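
Both suggestions can be sketched with a plain dictionary of columns standing in for a data frame. The column names and values here are invented for illustration, and `shape` is a hypothetical helper that mimics a data frame's shape attribute. With polars, the analogous exact comparison would typically go through `polars.testing.assert_frame_equal`.

```python
# Hypothetical stand-in for a data frame: a dict mapping column names to lists.
output = {
    "elapsed": [0, 7, 14],
    "estimate": [0.0, 1.0, 3.0],
}


def shape(frame: dict[str, list]) -> tuple[int, int]:
    """Return (rows, columns), mirroring a data frame's shape attribute."""
    n_rows = len(next(iter(frame.values()))) if frame else 0
    return n_rows, len(frame)


expected = {
    "elapsed": [0, 7, 14],
    "estimate": [0.0, 1.0, 3.0],
}

# One tuple comparison covers both dimensions at once.
assert shape(output) == (3, 2)
# Comparing against the exact expected values also pins down the contents.
assert output == expected
```

The shape check would still pass if a value were wrong; the exact comparison would not, which is what makes the intent clear to a reviewer.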

swo · 2024-11-19T14:10:38Z

tests/test_model_building.py

+    assert output.shape[1] == 8
+
+
+def test_project_sequentially():


love this test. are there any other trivial or edge cases that are easy to test?

Likely so - I'll give it some thought but merge what I have so far.

eschrom added 13 commits November 13, 2024 13:01

Added pytest to poetry.lock and pyproject.toml

13930b5

Initial tests and fixes to verify my workflow

1806224

Tests for data cleaning steps

62281b5

Started tests for uptake data objects

bb7ee4a

Finished test_uptake_data.py

41ec107

Reorganized LIUM class to be more modular

964432f

Finished documenting the more modular LIUM functions

cd4e230

Fixed standardize to deal with std dev = 0

00507eb

Started tests for model building

d369c89

Fixed some data type confusion

aa66fac

Finished reorganization and tests for model building

fd88f7a

Deleted placeholder test

bc10d7e

Fixed some documentation in testing

0188e72

eschrom requested review from swo and Fuhan-Yang November 15, 2024 16:58

swo approved these changes Nov 19, 2024


Addressed various minor PR#54 review comments

7ee094f

eschrom merged commit e781432 into main Nov 19, 2024
2 checks passed

eschrom deleted the ecs_testing branch November 19, 2024 16:21
