-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
326 forward link #13
326 forward link #13
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work Anton, I've left a few suggestions and comments. Happy to discuss further
tests/test_forward_link.py
Outdated
def test_basic_filter(self): | ||
"""Test a basic filter, filters questions with identifier different to 20001""" | ||
|
||
expected = pd.DataFrame( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the bulk of this dataframe is used a few times in the unit tests. I agree that having fewer files to load makes the repo cleaner, but having one csv which has the setup would remove the repeated setup. I also appreciate that the "question" column changes depending on the filter applied which makes it more difficult to use a csv
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you about this, another point is according to pytest docs
"Something to be aware of when grouping tests inside classes is that each test has a unique instance of the class. Having each test share the same class instance would be very detrimental to test isolation and would promote poor test practices"
Not sure about appropriate path to save it though, perhaps we should save them in a folder test_data ?
src/forward_link.py
Outdated
|
||
link.replace(np.nan, 1, inplace=True) # set defaults | ||
|
||
return link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other functions return dataframe, but I think the point about returning dataframe or series is a good one and should probably be discussed (i.e. how will all of the functions link together). Happy for this to be left as is and a ticket added to the backlog as other functions might need to be refactored
Made some changes in the test approach, also please ignore the mask_values since needs further discussions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor point to consider, happy for it to be commit as is but might be worth to raise with another team member to get their input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy for code to be merged following minor changes. Nice work Anton!
Summary
get_link function for forward or backward imputation
Zerofy_values for ignoring values when calculating links based on a expression
Checklists
This pull request meets the following requirements:
Both of these are not applicable since this pull request relates to intermediate functions for the impute method