326 forward link #13

Merged
merged 12 commits into from
May 22, 2024

Conversation

AntonZogk
Collaborator

@AntonZogk AntonZogk commented May 15, 2024

Summary

get_link function for forward or backward imputation
zerofy_values function for ignoring values, selected by an expression, when calculating links

Checklists

This pull request meets the following requirements:

  • installable with all dependencies recorded
  • runs without error
  • follows PEP8 and project specific conventions
  • appropriate use of comments, for example no descriptive comments
  • functions documented using NumPy style docstrings
  • assumptions and decisions log considered and updated if appropriate
  • unit tests have been updated to cover essential functionality for a reasonable range of inputs and conditions
  • other forms of testing such as end-to-end and user-interface testing have been considered and updated as required
  • test suite passes (locally as a minimum)
  • peer reviewed with review recorded

Neither of the following is applicable, since this pull request relates to intermediate functions for the impute method:

  • installable with all dependencies recorded
  • other forms of testing such as end-to-end and user-interface testing have been considered and updated as required

@AntonZogk AntonZogk requested a review from Jday7879 May 15, 2024 12:57
@AntonZogk
Collaborator Author

  • zerofy_values updates the input dataframe in place; let me know if you think it should return a new series or dataframe instead
  • I included some basic scenarios in the unit tests because at the moment we don't have a clear structure, and adding csv files to the tests would start getting messy. We can easily add more scenarios as needed
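
For reference, the trade-off in the first bullet can be sketched as follows. This is only an illustration under stated assumptions: the function names, the column names, and the use of `DataFrame.query` to evaluate the expression are hypothetical and are not taken from the zerofy_values implementation in this pull request.

```python
import pandas as pd


def zerofy_in_place(df: pd.DataFrame, target: str, expr: str) -> None:
    """Hypothetical: set `target` to 0 on rows matching `expr`, mutating `df`."""
    matching = df.query(expr).index
    df.loc[matching, target] = 0


def zerofy_copy(df: pd.DataFrame, target: str, expr: str) -> pd.DataFrame:
    """Hypothetical: apply the same rule to a copy, leaving the input untouched."""
    out = df.copy()
    zerofy_in_place(out, target, expr)
    return out


df = pd.DataFrame(
    {"question": [100, 0, 300], "identifier": [20001, 20002, 20003]}
)

# Copying variant: the caller's dataframe keeps its original values.
new_df = zerofy_copy(df, "question", "identifier != 20001")
original_untouched = df["question"].tolist()  # still [100, 0, 300]
copied_values = new_df["question"].tolist()

# In-place variant: the caller's dataframe itself is changed.
zerofy_in_place(df, "question", "identifier != 20001")
mutated_values = df["question"].tolist()
```

The copying variant costs an extra dataframe in memory but keeps the function free of side effects, which tends to make unit tests easier to reason about.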

@AntonZogk AntonZogk marked this pull request as ready for review May 15, 2024 13:08
Collaborator

@Jday7879 Jday7879 left a comment

Great work Anton, I've left a few suggestions and comments. Happy to discuss further

    def test_basic_filter(self):
        """Test a basic filter, filters questions with identifier different to 20001"""

        expected = pd.DataFrame(
Collaborator

It looks like the bulk of this dataframe is used a few times in the unit tests. I agree that having fewer files to load makes the repo cleaner, but having one csv which has the setup would remove the repeated setup. I also appreciate that the "question" column changes depending on the filter applied which makes it more difficult to use a csv

Collaborator Author

I agree with you about this. Another point to consider, according to the pytest docs:

"Something to be aware of when grouping tests inside classes is that each test has a unique instance of the class. Having each test share the same class instance would be very detrimental to test isolation and would promote poor test practices"

I'm not sure about the appropriate path to save it, though. Perhaps we should save them in a test_data folder?
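
One way to share the setup without giving up test isolation is a function-scoped pytest fixture, which is rebuilt for every test. The sketch below is an assumption-laden illustration: `make_base_df`, `base_df`, the class name, and the column values are hypothetical and do not come from the tests in this pull request. The builder could equally read a csv from a test_data folder.

```python
import pandas as pd
import pytest


def make_base_df() -> pd.DataFrame:
    """Hypothetical shared setup; could instead load a csv from test_data."""
    return pd.DataFrame(
        {"identifier": [20001, 20002], "question": [10.0, 20.0]}
    )


@pytest.fixture
def base_df() -> pd.DataFrame:
    # Function scope is the default, so each test receives a fresh dataframe.
    return make_base_df()


class TestFilters:
    def test_mutation_stays_local(self, base_df):
        base_df.loc[0, "question"] = 0.0
        assert base_df["question"].tolist() == [0.0, 20.0]

    def test_sees_fresh_data(self, base_df):
        # Not affected by the mutation made in the other test.
        assert base_df["question"].tolist() == [10.0, 20.0]
```

Tests that need a different "question" column can modify their own copy of the fixture's dataframe, which keeps the common columns defined in one place.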

src/forward_link.py Outdated
src/forward_link.py Outdated
src/forward_link.py Outdated

link.replace(np.nan, 1, inplace=True) # set defaults

return link
Collaborator

Other functions return a dataframe, but I think the point about returning a dataframe or a series is a good one and should probably be discussed (i.e. how all of the functions will link together). Happy for this to be left as is and a ticket added to the backlog, as other functions might need to be refactored
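
To make the series-versus-dataframe question concrete, here is a minimal sketch. It assumes a ratio-style link, the sum of the target over the sum of the predictor within each stratum, with undefined links defaulting to 1 as in the snippet above. The function name `sketch_link`, its parameters, and the column names are hypothetical; this is not the get_link function from this pull request.

```python
import pandas as pd


def sketch_link(
    df: pd.DataFrame, strata: str, target: str, predictor: str
) -> pd.Series:
    """Hypothetical ratio link: sum(target) / sum(predictor) per stratum."""
    sums = df.groupby(strata)[[target, predictor]].transform("sum")
    link = sums[target] / sums[predictor]
    # A 0/0 ratio yields NaN; fall back to the default link of 1.
    return link.fillna(1).rename("link")


df = pd.DataFrame(
    {
        "strata": ["a", "a", "b", "b"],
        "current": [2.0, 4.0, 0.0, 0.0],
        "previous": [1.0, 2.0, 0.0, 0.0],
    }
)

# Series return: aligned to the input index, easy to assign as a new column.
as_series = sketch_link(df, "strata", "current", "previous")

# Dataframe return: the same values wrapped in a one-column dataframe.
as_frame = as_series.to_frame()
```

Returning a series leaves it to the caller to decide how to attach the result, whereas returning a dataframe matches the other functions mentioned in this thread; either can be converted to the other cheaply.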

tests/test_forward_link.py Outdated
@AntonZogk AntonZogk marked this pull request as draft May 17, 2024 09:26
@AntonZogk AntonZogk marked this pull request as ready for review May 17, 2024 16:10
@AntonZogk
Collaborator Author

I made some changes to the test approach. Also, please ignore mask_values, since it needs further discussion

@AntonZogk AntonZogk changed the title from "Add function for forward, backward link" to "326 forward link" May 20, 2024
Collaborator

@Jday7879 Jday7879 left a comment

One minor point to consider. I'm happy for it to be committed as is, but it might be worth raising with another team member to get their input

src/forward_link.py Outdated
tests/test_forward_link.py
@AntonZogk AntonZogk requested a review from Jday7879 May 21, 2024 10:16
Collaborator

@Jday7879 Jday7879 left a comment

Happy for code to be merged following minor changes. Nice work Anton!

@AntonZogk AntonZogk merged commit 0247a01 into main May 22, 2024
3 checks passed
@AntonZogk AntonZogk deleted the 326-forward-link branch May 22, 2024 09:53
@AntonZogk AntonZogk mentioned this pull request Jun 10, 2024
10 tasks
2 participants