Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Enable better filtering for missing values #112

Closed
1 task done
michellewang opened this issue Nov 21, 2023 · 4 comments · Fixed by #128
Closed
1 task done

[ENH] Enable better filtering for missing values #112

michellewang opened this issue Nov 21, 2023 · 4 comments · Fixed by #128
Assignees

Comments

@michellewang
Copy link
Collaborator

michellewang commented Nov 21, 2023

Is there an existing issue for this?

  • I have searched the existing issues

New feature

The default Plotly Dash filtering options does not allow the user to select for missing values, which means they cannot check which subject-session record have missing values for a column of interest.

This is mainly relevant for the phenotypic bagel, since the schema does not specify missing values for columns unlike the imaging bagel.

@alyssadai proposed (on Slack) to fill missing values with some hardcoded value ("", "n/a"). This would probably work except there might be confusion if the raw data itself already has a dedicated NA value. It is also unclear to me what the behaviour would be in non-text columns: for example, adding empty strings to a numerical column might make it a text column instead of numerical (?)

Another potential solution would be to not allow any missing values in the input file (i.e., force the user to do their own filling in of missing values). So the users would need to choose how to denote their own text/numerical/datetime missing values

@alyssadai alyssadai moved this to Backlog in Neurobagel Nov 27, 2023
@surchs surchs moved this from Backlog to Specify - Done in Neurobagel Dec 6, 2023
@alyssadai alyssadai moved this from Specify - Done to Implement - Track in Neurobagel Dec 18, 2023
@alyssadai
Copy link
Collaborator

Blocked because of Alyssa's time spent on CLI/modeling issues for now

@alyssadai alyssadai moved this from Implement - Track to Implement - Active in Neurobagel Jan 15, 2024
@alyssadai alyssadai self-assigned this Jan 15, 2024
@alyssadai
Copy link
Collaborator

alyssadai commented Jan 15, 2024

Hey @michellewang, thanks for your patience on this feature!

So, after much digging and trial & error I discovered that there is actually a special, non-documented (even chatgpt didn't have the right answer 🙃) syntax for filtering missing values, which is is blank or is nil.

Given this, I think the simplest solution for now would be to add a text element (maybe a collapsible element?) near the datatable with explicit instructions on filtering for missing values + a link to the rest of the filtering syntax. This also avoids having to modify the underlying data in any way.

Notably, it appears that there's currently no syntax for the opposite, filtering for non-missing values. Assuming this would also be helpful, I can try to implement as part of this issue custom filtering logic for the query is not blank to accomplish this - I suspect this might be most intuitive given the above.

Let me know what you think!

For internal documentation's sake, some issues/posts that led to the answer:

@alyssadai
Copy link
Collaborator

A quick update on this. I've realized that supporting custom filtering syntax for non-missing values (e.g., is not blank) would most likely require switching entirely from the native filtering to custom backend filtering for all the current operators, since having a mix of the built-in filtering behaviour and a custom column filtering behaviour is not natively supported and would be quite complex to coordinate.

Given the growing limitations of Dash DataTable for our purposes, I opened an issue for migrating to Dash AG Grid which provides much more advanced filtering functionality natively (incl. filtering for blank and non-blank values) as well as other advantages that likely make it a better long term solution for digest.

TLDR

@michellewang
Copy link
Collaborator Author

Thanks for looking into this! Great to hear about is blank -- it's surprising that this is not documented so maybe it wouldn't be good to rely on it in any case.

I briefly checked out Dash AG Grid and it looks like switching to it would be the way to go. Hopefully it's not too much work to do!

@surchs surchs moved this from Implement - Done to Review - Active in Neurobagel Jan 17, 2024
@github-project-automation github-project-automation bot moved this from Review - Active to Review - Done in Neurobagel Jan 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
2 participants