Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: adds filtering via query params for /sample and /subject #26

Merged
merged 1 commit into from
Nov 17, 2023

Conversation

claymcleod
Copy link
Collaborator

@claymcleod claymcleod commented Nov 5, 2023

PR Close Date: November 17, 2023

This pull request adds filtering using query parameters to the /sample and /subject endpoints.

Parameter Naming

  • I have embedded all of the harmonized keys as top-level objects that can be filtered (?sex=F, ?race=Unknown, etc). This is because we can control the namespace for these values and ensure that they don't collide with other top-level query parameters (page and per_page, for example). It is also more convenient to not have to prefix everything with metadata..
  • Unharmonized keys currently require the prefix metadata.unharmonized. This is, again, to namespace the keys.

I am not 100% sure this is the right approach, but I wanted to put it out there and see what people thought.

Filtering

All harmonized (top-level) and unharmonized (nested under the metadata.unharmonized key) metadata fields are filterable. To achieve this, you can provide the field name as a string.

Filtering follows the following rules:

  • For single-value metadata field, the sample is included in the results if its value exactly matches the query string. Matches are case-sensitive.
  • For multiple-value metadata fields, the sample is included in the results if any of its values for the field exactly match the query string (a logical OR [||]). In the rare case an empty array is provided, no values can match, so the entity will always be filtered out. Matches are case-sensitive.
  • When no metadata field is provided (i.e., the value of the metadata key is null) or the metadata is present but the field in question is null (possible for both singular and multiple-valued metadata fields).
  • When multiple fields are provided as filters, a logical AND (&&) string together the predicates. In other words, all filters must match for a sample to be returned. Note that this means that servers do not natively support logical OR (||) across multiple fields: that must be done by calling this endpoint with each of your desired queries and performing aset union of those samples out of band.

Value Matching

At first, I implemented the filtering as required the provided parameter to be a substring of the entity's value (as opposed to exact matching, which is what I settled on for this PR). This became problematic when, for instance, I was filtering for ?sex=F and, because UNDIFFERENTIATED contains an F, those were coming back in my results.

For now, I have also required case-sensitive matching. I wanted to hear from folks whether they thought this was the right approach. Briefly, since we are requiring exact matching for the reason I listed above, I felt that keeping the exact casing was in step with that.

I think exact matching is a fine enough strategy for now—I would not, however, be opposed to adding case-insensitive and inexact matching in the future if there is interest. I do think those two concepts go hand in hand so, if we want to implement them, we should do it at the same time.

Before submitting this PR, please make sure:

  • You have added a few sentences describing the PR here.
  • You have added yourself or the appropriate individual as the assignee.
  • You have added the relevant groups/individuals to the reviewers.
  • Your commit messages conform to the Conventional Commits standard.
  • You have updated the README or other documentation to account for these changes (when appropriate).

@claymcleod claymcleod added the api-enhancement New feature or request label Nov 5, 2023
@claymcleod claymcleod self-assigned this Nov 5, 2023
@claymcleod claymcleod changed the title feat: adds filtering query params feat: adds filtering via query params for /sample and /subject Nov 5, 2023
@asafieva
Copy link
Collaborator

asafieva commented Nov 7, 2023

@claymcleod Could we get an example of a URL with "OR" condition filtering construct? Say a property "disease_phase" defined by "cde.v1.sample.DiseasePhase" for a subset of some values as ["Relapse/Progression" or "Initial Diagnosis"] taken from CDE 12217251?

@claymcleod claymcleod changed the base branch from main to feat/pagination November 9, 2023 15:37
@claymcleod
Copy link
Collaborator Author

claymcleod commented Nov 9, 2023

@asafieva, take a look at this comment in the spec. In short, if you want the server to do a logical OR for you, that would not be supported in this model. Of course, you can always do a logical OR yourself by running multiple queries and then doing the set union of the results on the client side.

This was an intentional decision to keep things as simple as possible. However, I'm open to designing a more complex filtering system if we feel we need that. I do hesitate to get too complex right off of the bat, because that adds a decent amount of maintenance overhead to all of the servers to support such a scheme.

Copy link
Collaborator

@e-t-k e-t-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the unprefixed harmonized keys vs metadata.harmonized.
Also strongly agree with exact match & case sensitive only; we can add more versatile filters later.

swagger.yml Outdated Show resolved Hide resolved
@claymcleod claymcleod force-pushed the feat/add-query-params branch 6 times, most recently from 2e1023b to e2a7154 Compare November 17, 2023 01:55
Base automatically changed from feat/pagination to main November 17, 2023 20:37
@claymcleod claymcleod merged commit d1c4ccf into main Nov 17, 2023
5 checks passed
@claymcleod claymcleod deleted the feat/add-query-params branch November 17, 2023 21:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants