Add basic but flexible query language #308
…eries Using lark we establish grammar for the query, so we could do something like ((jim AND NOT haxby AND "important\" paper") OR ds_id:~"^000[3-9]..$" OR url:"example.com") AND metadata:non AND metadata[ex1,ex2]:"specific data" AND metadata[extractor2]:data so composition of booleans and queries within specific extractors etc
Also now committed with fixup pre-commit so isort kicked in etc
61144ea to e2671f9
ffd5549 to 62d884c
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #308 +/- ##
==========================================
- Coverage 98.73% 98.67% -0.06%
==========================================
Files 50 52 +2
Lines 2368 2492 +124
==========================================
+ Hits 2338 2459 +121
- Misses 30 33 +3
I don't know lark, so I just proofread the grammar in the syntax description.
If you are going to refer to this "search" scheme as "query" in the web API, I think it would be more consistent to replace "search" in the naming and references related to this scheme. We can rename search.py and test_search.py to query.py and test_query.py.
…help for search syntax Co-authored-by: Isaac To <[email protected]> Co-authored-by: John T. Wodder II <[email protected]>
I saw that coverage reported that
which I cannot figure out -- why our NOT is not working as it should -- it just returns empty lists :-/
def _get_str_search(self, arg: Token) -> ColumnElement[bool]:
    value = self._get_str_value(arg)
    return or_(*[f(value) for f in known_fields.values()])
Any ideas would be appreciated; otherwise it might be worth disabling NOT for such cases, or altogether, for now.
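One possible cause (not confirmed anywhere in this thread) is SQL's three-valued logic: if any of the OR-ed columns is NULL for a row, the OR can evaluate to NULL, and NOT NULL is still NULL, so the row is dropped. Below is a minimal sketch using the standard-library `sqlite3` module. The `repo` table, its two columns, and the sample rows are hypothetical and only stand in for the real schema.

```python
import sqlite3

# Hypothetical two-column table; the real schema in the PR is not shown here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repo (url TEXT, tags TEXT)")
conn.executemany(
    "INSERT INTO repo VALUES (?, ?)",
    [
        ("example.com/a", "haxby"),
        ("example.com/b", "other"),
        ("example.com/c", None),  # tags is NULL for this row
    ],
)


def urls(where: str) -> list[str]:
    """Return the URLs of rows matching the given WHERE clause."""
    rows = conn.execute(f"SELECT url FROM repo WHERE {where} ORDER BY url")
    return [r[0] for r in rows]


# NOT over an OR across columns: for row c, `url LIKE` is false and
# `tags LIKE` is NULL, so the OR is NULL, NOT NULL is NULL, and the row
# is filtered out even though it never mentions "haxby".
naive = urls("NOT (url LIKE '%haxby%' OR tags LIKE '%haxby%')")

# Coalescing each NULL to an empty string makes every comparison two-valued.
guarded = urls(
    "NOT (COALESCE(url, '') LIKE '%haxby%' OR COALESCE(tags, '') LIKE '%haxby%')"
)

print(naive)    # row c is missing here
print(guarded)  # row c is kept here
```

If every row has at least one NULL among the searched fields, the naive form would return nothing at all, which would match the empty lists described above. Whether that is what happens here would need checking against the actual data.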
yeah, I ran into it often too. Added a 1em margin to space it out now.
yes, we can name it
This space should not be there, since all JSON data are in compact form in text. That space doesn't exist in the `BIDSVersion` specification.
A list is not needed here, and an iterator is more efficient.
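The thread does not show which line this comment targets, so the following is only a generic sketch of the pattern. `or_all` and `known_fields` are hypothetical stand-ins (the former for a variadic combinator such as SQLAlchemy's `or_`), used so the example runs with the standard library alone.

```python
def or_all(*flags: bool) -> bool:
    """Stand-in for a variadic combinator such as SQLAlchemy's or_()."""
    return any(flags)


# Hypothetical field matchers, for illustration only.
known_fields = {
    "url": lambda v: "example" in v,
    "tags": lambda v: v.startswith("v"),
}

value = "example.com"

# Builds an intermediate list, then unpacks it into the call.
with_list = or_all(*[f(value) for f in known_fields.values()])

# Unpacks a generator expression directly; no intermediate list is built.
with_generator = or_all(*(f(value) for f in known_fields.values()))
```

Both calls return the same result. Star-unpacking still collects the arguments into a tuple for the call, so the saving is limited to the intermediate list.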
`filter` is a built-in name. `criteria` is the corresponding parameter that is receiving the value
@candleindark I "merged" your PR. all good for a merge/deploy? ;)
These names are more informative. Additionally, doc strings for these tests are also updated.
These names are appropriate now because of the context provided by the containing class.
known_fields_RepoUrl_1to1 = ["url", "ds_id", "head", "head_describe", "tags"]

def get_ilike_search(model, field: str, value: str):

Suggested change:
- def get_ilike_search(model, field: str, value: str):
+ def _get_ilike_expr(model, field: str, value: str):
This name is more consistent with SQLAlchemy's convention. I also feel that this function depends heavily on the context of the containing file and should be made private.
    return getattr(model, field).ilike(_escape_for_ilike(value), escape=escape)

def get_metadata_ilike_search(value):

Suggested change:
- def get_metadata_ilike_search(value):
+ def _get_metadata_ilike_expr(value):
This name is more consistent with SQLAlchemy's convention. I also feel that this function depends heavily on the context of the containing file and should be made private.
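The diff context above passes `_escape_for_ilike(value)` together with an `escape` argument to `ilike`. The helper's body is not shown in this thread, so the following is only a sketch of what such escaping typically looks like. The function names and the backslash escape character are assumptions, not the PR's code.

```python
ESCAPE_CHAR = "\\"  # assumed escape character; the PR's actual choice is not shown


def escape_for_like(value: str, escape: str = ESCAPE_CHAR) -> str:
    """Escape LIKE wildcards so that user input is matched literally."""
    # Escape the escape character first, so the characters inserted
    # below are not escaped a second time.
    value = value.replace(escape, escape + escape)
    for wildcard in ("%", "_"):
        value = value.replace(wildcard, escape + wildcard)
    return value


def contains_pattern(value: str) -> str:
    """Wrap the escaped value in % so it matches anywhere in the column."""
    return f"%{escape_for_like(value)}%"


print(contains_pattern("100%_done"))
```

Without this step, a search for `ds_001` would treat `_` as "any single character" and also match values such as `dsX001`.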
    )

def get_branches_ilike_search(value):

Suggested change:
- def get_branches_ilike_search(value):
+ def _get_branches_ilike_expr(value):
This name is more consistent with SQLAlchemy's convention. I also feel that this function depends heavily on the context of the containing file and should be made private.
    super().__init__(*args, **kwargs)
    # Additional initialization here if needed

def or_search(self, *args) -> ColumnElement[bool]:

Suggested change:
- def or_search(self, *args) -> ColumnElement[bool]:
+ def or_exp(self, *args) -> ColumnElement[bool]:
This name is more in line with SQLAlchemy's convention.
    # args will be a list of the arguments to the OR operation
    return or_(*args)

def and_search(self, *args) -> ColumnElement[bool]:

Suggested change:
- def and_search(self, *args) -> ColumnElement[bool]:
+ def and_exp(self, *args) -> ColumnElement[bool]:
This name is more in line with SQLAlchemy's convention.
those must correspond to the names of lark grammar -- defined above at
I don't mind changing it here and there, but then go ahead, make the change, and send a PR so we make sure that it works.
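To illustrate why the method names cannot be changed in isolation, here is a toy, standard-library-only imitation of name-based dispatch. It is not lark's implementation. `MiniTransformer` and the tuple-based tree are made up for this sketch, and the rule name `not_search` is an assumption, since the thread only shows `or_search` and `and_search`.

```python
class MiniTransformer:
    """Toy bottom-up transformer that dispatches on rule names.

    It only imitates the name-based dispatch used by lark's Transformer.
    """

    def transform(self, node):
        if not isinstance(node, tuple):
            return node  # leaf value
        rule, *children = node
        args = [self.transform(child) for child in children]
        handler = getattr(self, rule, None)
        if handler is None:
            raise AttributeError(f"no method named after rule {rule!r}")
        return handler(*args)


class BoolEval(MiniTransformer):
    # Method names mirror the rule names used in the tree below.
    def or_search(self, *args):
        return any(args)

    def and_search(self, *args):
        return all(args)

    def not_search(self, arg):  # assumed rule name
        return not arg


tree = ("and_search", True, ("not_search", ("or_search", False, False)))
result = BoolEval().transform(tree)
```

Renaming `or_search` to `or_exp` in the class without also renaming the rule breaks the lookup. The toy raises in that case; lark itself, as far as I know, leaves unmatched rules as plain tree nodes, which can be harder to notice.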
    # args will be a single-element list containing the NOT argument
    return not_(arg)

def get_field_select_search(

Suggested change:
- def get_field_select_search(
+ def get_field_select_expr(
This name is more in line with SQLAlchemy's convention.
this and below -- I don't mind, but make a complete change/refactoring and send it as a PR or a commit, making sure that everything is renamed and works.
    raise ValueError(f"Unexpected number of args: {len(args)} in {args}")

if field_l.data == "field_select":
    search = partial(self.get_field_select_search, *field_l.children)

Suggested change:
- search = partial(self.get_field_select_search, *field_l.children)
+ get_query_exp = partial(self.get_field_select_search, *field_l.children)
This name is more in line with SQLAlchemy's convention.
    else:
        raise TypeError(arg.type)

def _get_str_search(self, arg: Token) -> ColumnElement[bool]:

Suggested change:
- def _get_str_search(self, arg: Token) -> ColumnElement[bool]:
+ def _get_str_query_exp(self, arg: Token) -> ColumnElement[bool]:
This name is more consistent with the SQLAlchemy convention.
if not extractors:
    raise GrammarValueError(
        f"No extractors were specified in {metadata_extractors_l}"
    )
I tried, but couldn't come up with a test case for these lines. The coverage of these lines is the only thing I want before merging.
I think they are not even reachable per se, since the lark grammar wouldn't allow such a case. So it is more of an AssertionError here in particular. But I take it as defensive coding, written under the assumption that we might get there. I would have just added a coverage skip here.
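A sketch of what such a coverage skip could look like, assuming coverage.py's default `# pragma: no cover` exclusion marker. `check_extractors` and this `GrammarValueError` definition are hypothetical stand-ins, since the real method and exception class are not shown in full here.

```python
class GrammarValueError(ValueError):
    """Stand-in for the PR's exception class; its real base class is not shown."""


def check_extractors(extractors: list[str], source: str) -> list[str]:
    """Return the extractor names, guarding against an empty list."""
    # Putting the pragma on the `if` line excludes the whole branch from
    # the coverage report, so the believed-unreachable guard does not
    # count as a miss.
    if not extractors:  # pragma: no cover
        raise GrammarValueError(f"No extractors were specified in {source}")
    return extractors


print(check_extractors(["ex1", "ex2"], "metadata[ex1,ex2]"))
```

The guard keeps working at runtime; only the coverage accounting changes.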
The class is unneeded because all tests in the containing file are for testing the search functionality.
Modify tests organization
As @yarikoptic suggested, let's merge this PR now. The remaining "issues" don't affect the functionality. They are to be addressed in #320.
An original target fake query which has various features was
and this PR implements most of them -- check the modal window description of supported syntax:
/search