Make it possible to disallow slow queries (or warn user about the problem) #315
"Some of the rules would require a Yara scan of every indexed "
"file, and this is not allowed by this instance. "
f"Problematic rules: {degenerate_rule_names}. "
f"Read {doc_url} for more details."
The link is not clickable; I guess it should be :)
Not possible with an error message (without severe hacks, like putting HTML code into a Python exception, or - even worse - parsing the error message on the frontend).
I guess I'll do it the harder way, and put the related alert logic into the frontend. This will require some changes, so I'll address this a bit later (in this PR).
Sometimes users really want to just list all the files that have a certain imphash value, and it's very important for their research. (I hope this falls under "degenerate query".)
Other times, users don't really understand the problem. Combining warn-only mode with #297 will make sure that even if the user really wants to proceed, they can, and it will not be too bad for the system (thanks to the file-limit).
@@ -335,11 +335,28 @@ def query(
    rule_author=rule.author,
    is_global=rule.is_global,
    is_private=rule.is_private,
    is_degenerate=rule.parse().is_degenerate,
We could probably call rule.parse() just once, somewhere above? In case it gets expensive in the future.
The performance hit is negligible, and even if it wasn't, the result is already cached:
def parse(self) -> UrsaExpression:
    if self.__parsed is None:
        self.__parsed = self.__parse_internal()
    return self.__parsed
I worry more about readability. There's no way to assign it to a temporary variable in a list comprehension, but I can rewrite this all to an explicit for loop if you prefer 🤔 It would look like this:
response_rules = []
for rule in rules:
    parsed_rule = rule.parse()
    response_rules.append(
        ParseResponseSchema(
            rule_name=rule.name,
            rule_author=rule.author,
            is_global=rule.is_global,
            is_private=rule.is_private,
            is_degenerate=parsed_rule.is_degenerate,
            parsed=parsed_rule.query,
        )
    )
return response_rules
I prefer the current version, but I'm not overly attached to it if you prefer this one.
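For reference, a third option keeps the comprehension and still binds the parsed result once: a single-element inner `for` clause (or, on Python 3.8+, an assignment expression). The sketch below is only an illustration, not code from this PR; `Rule`, `Parsed` and `build_response` are hypothetical stand-ins, and plain dicts replace `ParseResponseSchema`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Parsed:
    """Hypothetical parse result with the two fields used in the thread."""
    query: str
    is_degenerate: bool


class Rule:
    """Hypothetical stand-in for the rule object; counts parse() calls."""

    def __init__(self, name: str, query: str, is_degenerate: bool) -> None:
        self.name = name
        self._result = Parsed(query, is_degenerate)
        self.parse_calls = 0

    def parse(self) -> Parsed:
        self.parse_calls += 1
        return self._result


def build_response(rules: List[Rule]) -> List[dict]:
    # The inner single-element "for" binds parsed_rule once per rule,
    # so parse() is called exactly once for each entry.
    return [
        {
            "rule_name": rule.name,
            "is_degenerate": parsed_rule.is_degenerate,
            "parsed": parsed_rule.query,
        }
        for rule in rules
        for parsed_rule in [rule.parse()]
    ]
```

Whether this reads better than the explicit loop is a matter of taste; the explicit loop is arguably easier to scan.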
Co-authored-by: Michał Praszmo <[email protected]>
It does :). No viable ngrams == "degenerate".
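To make "no viable ngrams" concrete, here is a deliberately simplified sketch. It assumes a plain trigram index and a rule that requires all of its patterns to match; the real parser and index types are not shown in this thread, and `trigrams` and `is_degenerate` are hypothetical helper names.

```python
from typing import List, Set


def trigrams(pattern: bytes) -> Set[bytes]:
    """Return every 3-byte window of the pattern (empty if shorter than 3)."""
    return {pattern[i:i + 3] for i in range(len(pattern) - 2)}


def is_degenerate(patterns: List[bytes]) -> bool:
    """Assumption: all patterns must match, so a single pattern with at
    least one trigram is enough to narrow the candidate set. The rule is
    degenerate only when no pattern yields any trigram at all."""
    return not any(trigrams(p) for p in patterns)
```

Under these assumptions a rule built only from two-byte strings cannot be narrowed by the index, so every indexed file would have to be scanned.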
ACK, thanks for the input. I'll rework the code then to just warn the user in the 'non-enforcing' mode.
So 3 modes? Allow/Ignore, Warn, Prevent/Block?
it can also be nice if file-limit can be configured to work:
I was considering it, but I think just two modes - So in short, I don't see the application for
Haven't thought of that yet. I don't immediately like the idea, because:
But this is probably a discussion for #297.
@nazywam @ITAYC0HEN can you re-review? Changes since the last time:
Also I've sneakily fixed broken CSS in the dropdowns. Things I didn't change:
LGTM overall. Didn't check, but I hope that this feature works smartly with API requests as well, in some intuitive way.
Don't know about smartly, but it works:
Your checklist for this pull request
What is the current behaviour?
#308
What is the new behaviour?
If the configuration option query_disallow_degenerate is set to "true", users won't be able to run any queries that would end up doing a whole backend scan. Of course they can still run very inefficient queries like "aaa" | "bbb" | "ccc" | "ddd" or just {00 00 00}, so this won't help in every case, but that's all we can do at this stage (others will be handled by #297).
I know this is not exactly what #308 asked for. I wonder if anyone really wants to use the "soft" mode (warn only, don't reject the query). If not, I think it's good as just a configuration option.
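The two behaviours discussed here (reject vs. warn only) can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `ParsedRule`, `DegenerateQueryError` and `check_rules` are hypothetical names, and only the error text and the meaning of `query_disallow_degenerate` come from this page.

```python
from typing import List, NamedTuple


class ParsedRule(NamedTuple):
    """Hypothetical minimal view of a parsed rule."""
    name: str
    is_degenerate: bool


class DegenerateQueryError(Exception):
    """Hypothetical error raised when enforcement is enabled."""


def check_rules(
    rules: List[ParsedRule], disallow_degenerate: bool, doc_url: str
) -> List[str]:
    """Raise if degenerate rules are disallowed; otherwise return their
    names so the caller can show a warning instead."""
    degenerate_rule_names = [r.name for r in rules if r.is_degenerate]
    if disallow_degenerate and degenerate_rule_names:
        raise DegenerateQueryError(
            "Some of the rules would require a Yara scan of every indexed "
            "file, and this is not allowed by this instance. "
            f"Problematic rules: {degenerate_rule_names}. "
            f"Read {doc_url} for more details."
        )
    return degenerate_rule_names
```

Returning the names in the non-enforcing case leaves the presentation of the warning (for example a clickable documentation link) to the frontend.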
Closing issues
fixes #308