-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(parser): ensure all loops advance parsing, fuzz with arbitrary bytes #828
Conversation
This also fixes a case where bogus input was accepted, and did not raise a parse error: ```graphql schema { query: Query { mutation: Mutation { subscription: Subscription } ```
…irective locations parser
p.err("expected at least one Selection in Selection Set"); | ||
// If there is no token, | ||
None => { | ||
p.err_and_pop("expected an Inline Fragment or a Fragment Spread"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing this from p.err()
to p.err_and_pop()
fixes the infinite loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could a similar infinite loop could easily happen if another call to p.err()
is added elsewhere in the future that should be p.err_and_pop()
instead? What would be a good place to warn against this gotcha? The doc-comment for p.err()
may be easy to miss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a good answer for this, I will merge and file an issue
crates/apollo-parser/test_data/parser/err/0054_root_operation_type_with_extra_brackets.graphql
Show resolved
Hide resolved
p.err("expected at least one Selection in Selection Set"); | ||
// If there is no token, | ||
None => { | ||
p.err_and_pop("expected an Inline Fragment or a Fragment Spread"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could a similar infinite loop could easily happen if another call to p.err()
is added elsewhere in the future that should be p.err_and_pop()
instead? What would be a good place to warn against this gotcha? The doc-comment for p.err()
may be easy to miss.
This addresses an issue where the parser could get stuck in a loop if the token limit was reached at a specific point. Most egregiously, the parser would get in an infinite loop if the token limit was reached in the middle of a fragment spread:
{ ... <LIMIT> fragmentName }
.To catch cases like this, this PR introduces
p.peek_while()
, which replaceswhile let Some() = p.peek()
in the parser. In debug modepeek_while()
asserts that parsing was advanced by the iteration. My current solution to cases that did not advance parsing is to change error reports to usep.err_and_pop()
, which consumes the token that caused the error. This probably isn't always the most correct thing for error-tolerance, since the token may be a closing token or something that could be interpreted more optimally, but we're not yet diligent about that to begin with and running into an infinite loop is worse :)p.peek_while()
requires you returnControlFlow
, which lets you break out of the loop when the peeked token did not match an expected token.p.peek_while_kind()
supports a simpler case where you expect one specific token kind to come up every time. If that token kind is spotted, you must parse at least that token. You can't break out of the loop early, it will end when the condition doesn't match anymore.p.parse_separated_list()
supports the various separated lists with optional prefix in the spec:& Interface
,| Union
, and| DirectiveLocation
. Using this also makes directive location parsing no longer recursive (it was recursive but didn't have a recursion limit until now! 😱 )Root operation type parsing is no longer recursive.
This also adds a fuzz test using completely arbitrary strings as input to the parser--all our other fuzz tests generate a document with apollo-smith, which is useful in its own way, but doesn't do a great job of finding issues in error edge cases like this. The new fuzz tests has a token limit as well. I found a few more cases that could panic with this.