The following code is a syntax error in Python because of the line break before the colon, but tree-sitter-python parses it as valid code:
def foo(x)
:
    return x + 2
This happens because \s is included in the extras parameter[1], telling tree-sitter to ignore whitespace (and therefore newlines) between any two tokens.
Replacing \s by \t in extras causes tree-sitter-python to correctly reject newlines such as the one above[2]. However, after doing so it no longer handles newlines correctly inside brackets. Consider the following valid Python:
a = (
    1 + 2
)
This fails to parse because tree-sitter does not expect newlines at the end of lines 1 and 2. The scanner.cc logic that ignores line breaks inside bracketed expressions depends on a close bracket being a valid token[3], which it is not immediately after an open paren or the plus operator.
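As a quick sanity check on the two snippets above, CPython's own parser can be asked whether each one is valid. This is only a minimal sketch: the helper name `is_valid_python` and the two constants are made up for illustration, and it says nothing about what tree-sitter does with the same input.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if CPython's parser accepts `source` as a module."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Line break before the colon: expected to be rejected by CPython.
BROKEN = "def foo(x)\n:\n    return x + 2\n"

# Line breaks inside parentheses: expected to be accepted by CPython.
BRACKETED = "a = (\n    1 + 2\n)\n"

print(is_valid_python(BROKEN))     # False
print(is_valid_python(BRACKETED))  # True
```

A grammar that matches CPython would have to reject the first string and accept the second.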
Is disallowing arbitrary newlines in general while permitting them inside brackets something that is possible to accomplish with tree-sitter?
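For comparison, CPython's tokenizer makes exactly this distinction: it emits a NEWLINE token at the end of a logical line and an NL token for a line break that does not end one, such as a break inside brackets. The sketch below uses the standard `tokenize` module to show this on the bracketed example; the helper `newline_kinds` is a hypothetical name, and whether the same bracket-tracking approach fits tree-sitter's external scanner is an open assumption, not something this demonstrates.

```python
import io
import tokenize


def newline_kinds(source: str) -> list[str]:
    """Return the token-type name of every newline token in `source`, in order."""
    kinds = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # NEWLINE ends a logical line; NL is a line break that does not.
        if tok.type in (tokenize.NL, tokenize.NEWLINE):
            kinds.append(tokenize.tok_name[tok.type])
    return kinds


BRACKETED = "a = (\n    1 + 2\n)\n"

print(newline_kinds(BRACKETED))  # ['NL', 'NL', 'NEWLINE']
```

The two line breaks inside the parentheses are reported as NL, and only the one after the closing paren is a NEWLINE that ends the statement.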
[2] To avoid rejecting all empty lines we'd also have to replace module: $ => repeat($._statement) with something like module: $ => repeat(choice($._statement, /\r?\n/))
I thought the policy of tree-sitter was to parse all valid code, but also to make as much sense as possible of invalid code. Your examples show, though, that valid code would get rejected as well. If there is a solution that fixes the failure cases but still accepts some invalid code, maybe that should be favored over one that requires more stateful scanner logic.
[1] tree-sitter-python/grammar.js, line 32 in b14614e
[3] tree-sitter-python/src/scanner.cc, line 157 in b14614e