Tokenization: improve EOF and multi-line statement interaction #1961
… multi-line indicator
This reverts commit 2859791.
Maybe worth a little more pause here; looking at the current upstream |
Eh, ok - I'm going to mark this as ready for review again, having read up a bit more on the status of |
We've modified our version of |
Actually within a minute I created something that seems to hang Black?
test = (1
+
2
This is definitely a no-no even if it's invalid code (EOF in multi-line continuation) IMO.
Thanks @ichard26 - I'll take a look into that. |
… after a multi-line indicator" This reverts commit b2c6670.
This works in 'fast' mode but appears to introduce some code output equivalence issues
The latest pushed changes may appear to work based on the test coverage, but in fact there are code output equivalence issues. The following input file (with no closing end-of-line) produces a different abstract syntax tree during the second formatter pass:
(it works with the |
I think this is closable as a wont-fix; I'll add some more explanation soon in the linked issue. |
(in hindsight, assuming a wont-fix resolution wasn't sensible of me; this could still be addressed if there's appetite to) |
(just cleaning up and closing some old / stale pull requests that I've been working on. this turned out to be a more-involved fix than it looked like originally; some adjustments were made to the |
@jayaddison and we really appreciate it btw! We want to refactor the single-file Black package into a multi-module package to improve maintainability (see GH-1746). Unfortunately such a change would break basically every PR, so we're working on reducing the PR backlog. |
This is the kind of change that can cause subtle problems in future, so it's one that should be reviewed and considered carefully.
Currently during fuzz testing, `hypothesmith` is able to find some example cases where apparently-valid Python code that contains a multi-line indicator (`\` character) immediately prior to the EOF indicator (without a newline) fails during `black`'s tokenization stage.
This has been reported in #1012 and more recently #1960.
As per the notes in #1012, using code of this kind as input will successfully compile and run in Python.
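One way to check this kind of claim locally is to ask the interpreter directly. The sketch below is only an illustration: the helper name `compiles` and the sample string are hypothetical and are not taken from #1012. The result for the backslash-at-EOF sample is printed rather than assumed.

```python
def compiles(source: str) -> bool:
    """Return True if CPython accepts `source` as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Hypothetical sample (an assumption for illustration): a statement followed
# by a backslash and then EOF, with no trailing newline.
BACKSLASH_AT_EOF = "x = 1\\"

if __name__ == "__main__":
    print("backslash at EOF compiles:", compiles(BACKSLASH_AT_EOF))
    # For contrast: EOF inside an open bracket, as in the review comment above.
    print("unclosed bracket compiles:", compiles("test = (1\n+\n2"))
```

Running it under each interpreter version of interest shows whether the two kinds of input are treated differently.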
This changeset permits the `black` tokenizer to continue when an EOF token is encountered following a multi-line indicator.
Resolves #1012.