gh-125038: Crash after genexpr.gi_frame.f_locals manipulations is fixed #125178
…is fixed Some iterator checks are added for _FOR_ITER, _FOR_ITER_TIER_TWO and INSTRUMENTED_FOR_ITER bytecode implementations. TypeError is raised in case of tp_iternext == NULL. Tests on generator modifying through gi_frame.f_locals are added, both to genexpr generators and function generators.
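For context, a generator expression receives its outermost iterator as a hidden first argument, which CPython names `.0`. As far as I understand the linked issue, that is the local being replaced through gi_frame.f_locals. The sketch below only inspects the name and deliberately does not modify the frame; `gen` and `varnames` are illustrative names, not taken from the PR.

```python
# Inspect the hidden iterator argument of a generator expression.
# This only reads metadata; it does not touch gi_frame.f_locals.
gen = (x for x in range(10))

varnames = gen.gi_code.co_varnames   # local variable names of the genexpr code
argcount = gen.gi_code.co_argcount   # the genexpr takes exactly one argument

print(varnames, argcount)
```

The first entry of `varnames` is the implicit `.0` argument that FOR_ITER reads from on each step.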
Misc/NEWS.d/next/Core_and_Builtins/2024-10-09-13-53-50.gh-issue-125038.ffSLCz.rst
…e-125038.ffSLCz.rst Co-authored-by: Kirill Podoprigora <[email protected]>
Python/bytecodes.c
if (iternext == NULL) {
    _PyErr_Format(tstate, PyExc_TypeError,
                  "'for' requires an object with "
                  "__iter__ method, got %.100s",
This error is incorrect, as it's looking for __next__, not __iter__. I think we can use the same error as builtins.next():
PyErr_Format(PyExc_TypeError,
"'%.200s' object is not an iterator",
Py_TYPE(it)->tp_name);
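The wording suggested here can be checked from Python by calling next() on an object that is not an iterator. This is a small sketch; the helper name `next_error_message` is made up for illustration.

```python
def next_error_message(obj):
    """Return the TypeError message raised by next(obj), or None if it succeeds."""
    try:
        next(obj)
    except TypeError as exc:
        return str(exc)
    return None


message = next_error_message(1)   # an int has no __next__
print(message)
```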
I think this is the wrong approach. Rather than modifying Currently, the bytecode compiler assumes that local variables in generator expressions cannot be modified in The bytecode for
Compare that with the code for
Note the additional I think adding the
So, basically, we want to provide equal results for such code fragments:
Have I understood your idea correctly at a high level? IMHO, this change will be convenient.
Yes, I think those should provide the same results.
We also want to avoid emitting In other words, this function:
def f(s):
    return (x for x in s)
currently generates the following bytecode:
I think it should generate:
This is a behavior change, but I think a correct one. If
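The bytecode listings referred to in this comment did not survive on this page, and they are not reconstructed here. As a rough way to see on your own interpreter where the iterator conversion happens, the sketch below lists the opcodes of `f` and of its nested `<genexpr>` code object. The output depends on the CPython version, so no expected listing is given; `outer_ops` and `inner_ops` are illustrative names.

```python
import dis
import types


def f(s):
    return (x for x in s)


# The generator expression is compiled to a nested code object stored in
# the constants of f.
genexpr_code = next(
    const for const in f.__code__.co_consts if isinstance(const, types.CodeType)
)

outer_ops = [ins.opname for ins in dis.get_instructions(f)]
inner_ops = [ins.opname for ins in dis.get_instructions(genexpr_code)]

# Which code object performs GET_ITER is exactly what the discussion is about.
print("GET_ITER in f:", "GET_ITER" in outer_ops)
print("GET_ITER in <genexpr>:", "GET_ITER" in inner_ops)
```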
Agree.
Could you please give some additional information about this behavior change?
Currently, this happens:
>>> (x for x in 1)
Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    (x for x in 1)
                ^
TypeError: 'int' object is not iterable
With Mark's proposed change, this error would be thrown only when the generator is iterated over. I think either behavior is fine and I agree with Mark that the proposed change makes for a more elegant implementation, but it is a change in core language behavior that I could see impacting some users (e.g., if you put a try-except around the construction of the genexp), so I wouldn't feel comfortable making that change in a bugfix release. @markshannon what do you think of applying this PR's approach in the 3.13 branch, and your idea on main only?
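To make the timing difference concrete, here is a small probe that reports whether the TypeError surfaces when the generator is created or only when it is first advanced. The helper `where_does_it_fail` is hypothetical. The result for the generator expression depends on the interpreter version, which is the point of the comment above, so it is printed rather than asserted to be one specific value.

```python
def where_does_it_fail(make):
    """Report at which stage a generator built by make() raises TypeError."""
    try:
        gen = make()
    except TypeError:
        return "construction"
    try:
        next(gen)
    except TypeError:
        return "iteration"
    except StopIteration:
        pass
    return "no error"


def gen_func(s):
    # A generator function does nothing until it is first advanced.
    for x in s:
        yield x


genexpr_stage = where_does_it_fail(lambda: (x for x in 1))
function_stage = where_does_it_fail(lambda: gen_func(1))

print("generator expression fails at:", genexpr_stage)
print("generator function fails at:", function_stage)
```

A try-except placed only around the construction of the generator expression would catch the error in the first case but not in the second, which is why the change is considered user-visible.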
Understood. Thx for the clarification.
It looks like a good idea to me, since the crash through f_locals modification is already fixed. Will there be any problem with backports in the case of two separate PRs?
My proposed fix should have no performance impact (it does no extra work), so I'd definitely prefer that. If you think that is too intrusive for 3.13, we could leave the
I like the idea of a new CHECK_ITER bytecode. What do you think about this plan:
And what about Python 3.12 and below?
That sounds like a good plan. If the fix is simple enough, then we can backport to 3.12.
New tests are moved back to test_generators.py. Tests on generator creation via FunctionType from gi_code are added.
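As an illustration of what "generator creation via FunctionType from gi_code" means, the sketch below rebuilds a callable from a generator expression's code object and passes it a valid iterator. It is not the test from the PR, and the names `rebuilt` and `doubled` are made up. Passing a non-iterator instead is the case the PR guards against, and it is deliberately not exercised here because its outcome depends on whether the interpreter includes the fix.

```python
import types

gen = (x * 2 for x in range(10))

# Wrap the genexpr's code object in a plain function. The code object takes
# one positional argument (the hidden ".0" iterator).
rebuilt = types.FunctionType(gen.gi_code, {})

# Calling it with a real iterator yields a fresh generator over that input.
doubled = list(rebuilt(iter([1, 2, 3])))
print(doubled)
```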
I'd prefer (3). But (2) is fine too. What do you think, @markshannon, @JelleZijlstra, @sobolevn?
As @JelleZijlstra points out, bumping the magic number could cause problems in a backport. Given that, maybe the proposed compiler change moving the So, which is the best option?
Personally, I think 3 is best.
In my opinion this can be classified as a feature rather than a bug. No real users ever complained about this problem (@efimov-mikhail confirmed that this was found by his attempts to specifically learn about the generator internals). While it is technically possible to cause a crash, in practice it does not happen. When messing around with undocumented internals, crashes can happen. So, I feel like the safest way here is to fix the problem with the new bytecode (current PR state) and not backport this at all.
As @JelleZijlstra points out, For 3.14, we can then drop the
It seems like this solution will be the best tradeoff.
c5d5bf4 to 890b936
This might introduce a small slowdown in 3.13, but it isn't in the loop, so shouldn't be an issue. I'm not worried about that though, as we can remove the other
Looks good to me once you've agreed on a better name for ModifyTest.
I'm fine with changing the name from "ModifyTest" to something else.
Either of those names works for me.
Thank you!
It seems we have agreed. Is there anything else, or can it be merged?
Thanks @efimov-mikhail for the PR, and @JelleZijlstra for merging it 🌮🎉. I'm working now to backport this PR to: 3.13.
Sorry, @efimov-mikhail and @JelleZijlstra, I could not cleanly backport this to
…ipulations (pythonGH-125178) (cherry picked from commit 079875e) Co-authored-by: Mikhail Efimov <[email protected]>
GH-125846 is a backport of this pull request to the 3.13 branch.
…ions (GH-125178) (#125846) (cherry picked from commit 079875e) Co-authored-by: Mikhail Efimov <[email protected]>
I've added some test cases to emphasize current code behavior and save it explicitly in tests.
IMHO, there is still room for improvement.
The previous PR (#125051) became incorrect after my typo in git commands.
I'm sorry about that.
@JelleZijlstra, could you please review this again?
Moreover, the label "needs backport to 3.13" should be added once again.