-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Remove duplicated content in error messages #8107
fix: Remove duplicated content in error messages #8107
Conversation
Any news on this @rben01? |
Thanks for the reminder. I'll circle back on this this week and if nothing else write some tests. But first I'd like to see if there's a better way to include multiple error messages than just the depth. |
@ritchie46 I'm thinking about implementing a system where subsequent operations are tracked in some detail — not just the number of them, but an entire |
So I think this is fine to merge in its current state. It's better than the current behavior. I'll think more about what improving the error messages might look like in the future, but for now this PR is a strict improvement on the existing behavior. |
@rben01 , if the maintainers of polars like your PR as much as I do: could you update it with the main branch and add a test? This can be based on your example with an added assertion that the contents looks correct. |
I pulled in |
Ok, test added, should be ok to merge now. |
Can you ensure CI is green? :) |
Nothing I can do about python but rust CI should be all green now :) |
Hi @rben01 , I think it is because the way the errors are now handled. What if you add in your tests something like this: let df4 = df
.clone()
.select([col("c1"), col("c1")])
.collect(); That should give you the So I think it has to do with the "compression"/"unwrapping" that you implemented. |
Thanks @svaningelgem , you're absolutely right. I changed things so that instead of wrapping everything in
And now we're all green. |
Updated again to simplify the implementation and remove some of the pointless stuff in debug output. Hopefully this remains all green. > initial df
= DF ["c1"]; PROJECT */1 COLUMNS; SELECTION: "None"
# NO ERROR
> select xyz, then 0 ops after
= ErrorState { n_times: 0, err: ColumnNotFound(ErrString("xyz\n\nError originated just after this operation:\nDF [\"c1\"]; PROJECT */1 COLUMNS; SELECTION: \"None\"")) }
# ColumnNotFound(ErrString("xyz\n\nError originated just after this operation:\nDF [\"c1\"]; PROJECT */1 COLUMNS; SELECTION: \"None\""))
> select xyz, then 1 ops after
= ErrorState { n_times: 1, err: ColumnNotFound(ErrString("xyz\n\nError originated just after this operation:\nDF [\"c1\"]; PROJECT */1 COLUMNS; SELECTION: \"None\"")) }
# ColumnNotFound(ErrString("LogicalPlan already failed (depth: 1) with error: 'xyz\n\nError originated just after this operation:\nDF [\"c1\"]; PROJECT */1 COLUMNS; SELECTION: \"None\"'"))
> select xyz, then 2 ops after
= ErrorState { n_times: 2, err: ColumnNotFound(ErrString("xyz\n\nError originated just after this operation:\nDF [\"c1\"]; PROJECT */1 COLUMNS; SELECTION: \"None\"")) }
# ColumnNotFound(ErrString("LogicalPlan already failed (depth: 2) with error: 'xyz\n\nError originated just after this operation:\nDF [\"c1\"]; PROJECT */1 COLUMNS; SELECTION: \"None\"'"))
> select xyz, then 3 ops after
= ErrorState { n_times: 3, err: ColumnNotFound(ErrString("xyz\n\nError originated just after this operation:\nDF [\"c1\"]; PROJECT */1 COLUMNS; SELECTION: \"None\"")) }
# ColumnNotFound(ErrString("LogicalPlan already failed (depth: 3) with error: 'xyz\n\nError originated just after this operation:\nDF [\"c1\"]; PROJECT */1 COLUMNS; SELECTION: \"None\"'")) |
@ritchie46 does this look good to you? |
@ritchie46 Any update on merging this? |
@rben01 could you rebase this? Then I'll have a look or ask Ritchie to look at it. Also, the linked issue has been closed. Is there a MRE we can run to see the fix in action? Adding a Python test with such an example would be preferable. |
Prevented exponential growth of error messages by avoiding nesting of error messages Instead, used a counter to track how many operations had been applied to a LazyFrame after an error Renamed `ErrorState` to `ErrorStateEncounters` and made it crate-private Renamed `ErrorStateSync` to `ErrorState` and removed its Deref impl so that the only way to access its data is through (crate-private) accessors, thus reducing the size of its public API
…slashes in output Minor cleanup + commenting
Prevented exponential growth of error messages by avoiding nesting of error messages Instead, used a counter to track how many operations had been applied to a LazyFrame after an error Renamed `ErrorState` to `ErrorStateEncounters` and made it crate-private Renamed `ErrorStateSync` to `ErrorState` and removed its Deref impl so that the only way to access its data is through (crate-private) accessors, thus reducing the size of its public API
Instead of leaving a `String` in place of `PolarsError`, leave in place the original `PolarsError`. Then, when it's time to produce a new error, we clone the original error if possible; and it's almost always possible, as all `PolarsError` variants contain ErrString as their data except one: `Io`. So just convert `Io` to `ComputeError` with an error message saying so, otherwise reuse the `PolarsError` variant with a cloned message. (This is based off, and utilizes, the method `PolarsError::wrap_msg`.) This way we can retain the "shape" of the original error and just augment its message to indicate that it's already failed n times. Also cleaned up the `err_msg.rs` test in anticipation of adding a bit more to that file.
Now that errors aren't taken and replaced by a String - the original is stored and cloned as needed - we can simplify the implementation. Instead of NotYetEncountered and AlreadyEncountered, we can just have a single struct with a counteer (no enum needed). This massively simplifies the implementation Also, simplified the implemtnation of tests Also cleaned up ErrorState's Debug implementation so that users don't see the nuts and bolts inside the struct - they only see the important debugging info
f801640
to
7058c89
Compare
I closed it with a smaller PR that prevented the escaping of backslashes leading to O(2^n) error message length, but there was still duplication within error messages, as all old messages were printed — that's what this PR fixes. Here is an example from Python of compounding error messages (edited after posting to replace In [10]: pl.__version__
Out[10]: '0.20.7'
In [11]: df = pl.DataFrame({'a': 1, 'b': range(2)})
In [12]: df.lazy().select('c').select('d').select('e')
Out[12]: ---------------------------------------------------------------------------
ColumnNotFoundError Traceback (most recent call last)
<<< backtrace omitted >>>
ColumnNotFoundError: c
Error originated just after this operation:
DF ["a", "b"]; PROJECT */2 COLUMNS; SELECTION: "None"
Error originated just after this operation:
ErrorStateSync(AlreadyEncountered(not found: c
Error originated just after this operation:
DF ["a", "b"]; PROJECT */2 COLUMNS; SELECTION: "None"))
DF ["a", "b"]; PROJECT */2 COLUMNS; SELECTION: "None"
Error originated just after this operation:
ErrorStateSync(AlreadyEncountered(not found: c
Error originated just after this operation:
DF ["a", "b"]; PROJECT */2 COLUMNS; SELECTION: "None"
Error originated just after this operation:
ErrorStateSync(AlreadyEncountered(not found: c
Error originated just after this operation:
DF ["a", "b"]; PROJECT */2 COLUMNS; SELECTION: "None"))
DF ["a", "b"]; PROJECT */2 COLUMNS; SELECTION: "None"))
ErrorStateSync(AlreadyEncountered(not found: c
Error originated just after this operation:
DF ["a", "b"]; PROJECT */2 COLUMNS; SELECTION: "None"))
DF ["a", "b"]; PROJECT */2 COLUMNS; SELECTION: "None" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test for this.
However, if we look at the resulting exception message now, it's not quite correct:
import polars as pl
lf = pl.LazyFrame({"a": 1, "b": range(2)})
lf.select("c").select("d").select("e").collect()
Traceback (most recent call last):
File "/home/stijn/code/polars/py-polars/repro.py", line 29, in <module>
df.select("c").select("d").select("e").collect()
File "/home/stijn/code/polars/py-polars/polars/lazyframe/frame.py", line 1927, in collect
return wrap_df(ldf.collect())
^^^^^^^^^^^^^
polars.exceptions.ColumnNotFoundError: LogicalPlan already failed (depth: 2) with error: 'c
Error originated just after this operation:
DF ["a", "b"]; PROJECT */2 COLUMNS; SELECTION: "None"'
Two things:
- The quote around the
c
column name ends up at the end (e.g. we have'c
now with the last quote at the end of this message. - I would expect a simple
ColumnNotFound: 'c'
- not something about LogicalPlan already failed.
Oh, I see. Looks like the Rust error messages have been specially crafted so they look nice in Python. The error is not On the Rust side, the current (v0.37.0) error message looks like this:
When it makes it to Python, I guess everything but the message itself is stripped off and converted to a |
…e display of Python error messages
So now, the error messages should look like:
I wasn't able to test in Python; I'm just going off my understanding of how the error messages on the rust side are translated into Python. |
If @stinodego is happy with the error messages, this can be merged. Thanks a lot for this @rben01 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks!
Should fix #8079 , but please avoid merging for now, as I have only tested a small subset of the things this might affect (and I have not yet added any formal tests at all; what's below is the extent of my testing). I would also like input on this to see if this the ideal way to handle this problem and to report errors in general.
Changes
ErrorState
toErrorStateEncounters
and made it crate-privateErrorStateSync
toErrorState
and removed itsDeref
impl so that the only way to access its data is through (crate-private) accessors, thus reducing the size of its public APIError messages are now O(1) in length (well, technically O(log(n)), if you take into account that the depth counter has log10(n) digits, but this is effectively O(1)).
Test case (singular)
Input
In each case we print
> title of test case
,= df.describe_plan()
, and# df.collect().unwrap_err()
Output
In each case we print
> title of test case
,= df.describe_plan()
, and# df.collect().unwrap_err()