Post-join panic on select after joining on keys with different names with "coalesce=False" #16515

Closed
2 tasks done
alexander-beedie opened this issue May 27, 2024 · 0 comments · Fixed by #16541
Labels
accepted Ready for implementation bug Something isn't working P-high Priority: high python Related to Python Polars rust Related to Rust Polars

alexander-beedie commented May 27, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

Python:

(LazyFrame API)

import polars as pl

lf1 = pl.LazyFrame({"x": [-1, 0, 1, 2, 3, 4]})
lf2 = pl.LazyFrame({"y": [0, 1, -2, 3, 5, 6]})
lf3 = lf1.join(lf2, how="inner", left_on="x", right_on="y", coalesce=False)

lf3.collect()
# shape: (3, 2)
# ┌─────┬─────┐
# │ x   ┆ y   │  << definitely have column "y"
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 0   ┆ 0   │
# │ 1   ┆ 1   │
# │ 3   ┆ 3   │
# └─────┴─────┘

lf3.select("y").collect()
# pyo3_runtime.PanicException: called `Result::unwrap()` 
#  on an `Err` value: ColumnNotFound(ErrString("y"))

Rust:

(SQL context, if on the branch for PR #16507; the panic also happens with the regular API)

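// Assumed imports when using the umbrella `polars` crate with the
// "lazy" and "sql" features enabled; adjust for your own setup.
use polars::prelude::*;
use polars::sql::SQLContext;

let df1 = df! {"x" => [-1, 0, 1, 2, 3, 4]}.unwrap();
let df2 = df! {"y" => [0, 1, -2, 3, 5, 6]}.unwrap();

let mut ctx = SQLContext::new();
ctx.register("df1", df1.lazy());
ctx.register("df2", df2.lazy());

// join on x = y
let sql = r#"
    SELECT df2.*
    FROM df1
    INNER JOIN df2 ON df1.x = df2.y
    ORDER BY y
"#;
// panics while optimizing the plan: ColumnNotFound(ErrString("y"))
let df3 = ctx.execute(sql).unwrap().collect().unwrap();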

Log output

thread '...' panicked at crates/polars-plan/src/utils.rs:371:79:
called `Result::unwrap()` on an `Err` value: ColumnNotFound(ErrString("y"))
stack backtrace:
   0: rust_begin_unwind
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/std/src/panicking.rs:652:5
   1: core::panicking::panic_fmt
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/core/src/panicking.rs:72:14
   2: core::result::unwrap_failed
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/core/src/result.rs:1654:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/core/src/result.rs:1077:23
   4: polars_plan::utils::expr_irs_to_schema::{{closure}}
             at /Users/.../polars/crates/polars-plan/src/utils.rs:371:29
   5: core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/core/src/ops/function.rs:305:13
   6: core::option::Option<T>::map
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/core/src/option.rs:1072:29
   7: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/core/src/iter/adapters/map.rs:108:26
   8: <polars_core::schema::Schema as core::iter::traits::collect::FromIterator<F>>::from_iter
             at /Users/.../polars/crates/polars-core/src/schema.rs:59:20
   9: core::iter::traits::iterator::Iterator::collect
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/core/src/iter/traits/iterator.rs:2005:9
  10: polars_plan::utils::expr_irs_to_schema
             at /Users/.../polars/crates/polars-plan/src/utils.rs:368:5
  11: polars_plan::logical_plan::builder_ir::IRBuilder::project
             at /Users/.../polars/crates/polars-plan/src/logical_plan/builder_ir.rs:50:17
  12: polars_plan::logical_plan::optimizer::projection_pushdown::ProjectionPushDown::finish_node
             at /Users/.../polars/crates/polars-plan/src/logical_plan/optimizer/projection_pushdown/mod.rs:206:13
  13: polars_plan::logical_plan::optimizer::projection_pushdown::projection::process_projection
             at /Users/.../polars/crates/polars-plan/src/logical_plan/optimizer/projection_pushdown/projection.rs:125:14
  14: polars_plan::logical_plan::optimizer::projection_pushdown::ProjectionPushDown::push_down::{{closure}}
             at /Users/.../polars/crates/polars-plan/src/logical_plan/optimizer/projection_pushdown/mod.rs:330:43
  15: stacker::maybe_grow
             at /Users/a13107q/.cargo/registry/src/index.crates.io-6f17d22bba15001f/stacker-0.1.15/src/lib.rs:55:9
  16: polars_plan::logical_plan::optimizer::projection_pushdown::ProjectionPushDown::push_down
             at /Users/.../polars/crates/polars-plan/src/logical_plan/optimizer/projection_pushdown/mod.rs:315:5
  17: polars_plan::logical_plan::optimizer::projection_pushdown::ProjectionPushDown::optimize
             at /Users/.../polars/crates/polars-plan/src/logical_plan/optimizer/projection_pushdown/mod.rs:723:9
  18: polars_plan::logical_plan::optimizer::optimize
             at /Users/.../polars/crates/polars-plan/src/logical_plan/optimizer/mod.rs:144:19
  19: polars_lazy::frame::LazyFrame::optimize_with_scratch
             at /Users/.../polars/crates/polars-lazy/src/frame/mod.rs:549:22
  20: polars_lazy::frame::LazyFrame::prepare_collect_post_opt
             at /Users/.../polars/crates/polars-lazy/src/frame/mod.rs:602:13
  21: polars_lazy::frame::LazyFrame::_collect_post_opt
             at /Users/.../polars/crates/polars-lazy/src/frame/mod.rs:623:49
  22: polars_lazy::frame::LazyFrame::collect
             at /Users/.../polars/crates/polars-lazy/src/frame/mod.rs:653:9
  23: polars_sql::context::SQLContext::execute_select
             at ./src/context.rs:455:14
  24: polars_sql::context::SQLContext::process_set_expr
             at ./src/context.rs:205:45
  25: polars_sql::context::SQLContext::execute_query_no_ctes
             at ./src/context.rs:173:18
  26: polars_sql::context::SQLContext::execute_query
             at ./src/context.rs:169:9
  27: polars_sql::context::SQLContext::execute_statement
             at ./src/context.rs:151:40
  28: polars_sql::context::SQLContext::execute
             at ./src/context.rs:119:19
  29: statements::test_join_on_different_keys
             at ./tests/statements.rs:512:18
  30: statements::test_join_on_different_keys::{{closure}}
             at ./tests/statements.rs:497:33
  31: core::ops::function::FnOnce::call_once
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/core/src/ops/function.rs:250:5
  32: core::ops::function::FnOnce::call_once
             at /rustc/ab14f944afe4234db378ced3801e637eae6c0f30/library/core/src/ops/function.rs:250:5

Issue description

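Selecting a valid column from a LazyFrame after joining on keys with different names with "coalesce=False" can panic with ColumnNotFound instead of returning the column. Collecting the joined frame without the select works and shows that column "y" is present; per the backtrace, the panic is raised in the projection pushdown optimizer (expr_irs_to_schema) once the select is added.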

Expected behavior

Return the following frame:

# shape: (3, 1)
# ┌─────┐
# │ y   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 0   │
# │ 1   │
# │ 3   │
# └─────┘

Installed versions

Compiled from latest main with all standard features.
Also confirmed on latest 0.20.30 release build.

alexander-beedie added the bug, rust, and needs triage labels May 27, 2024
alexander-beedie changed the title: Post-join PanicException on select when "coalesce=False" -> Post-join panic on select when "coalesce=False" May 27, 2024
alexander-beedie changed the title: Post-join panic on select when "coalesce=False" -> Post-join panic on select when joining on keys with different names May 27, 2024
alexander-beedie added the P-high label May 27, 2024
github-project-automation (bot) moved this to Ready in Backlog May 27, 2024
alexander-beedie changed the title: Post-join panic on select when joining on keys with different names -> Post-join panic on select after joining on keys with different names May 27, 2024
alexander-beedie changed the title: Post-join panic on select after joining on keys with different names -> Post-join panic on select after joining on keys with different names with "coalesce=False" May 27, 2024
stinodego added the python label and removed the needs triage label May 27, 2024
github-project-automation (bot) moved this from Ready to Done in Backlog May 28, 2024
c-peters added the accepted label Jun 3, 2024
4 participants