fix(rust): Prevent panic on sample_n with replacement from empty df #10731

Merged 2 commits on Aug 26, 2023

@trueb2 trueb2 commented Aug 25, 2023

Sampling n rows from an empty DataFrame panics when with_replacement=True. This PR adds a test case documenting the behavior of sample_n and sample_frac both with and without replacement.

I am not really sure why ensure_shape is used without symmetry, but it appears intentional.

let subdf = df.sample_n(50000, true, false, None)?;
shape: (0, 4)
┌──────────────┬─────┬─────┬─────┐
│ t            ┆ x   ┆ y   ┆ z   │
│ ---          ┆ --- ┆ --- ┆ --- │
│ datetime[μs] ┆ f64 ┆ f64 ┆ f64 │
╞══════════════╪═════╪═════╪═════╡
└──────────────┴─────┴─────┴─────┘

thread 'tokio-runtime-worker' panicked at 'Uniform::new called with `low >= high`', /Users/jwtrueb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rand-0.8.5/src/distributions/uniform.rs:567:1
   0: rust_begin_unwind
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:67:14
   2: core::panicking::panic
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:117:5
   3: <rand::distributions::uniform::UniformInt<u32> as rand::distributions::uniform::UniformSampler>::new
             at /Users/jwtrueb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rand-0.8.5/src/distributions/uniform.rs:452:17
   4: rand::distributions::uniform::Uniform<X>::new
             at /Users/jwtrueb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rand-0.8.5/src/distributions/uniform.rs:189:17
   5: polars_core::chunked_array::random::create_rand_index_with_replacement
             at /Users/jwtrueb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-core-0.32.1/src/chunked_array/random.rs:18:16
   6: polars_core::chunked_array::random::<impl polars_core::frame::DataFrame>::sample_n
             at /Users/jwtrueb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/polars-core-0.32.1/src/chunked_array/random.rs:177:21
   7: qsib_data_viz::sample_data::{{closure}}::{{closure}}
             at ./src/main.rs:306:25
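
For readers following the trace: the panic is raised while building a uniform range over `0..len`, which is empty when `len` is 0. Below is a minimal, std-only sketch of the kind of early-return guard that avoids it. The helper name `rand_index_with_replacement` and the toy `Lcg` generator are hypothetical stand-ins for illustration (not Polars or `rand` code), and the modulo draw is only roughly uniform.

```rust
/// Toy linear congruential generator, used here only so the sketch
/// compiles without external crates. Not suitable for real sampling.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        // Widely used 64-bit LCG multiplier and increment.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Drop the weaker low bits.
        self.0 >> 33
    }
}

/// Draw `n` indices from `0..len` with replacement.
/// (Hypothetical helper; mirrors the shape of the guard, not the Polars API.)
fn rand_index_with_replacement(n: usize, len: usize, seed: u64) -> Vec<usize> {
    // Guard: there is nothing to sample from, so return an empty index
    // instead of building a range over `0..0`.
    if len == 0 {
        return Vec::new();
    }
    let mut rng = Lcg(seed);
    // Modulo reduction has a slight bias; acceptable for a sketch.
    (0..n)
        .map(|_| (rng.next_u64() % len as u64) as usize)
        .collect()
}
```

Without the `len == 0` check, this sketch would also panic (integer remainder by zero), which is analogous to the `low >= high` panic in the trace above. With it, asking for 50000 samples from an empty input simply yields an empty index.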

github-actions bot added the fix and rust labels Aug 25, 2023

@ritchie46 ritchie46 left a comment

Thanks for the fix. I have left a few minor comments.

@@ -9,6 +9,9 @@ use crate::random::get_global_random_u64;
use crate::utils::{CustomIterTools, NoNull};

fn create_rand_index_with_replacement(n: usize, len: usize, seed: Option<u64>) -> IdxCa {
    if len == 0 {
        return NoNull::<IdxCa>::from_iter(std::iter::empty()).into_inner();

Can you use IdxCa::new_empty?

crates/polars-core/src/chunked_array/random.rs (outdated)

trueb2 commented Aug 26, 2023

I added the test to py-polars and confirmed that without the fix we hit a panic:

E       pyo3_runtime.PanicException: Uniform::new called with `low >= high`

I do not think there is a way to use IdxCa::new_empty. IdxCa is a ChunkedArray, and the only new_empty is on Series or ObjectChunked. I am updating to something more concise: IdxCa::new_vec("", vec![]).

trueb2 commented Aug 26, 2023

Nice CI and make setup! This was pretty streamlined and intuitive for making contributions for the first time to this project!

ritchie46 commented Aug 26, 2023

> Nice CI and make setup! This was pretty streamlined and intuitive for making contributions for the first time to this project!

Nice! Honours to @stinodego on this one. ;) And thanks a lot for the fix!

@ritchie46 ritchie46 merged commit be38947 into pola-rs:main Aug 26, 2023
24 checks passed