Better to/from slice operations #571

stepancheg · 2024-10-20T23:42:35Z

Check slice has enough capacity only once.

machine code is smaller
only one branch for non-failing cases

bitshifter · 2024-10-21T00:10:08Z

codegen/templates/vec.rs.tera

@@ -483,6 +483,7 @@ impl {{ self_t }} {
    #[inline]
    #[must_use]
    pub const fn from_slice(slice: &[{{ scalar_t }}]) -> Self {
+        assert!(slice.len() >= {{ dim }});


accessing slice by index will also check that the index is in range, are you suggesting that adding this here makes the compiler elide the other changes, is this one of the operations that you were looking at generated code for?

Yes! Range check is performed only once now. For this code compiler output before:

example::from_slice_slow::hd394bd4d83252b04: push rax test rdx, rdx je .LBB2_4 cmp rdx, 1 je .LBB2_5 cmp rdx, 2 jbe .LBB2_3 movss xmm0, dword ptr [rsi] movss dword ptr [rdi], xmm0 movsd xmm0, qword ptr [rsi + 4] movsd qword ptr [rdi + 4], xmm0 mov rax, rdi pop rcx ret ...

after:

example::from_slice_fast::h47cd007525b38d49: cmp rdx, 2 jbe .LBB3_2 movss xmm0, dword ptr [rsi + 8] movsd xmm1, qword ptr [rsi] movsd qword ptr [rdi], xmm1 movss dword ptr [rdi + 8], xmm0 mov rax, rdi ret ...

Sorry, I just noticed the godbolt link.

Godbolt might not tell the full story though as there are different implementations of glam vectors, using simd in some cases and floats in others. It's better to look at the generated assembly. You can use cargo asm, because most things are inlined it's usually necessary to add public test functions that call glam functions and look at the asm for those.

This PR does not change code for SIMD versions, only code for working with regular slice.

Even if code is inlined, the same branches stay, because it is observable behavior:

different error messages before this PR for different indices

on write to slice, data was partially written before panic

Anyway, looking at cargo asm glam::test_xxx for Vec3A on my Apple laptop.

Code:

pub fn test_write(a: Vec3A, target: &mut [f32]) { a.write_to_slice(target); } pub fn test_read(a: &[f32]) -> Vec3A { Vec3A::from_slice(a) }

Before:

glam::test_write: stp x29, x30, [sp, #-16]! mov x29, sp cbz x2, LBB98_4 ldr q0, [x0] str s0, [x1] cmp x2, #1 b.eq LBB98_5 add x8, x1, #4 st1.s {, v0, }[1], [x8] cmp x2, #2 b.ls LBB98_6 add x8, x1, #8 st1.s {, v0, }[2], [x8] ldp x29, x30, [sp], #16 ret LBB98_4: Lloh662: adrp x2, l___unnamed_75@PAGE Lloh663: add x2, x2, l___unnamed_75@PAGEOFF mov x0, #0 mov x1, #0 bl __ZN4core9panicking18panic_bounds_check17h47e687fd4a18b35fE LBB98_5: Lloh664: adrp x2, l___unnamed_76@PAGE Lloh665: add x2, x2, l___unnamed_76@PAGEOFF mov w0, #1 mov w1, #1 bl __ZN4core9panicking18panic_bounds_check17h47e687fd4a18b35fE LBB98_6: Lloh666: adrp x2, l___unnamed_77@PAGE Lloh667: add x2, x2, l___unnamed_77@PAGEOFF mov w0, #2 mov w1, #2 bl __ZN4core9panicking18panic_bounds_check17h47e687fd4a18b35fE glam::test_read: stp x29, x30, [sp, #-16]! mov x29, sp cbz x1, LBB99_4 cmp x1, #1 b.eq LBB99_5 cmp x1, #2 b.ls LBB99_6 ldr s0, [x0] add x9, x0, #4 ld1.s {, v0, }[1], [x9] ldr s1, [x0, #8] mov.s v0[2], v1[0] mov.s v0[3], v1[0] str q0, [x8] ldp x29, x30, [sp], #16 ret LBB99_4: Lloh668: adrp x2, l___unnamed_78@PAGE Lloh669: add x2, x2, l___unnamed_78@PAGEOFF mov x0, #0 bl __ZN4core9panicking18panic_bounds_check17h47e687fd4a18b35fE LBB99_5: Lloh670: adrp x2, l___unnamed_79@PAGE Lloh671: add x2, x2, l___unnamed_79@PAGEOFF mov w0, #1 bl __ZN4core9panicking18panic_bounds_check17h47e687fd4a18b35fE LBB99_6: Lloh672: adrp x2, l___unnamed_80@PAGE Lloh673: add x2, x2, l___unnamed_80@PAGEOFF mov w0, #2 mov w1, #2 bl __ZN4core9panicking18panic_bounds_check17h47e687fd4a18b35fE

After:

glam::test_write: cmp x2, #3 b.ne LBB98_2 ldr q0, [x0] add x8, x1, #8 st1.s {, v0, }[2], [x8] str d0, [x1] ret LBB98_2: stp x29, x30, [sp, #-16]! mov x29, sp Lloh662: adrp x8, l___unnamed_75@PAGE Lloh663: add x8, x8, l___unnamed_75@PAGEOFF mov x0, x2 mov w1, #3 mov x2, x8 bl __ZN4core5slice29_$LT$impl$u20$$u5b$T$u5d$$GT$15copy_from_slice17len_mismatch_fail17h7212f66e1542c208E glam::test_read: cmp x1, #2 b.ls LBB99_2 ldr d0, [x0] ldr s1, [x0, #8] mov.s v0[2], v1[0] mov.s v0[3], v1[0] str q0, [x8] ret LBB99_2: stp x29, x30, [sp, #-16]! mov x29, sp Lloh664: adrp x0, l___unnamed_76@PAGE Lloh665: add x0, x0, l___unnamed_76@PAGEOFF Lloh666: adrp x2, l___unnamed_77@PAGE Lloh667: add x2, x2, l___unnamed_77@PAGEOFF mov w1, #34 bl __ZN4core9panicking5panic17hae814ca9668aac2fE

Thanks for checking. I was mostly wondering if to_array resulted in anything weird with underlying simd types since it will be dereferencing all the vector elements, seems fine though.

Fix regression introduced in bitshifter#571.

Fix regression introduced in bitshifter#571. `VecN::write_to_slice` expects the slice length to be >= dim, not == dim. `slice::copy_from_slice` expects both slices to have the same length.

Fix regression introduced in #571. `VecN::write_to_slice` expects the slice length to be >= dim, not == dim. `slice::copy_from_slice` expects both slices to have the same length.

Better to/from slice operations

02d0a3b

bitshifter reviewed Oct 21, 2024

View reviewed changes

bitshifter merged commit 30c6b0d into bitshifter:main Oct 21, 2024
18 checks passed

stepancheg added a commit to stepancheg/glam-rs that referenced this pull request Nov 5, 2024

Fix VecN::write_to_slice

ff6d581

Fix regression introduced in bitshifter#571.

stepancheg mentioned this pull request Nov 5, 2024

Fix VecN::write_to_slice #581

Merged

bitshifter pushed a commit that referenced this pull request Nov 5, 2024

Fix VecN::write_to_slice (#581)

d0aa9a9

Fix regression introduced in #571. `VecN::write_to_slice` expects the slice length to be >= dim, not == dim. `slice::copy_from_slice` expects both slices to have the same length.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Better to/from slice operations #571

Better to/from slice operations #571

stepancheg commented Oct 20, 2024

bitshifter Oct 21, 2024

stepancheg Oct 21, 2024

bitshifter Oct 21, 2024

stepancheg Oct 21, 2024 •

edited

Loading

bitshifter Oct 21, 2024

Better to/from slice operations #571

Better to/from slice operations #571

Conversation

stepancheg commented Oct 20, 2024

bitshifter Oct 21, 2024

Choose a reason for hiding this comment

stepancheg Oct 21, 2024

Choose a reason for hiding this comment

bitshifter Oct 21, 2024

Choose a reason for hiding this comment

stepancheg Oct 21, 2024 • edited Loading

Choose a reason for hiding this comment

bitshifter Oct 21, 2024

Choose a reason for hiding this comment

stepancheg Oct 21, 2024 •

edited

Loading