You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The non-sync copy interfaces do not hold onto the host memory, so subsequent modification of the host memory can race with the DMA transfer to device. This is an example with htod_copy_pinned#336. Small sizes will spuriously pass this test due to what looks like eager copying, but N=10000 here fails every time in my tests (cuda-12.8, sm_89).
diff --git i/src/driver/safe/alloc.rs w/src/driver/safe/alloc.rs
index 0271a4a..014a3df 100644
--- i/src/driver/safe/alloc.rs+++ w/src/driver/safe/alloc.rs@@ -619,12 +619,17 @@ mod tests {
#[test]
fn test_htod_copy_pinned() {
- let truth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];+ const N: usize = 10000;+ let truth = [1.0; N];
let dev = CudaDevice::new(0).unwrap();
- let mut pinned = unsafe { dev.alloc_pinned::<f32>(10) }.unwrap();+ let mut pinned = unsafe { dev.alloc_pinned::<f32>(N) }.unwrap();
pinned.as_mut_slice().clone_from_slice(&truth);
assert_eq!(pinned.as_slice(), &truth);
let dst = dev.htod_copy_pinned(&pinned).unwrap();
+ {+ let p = pinned.as_mut_slice();+ p[p.len() - 1] += 1.0;+ }
let host = dev.dtoh_sync_copy(&dst).unwrap();
assert_eq!(&host, &truth);
}
I don't want to imply extra synchrony or copying or prevent reusing buffers. One tempting change would be for such copies to return a future/request that retains the borrow (which must wait/observe that the operation completed). The trouble is that one can mem::forget(request) to release the lifetime without completing the request, then race as above. I don't have a solution for this other than moving ownership of the host memory into the request, returning it at completion. Unfortunately, that's less ergonomic. That wouldn't be necessary if Rust were to add support for linear types.
The text was updated successfully, but these errors were encountered:
The non-
sync
copy interfaces do not hold onto the host memory, so subsequent modification of the host memory can race with the DMA transfer to device. This is an example withhtod_copy_pinned
#336. Small sizes will spuriously pass this test due to what looks like eager copying, but N=10000 here fails every time in my tests (cuda-12.8, sm_89).I don't want to imply extra synchrony or copying or prevent reusing buffers. One tempting change would be for such copies to return a future/request that retains the borrow (which must wait/observe that the operation completed). The trouble is that one can
mem::forget(request)
to release the lifetime without completing the request, then race as above. I don't have a solution for this other than moving ownership of the host memory into the request, returning it at completion. Unfortunately, that's less ergonomic. That wouldn't be necessary if Rust were to add support for linear types.The text was updated successfully, but these errors were encountered: