Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

soundness with async copies #340

Open
jedbrown opened this issue Mar 1, 2025 · 1 comment
Open

soundness with async copies #340

jedbrown opened this issue Mar 1, 2025 · 1 comment

Comments

@jedbrown
Copy link

jedbrown commented Mar 1, 2025

The non-sync copy interfaces do not hold onto the host memory, so subsequent modification of the host memory can race with the DMA transfer to device. This is an example with htod_copy_pinned #336. Small sizes will spuriously pass this test due to what looks like eager copying, but N=10000 here fails every time in my tests (cuda-12.8, sm_89).

diff --git i/src/driver/safe/alloc.rs w/src/driver/safe/alloc.rs
index 0271a4a..014a3df 100644
--- i/src/driver/safe/alloc.rs
+++ w/src/driver/safe/alloc.rs
@@ -619,12 +619,17 @@ mod tests {
 
     #[test]
     fn test_htod_copy_pinned() {
-        let truth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];
+        const N: usize = 10000;
+        let truth = [1.0; N];
         let dev = CudaDevice::new(0).unwrap();
-        let mut pinned = unsafe { dev.alloc_pinned::<f32>(10) }.unwrap();
+        let mut pinned = unsafe { dev.alloc_pinned::<f32>(N) }.unwrap();
         pinned.as_mut_slice().clone_from_slice(&truth);
         assert_eq!(pinned.as_slice(), &truth);
         let dst = dev.htod_copy_pinned(&pinned).unwrap();
+        {
+            let p = pinned.as_mut_slice();
+            p[p.len() - 1] += 1.0;
+        }
         let host = dev.dtoh_sync_copy(&dst).unwrap();
         assert_eq!(&host, &truth);
     }

I don't want to imply extra synchrony or copying or prevent reusing buffers. One tempting change would be for such copies to return a future/request that retains the borrow (which must wait/observe that the operation completed). The trouble is that one can mem::forget(request) to release the lifetime without completing the request, then race as above. I don't have a solution for this other than moving ownership of the host memory into the request, returning it at completion. Unfortunately, that's less ergonomic. That wouldn't be necessary if Rust were to add support for linear types.

@serhii-havrylov
Copy link

The methods as_mut_slice and as_slice should at least be marked as unsafe for the same reasons. (c6d1b6e#r153160982)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants