Add convenience wrapper for disk uploads to SDK #840

wfchandler · 2024-09-16T14:46:07Z

From a user's perspective, uploading a disk is a single operation, but doing so via the SDK is an involved process that requires stringing together multiple API calls, with any interruption leaving the disk in a state from which it cannot be directly removed.

Add a new extras module to the SDK that will contain convenience wrappers around actions that are overly complicated. Create a disk_import function in extras that performs a disk upload as a single step and attempts to clean up partially imported disks on failure. We make the disk upload CLI command a thin wrapper around the new SDK function, responsible only for cancellation and progress. We manually recreate the builder interface, as creating a proc macro to do this would require creating a new crate. If extras continues to grow we should reconsider this.

disk_import also gives callers access to a DiskImportHandle, which exposes a watch channel carrying the current number of bytes uploaded, for use in progress tracking, and provides a cancel function that allows users to gracefully stop the disk upload. A disk that has been successfully created will not be removed; only partially imported disks are.
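To make the shape of the handle concrete, here is a minimal std-only sketch of the pattern. It is not the SDK's code: the real DiskImportHandle is described above as exposing a tokio watch channel and cancelling an async task, whereas this analog assumes an AtomicU64 for progress, an AtomicBool for cancellation, and a plain thread. ImportHandle, Outcome, and start_import are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the SDK's `DiskImportHandle`.
#[derive(Clone)]
pub struct ImportHandle {
    uploaded: Arc<AtomicU64>,
    cancelled: Arc<AtomicBool>,
}

impl ImportHandle {
    /// Bytes uploaded so far, for driving a progress bar.
    pub fn progress(&self) -> u64 {
        self.uploaded.load(Ordering::Acquire)
    }

    /// Ask the upload to stop at the next chunk boundary.
    pub fn cancel(&self) {
        self.cancelled.store(true, Ordering::Release);
    }
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Every byte was sent; the disk is kept.
    Finished,
    /// The upload stopped early; the partial disk would be removed.
    CancelledAndCleanedUp,
}

/// Start a pretend upload of `total` bytes in `chunk`-sized writes.
pub fn start_import(total: u64, chunk: u64) -> (ImportHandle, JoinHandle<Outcome>) {
    assert!(chunk > 0, "chunk size must be non-zero");
    let handle = ImportHandle {
        uploaded: Arc::new(AtomicU64::new(0)),
        cancelled: Arc::new(AtomicBool::new(false)),
    };
    let worker = handle.clone();
    let join = thread::spawn(move || {
        let mut sent = 0u64;
        while sent < total {
            if worker.cancelled.load(Ordering::Acquire) {
                // A real implementation would stop the import and delete
                // the partially imported disk here.
                return Outcome::CancelledAndCleanedUp;
            }
            sent += chunk.min(total - sent);
            worker.uploaded.store(sent, Ordering::Release);
        }
        Outcome::Finished
    });
    (handle, join)
}
```

A caller would keep the handle for its progress bar and CTRL+C handler, and join the worker to learn whether the disk was kept or cleaned up.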

The cancel function uses tokio::select to cancel the Future of the upload task. This should be safe in our case, as no locks are held across await points. No operations need to be safe to restart, as we only call tokio::select a single time; the futures are dropped immediately, and losing a message is not a problem.

We are forced to remove tests test_disk_import_bulk_import_start_fail and test_disk_import_bulk_write_import_fail, as we now poll for disks in the Created state, which may happen if a user immediately cancels a new request. This state is not eligible to be finalized, so we need to wait for the disk to progress to the next state, but HttpMock does not provide a way to programmatically delete or update a mock after sending n responses. This means that the disk will always be returned as nonexistent when querying, and cleanup will be skipped. I have opened alexliesenfeld/httpmock#108 to add a delete_after_n method, but this seems unlikely to be merged. We may want to consider moving to wiremock in the long term, which provides this functionality.

david-crespo · 2024-09-16T15:09:47Z

That is so great. "Cleaning up disk" is a little vague. I wonder if it's worth throwing a little more info in like the web console does, both before and after the main upload step. Like you could say "Creating disk xxxxxx", putting it in import mode, then the progress bar, and then when you cancel, the cleanup step makes more sense because it's undoing the steps you called out before. I'm a little on the fence because it's borderline too much detail, but I do like educating the user on what is actually taking place because it empowers them to debug and read the docs if something goes wrong.

wfchandler · 2024-09-16T18:55:55Z

Listing the name makes sense, I've added that and replaced the debug formatting with quoted names throughout. Updated demo:

Disk.Upload.Cleanup.-.Updated.mov

augustuswm

Really appreciate this work! Will look through it, just a couple initial notes.

sdk/Cargo.toml

augustuswm · 2024-09-16T19:51:55Z

sdk/src/extras/disk.rs

+            thread_ct: Option<usize>,
+        ) -> Result<
+            (
+                impl Future<Output = Result<()>> + 'a,


I still need to read through everything, but are we able to return a typed error instead of an anyhow error? I think it is fine to return them at the CLI level, but anyhow errors are quite difficult to handle when getting them back from an SDK.

Good call, I've moved to a custom error type.
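For readers weighing the same trade-off, a minimal sketch of what a typed error buys an SDK caller follows. The variants and the should_retry helper are hypothetical and are not taken from the PR's DiskImportError; the point is only that callers can match on a variant instead of inspecting a message string.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical typed error for a disk import; the variants are illustrative.
#[derive(Debug)]
pub enum ImportError {
    /// A disk with this name already exists.
    DiskExists { name: String },
    /// Reading the source file or talking to the API failed.
    Io(io::Error),
    /// The caller cancelled the import.
    Cancelled,
}

impl fmt::Display for ImportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ImportError::DiskExists { name } => {
                write!(f, "disk \"{}\" already exists", name)
            }
            ImportError::Io(e) => write!(f, "I/O error during import: {}", e),
            ImportError::Cancelled => write!(f, "disk import cancelled"),
        }
    }
}

impl Error for ImportError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ImportError::Io(e) => Some(e),
            _ => None,
        }
    }
}

impl From<io::Error> for ImportError {
    fn from(e: io::Error) -> Self {
        ImportError::Io(e)
    }
}

/// Callers can branch on the variant rather than parse an error message.
pub fn should_retry(err: &ImportError) -> bool {
    matches!(err, ImportError::Io(e) if e.kind() == io::ErrorKind::TimedOut)
}
```

The CLI can still convert such an error into an anyhow error at its boundary, since any type implementing std::error::Error converts with the ? operator there.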

sdk/src/extras/disk.rs

ahl

This looks great. I think the approach works.

ahl · 2024-09-20T20:33:25Z

cli/src/cmd_disk.rs

-            return Err(e.into());
-        }
+            pb.finish_and_clear();
+            pb.println("Cleaning up disk. Press CTRL+C again to exit immediately.");


I would suggest we omit the "... exit immediately" text in that basically we don't want people to do that... and in my experience they don't need an invitation to ^C

Ha, very true

ahl · 2024-09-20T20:36:14Z

sdk/src/extras/disk.rs

+    ) -> Result<(), DiskImportError> {
+        self.check_for_existing_disk().await?;
+
+        let result = tokio::select! {


Do we want biased; here? That is, I think we want to prioritize cancellation over progress.
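For context on the question: by default tokio::select! picks which branch to poll first at random, while adding biased; makes it poll branches top to bottom in the order written, so a cancel branch listed first wins whenever both are ready. The std-only sketch below imitates that fixed polling order by hand. It is an illustration of the ordering only, not tokio's implementation, and Winner, NoopWake, and poll_biased are hypothetical names.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// Waker that does nothing; enough for polling futures once by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

#[derive(Debug, PartialEq)]
pub enum Winner {
    Cancel,
    Work,
    Neither,
}

/// Poll `cancel` first, then `work`, once each.
/// This fixed top-to-bottom order is what `biased;` requests from
/// `tokio::select!`; without it the starting branch is chosen at random.
pub fn poll_biased<C, W>(cancel: Pin<&mut C>, work: Pin<&mut W>) -> Winner
where
    C: Future<Output = ()>,
    W: Future<Output = ()>,
{
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // Cancellation is checked first, so it wins if both are ready.
    if cancel.poll(&mut cx).is_ready() {
        return Winner::Cancel;
    }
    if work.poll(&mut cx).is_ready() {
        return Winner::Work;
    }
    Winner::Neither
}
```

With both futures ready, the cancel branch is reported, which is the prioritization the comment asks about.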

ahl · 2024-09-20T20:37:20Z

sdk/src/extras/disk.rs

+        Ok(())
+    }
+
+    async fn import_disk(&self) -> Result<(), DiskImportError> {


Suggested change

-    async fn import_disk(&self) -> Result<(), DiskImportError> {
+    async fn do_disk_import(&self) -> Result<(), DiskImportError> {

What do you think? Hoping for a name that indicates this is the inner loop of disk_import rather than just reversing the two words

Makes sense 👍

ahl · 2024-09-20T20:45:21Z

cli/src/cmd_disk.rs


-                Ok(())
-            }));
+        let pb = start_progress_bar(handle.progress(), disk_info.file_size, &self.disk)?;
+        watch_for_ctrl_c(handle, pb);

-            senders.push(tx);
-        }
-
-        // Read chunks from the file in the file system and send them to the
-        // upload threads.
-        let mut file = File::open(&self.path)?;
-        let mut i = 0;
-        let mut offset = 0;
-
-        let read_result: Result<()> = loop {
-            let mut chunk = Vec::with_capacity(CHUNK_SIZE as usize);
-
-            let n = match file.by_ref().take(CHUNK_SIZE).read_to_end(&mut chunk) {
-                Ok(n) => n,
-                Err(e) => {
-                    eprintln_nopipe!(
-                        "reading from {} failed with {:?}",
-                        self.path.to_string_lossy(),
-                        e,
-                    );
-                    break Err(e.into());
-                }
-            };
+        import_future.await?;


I'm not sure if this is going to be more complex or simpler, but what do you think of doing something like this:

    let real_work_handle = tokio::spawn(import_future);
    select! {
        // ^C thing
        // progress thing
    }
    real_work_handle.await?;

The difference I see is inlining the work that's spread between the progress bar and ctrl_c watch functions which don't seem to be particularly self-contained.

Perhaps it's about the same, but I wanted to float the idea.

That's a bit cleaner, I combined the two tasks.

Ended up reverting this as exiting the task when the progress bar completes causes us to stop listening for the cancel prior to the task completing, preventing users from canceling late in the request.

Co-authored-by: Adam Leventhal

wfchandler force-pushed the wc/disk-import-sdk branch from 2996b6b to 68c9d2b Compare September 16, 2024 18:52

wfchandler requested review from ahl and augustuswm September 16, 2024 19:04

wfchandler marked this pull request as ready for review September 16, 2024 19:06

wfchandler mentioned this pull request Sep 16, 2024

Cleanup partially imported disks on CTRL+C #760

Closed

augustuswm reviewed Sep 16, 2024

wfchandler force-pushed the wc/disk-import-sdk branch from 7cc0419 to 5dcce71 Compare September 18, 2024 02:06

wfchandler requested a review from augustuswm September 19, 2024 17:03

ahl approved these changes Sep 20, 2024

wfchandler force-pushed the wc/disk-import-sdk branch from 9bcccbf to 06d130d Compare September 23, 2024 16:19

wfchandler force-pushed the wc/disk-import-sdk branch from 06d130d to 2d42630 Compare September 23, 2024 17:40

wfchandler merged commit a2bbd78 into main Sep 23, 2024
16 checks passed

wfchandler deleted the wc/disk-import-sdk branch September 23, 2024 17:50
