Async File uploads #4572
Comments
Klaas Freitag commented:

# Specification

The upload workflow in oCIS should, as a first step, work like this:

## Client-defined UUIDs

Clients are allowed to send a UUID in a header with the first request of the upload. That UUID is used as the file ID in oCIS, so the client does not have to wait for server-side processing to complete before it knows the file ID; the client simply sets it. oCIS needs to check whether a file ID header is present and, if so, use it for the file.

Questions:
## Async Client Upload

The following async upload makes uploading faster from the client's point of view, because the client does not have to wait for any post-processing or metadata updates (which can take some time in oC10). It is also needed in situations where metadata and binary data are stored separately (e.g. S3). The processing is as follows:
The most asked question: if a file is removed because the virus check (which runs in post-processing) found a virus and deleted it, how does the client find out? Answer: currently there is no special information for the client; the file just disappears. It is up to the virus app (or any other app that does similar things) to inform the user, e.g. through the activity app.

## Checksums
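To make the client-defined UUID rule from the specification above more concrete, here is a minimal sketch of the server-side check. The header name `X-File-Id` and the function `resolve_file_id` are hypothetical; the issue does not name the header, and the fallback to a server-generated ID for a missing or malformed value is an assumption, not something the specification settles.

```python
import uuid

# Hypothetical header name; the issue does not specify one.
FILE_ID_HEADER = "X-File-Id"


def resolve_file_id(headers: dict[str, str]) -> tuple[str, bool]:
    """Return (file_id, client_defined).

    Uses the client-supplied UUID when it is present and well-formed,
    otherwise falls back to a server-generated one (an assumption).
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {k.lower(): v for k, v in headers.items()}
    candidate = normalised.get(FILE_ID_HEADER.lower())
    if candidate is not None:
        try:
            # str(UUID(...)) yields the canonical lowercase, hyphenated form.
            return str(uuid.UUID(candidate.strip())), True
        except ValueError:
            pass  # malformed ID: ignored in this sketch
    return str(uuid.uuid4()), False
```

Whether a malformed or already-used ID should instead be rejected is left open here; the sketch only shows the "check for the header and use it" step.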
Klaas Freitag commented: Since older clients do not handle the 403 Forbidden status for a single file correctly, we will return "423 Locked" instead. The Desktop client, at least, handles that as a soft error: it does not bother the user with the error but silently tries to get the file again later, which is the correct behaviour.
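As an illustration of the soft-error handling described in the comment above, a client could classify response codes roughly as follows. This is a simplified sketch, not the Desktop client's actual logic; `ErrorKind` and `classify_download_status` are hypothetical names, and treating every other non-2xx status as a hard error is an assumption made for brevity.

```python
from enum import Enum


class ErrorKind(Enum):
    NONE = "none"
    SOFT = "soft"  # retry silently later, do not notify the user
    HARD = "hard"  # surface the error to the user


def classify_download_status(status: int) -> ErrorKind:
    """Map an HTTP status for a single file to an error kind."""
    if 200 <= status < 300:
        return ErrorKind.NONE
    if status == 423:
        # "423 Locked": the file is still in post-processing; try again later.
        return ErrorKind.SOFT
    # Simplification: everything else is treated as a hard error here.
    return ErrorKind.HARD
```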
Jörn Friedrich Dreyer commented: In effect, this behavior means we queue uploads on the server side. Even if we upload all chunks directly to the final destination, e.g. as a multipart upload for an S3 storage, the final assembly may fail. How do we handle that case? Retry indefinitely? Ping the admin? I actually think this would be OK, because when we shard the uploads per space, the bytes are still available in the space. We can notify admins so they can fix the problem, or provide a metric for how many uploads are still in progress, so it can be monitored properly.

How do we deal with the case where the client successfully transfers all bytes, but a server-side workflow like antivirus scanning or file classification makes the finalize request fail? I think we should treat file upload, as in transferring bytes, differently from postprocessing. The TUS upload ID can remain accessible until postprocessing is done, so clients could check the status of the upload. We could expose workflow steps using headers similar to the Server-Timing header: https://w3c.github.io/server-timing/#the-server-timing-header-field

But how often would they need to do that? How will clients get notified when a necessary postprocessing step fails? And should clients then delete the file locally? How can they even retry if all they got was a 202 Accepted and later a new etag for the same file? AFAICT they will redownload and overwrite local changes.
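To illustrate the idea of exposing workflow steps in a Server-Timing-like header, here is a rough sketch of how a client might parse such a value. The header format (`step;status=...` entries separated by commas), the step names, and both function names are hypothetical; nothing in this thread defines them. The parser is deliberately naive and does not handle quoted values that contain commas or semicolons.

```python
def parse_processing_header(value: str) -> dict[str, dict[str, str]]:
    """Parse 'assembly;status=done, antivirus;status=pending' into a dict."""
    steps: dict[str, dict[str, str]] = {}
    for entry in value.split(","):
        parts = [p.strip() for p in entry.split(";")]
        name = parts[0]
        if not name:
            continue  # skip empty entries, e.g. from an empty header value
        params: dict[str, str] = {}
        for param in parts[1:]:
            key, sep, val = param.partition("=")
            if sep:
                params[key.strip()] = val.strip().strip('"')
        steps[name] = params
    return steps


def postprocessing_done(steps: dict[str, dict[str, str]]) -> bool:
    """True only if at least one step is listed and all report status=done."""
    return bool(steps) and all(p.get("status") == "done" for p in steps.values())
```

A client polling the upload status could call `postprocessing_done(parse_processing_header(value))` on each response; how often it should poll, and what to do on failure, are the open questions raised above.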
Klaas Freitag commented: Reply to [~jfd]:
Michael Barz commented: Was discussed in cross-platform meeting:
This was implemented on the |
After a client has finished uploading a file (i.e. transferred the last byte), it should get the upload success code immediately, provided all bytes were transferred successfully.
From the client's perspective, upload means transferring bytes over the network to the server. As soon as that has happened, the client should get a positive answer.
Things that technically might take a long time (such as assembling huge files after upload) should be transparent to the client. It does not have to wait for them and should not have to poll for success.
Operation steps like a virus scan, which might also take a long time, are actually workflow steps and should be handled as such.
It needs to be defined how this can be done properly from both the technical and the user experience point of view.
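One possible shape for the behaviour requested above is sketched below: the server acknowledges the upload as soon as all bytes have arrived and defers everything else to a queue. This is an in-memory toy under stated assumptions, not how oCIS implements it. `UploadServer`, its methods, and the state names are hypothetical; the 202 status is borrowed from the discussion in the comments, and raising an error for an incomplete upload is a placeholder.

```python
import queue
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class UploadServer:
    pending: queue.Queue = field(default_factory=queue.Queue)
    state: dict = field(default_factory=dict)

    def finish_upload(self, upload_id: str, received: int, expected: int) -> int:
        """Acknowledge the upload once all bytes are in; defer the rest."""
        if received != expected:
            raise ValueError("upload incomplete")  # placeholder handling
        self.state[upload_id] = "processing"
        self.pending.put(upload_id)
        return 202  # answered before any post-processing has run

    def run_postprocessing(self, steps: Iterable[Callable[[str], bool]]) -> None:
        """Drain the queue, running workflow steps such as a virus scan."""
        steps = list(steps)
        while not self.pending.empty():
            upload_id = self.pending.get()
            ok = all(step(upload_id) for step in steps)
            self.state[upload_id] = "available" if ok else "rejected"
```

In a real deployment `run_postprocessing` would run in a separate worker; it is called explicitly here to keep the sketch deterministic.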