Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/slow cmu #35

Merged
merged 3 commits into from
Jun 23, 2022
Merged

Fix/slow cmu #35

merged 3 commits into from
Jun 23, 2022

Conversation

rluetzner
Copy link
Collaborator

Description

In combination with a GlusterFS backend we realized that the CompleteMultipartUpload request becomes very slow. The larger the uploaded filesize is, the longer the request will take.

This is reapplying #19 . The code hasn't changed since being reviewed, but we previously decided not to include it in our next release.

Motivation and Context

We noticed that for larger filesizes some clients would run into a timeout during CompleteMultipartUpload requests. The default timeout seems to be 1h, which is pretty big. Nevertheless uploading 20 or 50GiB files caused CMU requests that took way longer to complete.

Compared to AWS it is non-intuitive that a CMU request should take so long, but we're willing to accept it, because it's a limitation of the current implementation, which has to put all chunks together into one big file.

This fixes #13 and #11 .

How to test this PR?

Tests can easily be performed manually. time is helpful in tracking the total duration on the client side, whereas mc admin trace -v -a will log all requests including their duration.
If large files are required truncate -s 100G file can help you quickly create a file containing only zeros. Without using compression, this is just as good as creating such a big file with actual content, but waaaaay faster.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Optimization (provides speedup with no functional changes)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • Fixes a regression (If yes, please add commit-id or PR # here)
  • Documentation updated
  • Unit tests added/updated

@rluetzner rluetzner requested a review from aweisser June 22, 2022 11:39
Robert Lützner added 3 commits June 22, 2022 11:40
There's no need to continously stat all object parts on disk if most of
them have already been appended.

(cherry picked from commit 3d4fb6a)
The backgroundAppend was meant as a performance optimization where
uploaded chunks would be aggregated to a complete file in the background
while more parts were uploaded at the same time.

This might be an optimization on actual filesystem paths, but on Gluster
it made the operation a lot slower because many redundant filesystem
calls were executed. The CompleteMultipartUpload request which requires
the uploaded chunks to be put together has to

1. wait for any ongoing backgroundAppend operations to complete,
2. enumerate all chunks and check if they were put together in the right
   order (otherwise start from scratch),
3. move the aggregated file.

Removing this piece of code started out as an experiment, because I
expected the chunks to not be aggregated at all. It turned out that
there is a fallback which is also necessary in case the final object
should have a different order or not contain some of the uploaded parts.
This also ensures that a final aggregate is created. At least on
GlusterFS this makes any CMU request run almost twice as fast as when
using backgroundAppend.

(cherry picked from commit 4ac0b96)
(cherry picked from commit 9137a44)
@rluetzner
Copy link
Collaborator Author

We verified that the change works and CompleteMultipartRequests are indeed much faster than before. There are no visible regressions in the unit tests or the Mint test suite.
I've rebased the commits onto the current state of iternity-rb, but other than that nothing's changed in the meantime.

Copy link
Collaborator

@aweisser aweisser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already went through this change together. It still lgtm.

@rluetzner rluetzner merged commit d1cddf5 into iternity-rb Jun 23, 2022
@rluetzner rluetzner deleted the fix/slow-cmu-rb branch June 23, 2022 08:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multipart Upload: CompleteMultipartUpload takes multiple hours to complete for very big files
2 participants