
Performance: Optimize speed and disk usage of backgroundAppend() for load-balanced NAS cases #11

Open
rluetzner opened this issue Oct 7, 2021 · 4 comments

@rluetzner
Collaborator

Current Behavior

We've observed that CompleteMultipartUpload requests can take a long time to complete and waste a lot of disk space when the uploads go through a distributed, load-balanced setup.

This is because every MinIO instance involved in the multipart upload (i.e. any instance receiving a PutObjectPart request) starts fs-v1-multipart.go --> backgroundAppend() as a goroutine. Every instance generates a unique temp file for this, so other instances cannot benefit from the optimization. Additionally, only the MinIO instance that receives the CompleteMultipartUpload request moves its aggregated temp file to the final location. All other temp files remain in .minio.sys/tmp until the cleanup process removes them (24 hours later).
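
For illustration only, here is a simplified sketch of the effect described above. It is not MinIO's actual code; the path layout and function name are assumptions. Because each instance derives a temp file name that is unique to itself, the aggregation done by one instance is never visible to the others:

```go
// Simplified sketch (not MinIO's actual implementation): each instance
// aggregates the uploaded parts into a temp file whose name is unique to that
// instance, so the work cannot be reused by whichever instance later handles
// CompleteMultipartUpload.
package multipart

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// backgroundAppendSketch concatenates the parts uploaded so far into a
// per-instance temp file under .minio.sys/tmp and returns its path.
func backgroundAppendSketch(fsPath, uploadID string, partPaths []string) (string, error) {
	host, _ := os.Hostname()
	// Unique per instance: another instance computes a different name and
	// therefore starts its own aggregation from scratch.
	tmpFile := filepath.Join(fsPath, ".minio.sys", "tmp",
		fmt.Sprintf("%s-%s-%d", uploadID, host, os.Getpid()))

	f, err := os.OpenFile(tmpFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return "", err
	}
	defer f.Close()

	for _, p := range partPaths {
		part, err := os.Open(p)
		if err != nil {
			return "", err
		}
		_, copyErr := io.Copy(f, part)
		part.Close()
		if copyErr != nil {
			return "", copyErr
		}
	}
	return tmpFile, nil
}
```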

Expected Behavior

When multiple MinIO instances serve a NAS backend in a load-balanced setup, CompleteMultipartUpload should finish as quickly as possible, or at least take a roughly consistent amount of time regardless of which instance handles it.

Possible Solution

We will need some kind of cluster-aware backgroundAppend() logic that works for the gateway nas and nasxl cases. Perhaps etcd can be used to elect a leader or otherwise coordinate the process.
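
As a rough sketch of what such coordination could look like (endpoints, key prefix, TTL and the work callback are all assumptions, not a settled design), the instances could campaign for a per-upload leadership key in etcd so that only one of them runs the background append at a time:

```go
// Sketch only: per-upload leader election via etcd's concurrency package.
// Endpoints, key prefix and the work callback are assumptions.
package coordination

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

// runBackgroundAppendAsLeader campaigns for a per-upload key and runs the
// append work only while this instance holds the leadership.
func runBackgroundAppendAsLeader(ctx context.Context, uploadID, instanceID string, work func(context.Context) error) error {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		return err
	}
	defer cli.Close()

	// The session keeps a lease alive; if this instance dies, the election
	// key expires and another instance can take over.
	session, err := concurrency.NewSession(cli, concurrency.WithTTL(30))
	if err != nil {
		return err
	}
	defer session.Close()

	election := concurrency.NewElection(session, "/minio/background-append/"+uploadID)

	// Campaign blocks until this instance becomes leader. In a real design the
	// non-leaders would instead observe the leader and skip the work entirely.
	if err := election.Campaign(ctx, instanceID); err != nil {
		return err
	}
	defer election.Resign(context.Background())

	log.Printf("instance %s appends parts for upload %s", instanceID, uploadID)
	return work(ctx)
}
```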

Steps to Reproduce (for bugs)

The wasted disk space, at least, can also be observed when multiple MinIO instances serve the same local directory.

  1. Start multiple MinIO instances on different ports in gateway nas mode, e.g. minio gateway nas --address :9000 ./data.
  2. Use a load balancer, e.g. sidekick: ./sidekick --health-path "/minio/health/live" http://localhost:900{0...3}.
  3. Generate a big file to upload, e.g. truncate -s 350M test.txt
  4. Use mc to upload the test file.
  5. Take a look at .minio.sys/tmp.

Context

We're wasting disk space, and multipart uploads often take a long time to complete, depending on which MinIO instance receives the CompleteMultipartUpload request.

Regression

Is this issue a regression? No. This issue is also present in the official MinIO code, but they decided not to fix it: minio#13270

@rluetzner
Collaborator Author

Distributed locking might help us here. Something like this could be implemented using etcd.
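
A minimal sketch of such a lock, using etcd's client v3 concurrency package (the lock key layout and the idea of locking per upload ID are assumptions, not a settled design):

```go
// Sketch: a per-upload distributed lock via etcd, so only one instance runs
// the background append and creates a temp file. Key layout is an assumption.
package coordination

import (
	"context"
	"errors"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

// withUploadLock runs fn only if this instance acquires the per-upload lock;
// otherwise it returns immediately and leaves the work to the lock holder.
func withUploadLock(ctx context.Context, cli *clientv3.Client, uploadID string, fn func(context.Context) error) error {
	session, err := concurrency.NewSession(cli, concurrency.WithTTL(30))
	if err != nil {
		return err
	}
	defer session.Close()

	mu := concurrency.NewMutex(session, "/minio/locks/background-append/"+uploadID)

	// TryLock fails fast with ErrLocked when another instance already holds
	// the lock, so we neither duplicate its work nor its temp file.
	if err := mu.TryLock(ctx); err != nil {
		if errors.Is(err, concurrency.ErrLocked) {
			log.Printf("upload %s is already being appended elsewhere, skipping", uploadID)
			return nil
		}
		return err
	}
	defer mu.Unlock(context.Background())

	return fn(ctx)
}
```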

@rluetzner rluetzner linked a pull request Oct 18, 2021 that will close this issue
@rluetzner rluetzner removed a link to a pull request Oct 20, 2021
@rluetzner
Collaborator Author

Will also be solved by #19.

@rluetzner rluetzner mentioned this issue Apr 4, 2022
rluetzner pushed a commit that referenced this issue Apr 20, 2022
- Also simplify helper function to cleanup transitioned objects on expiry or when they are 'overwritten'
- Use expireTransitionedObjects in delete-object, multi-delete-objects and put-object API handlers
@rluetzner
Collaborator Author

#19 should improve the situation.

@rluetzner rluetzner reopened this Jun 1, 2022
@rluetzner
Collaborator Author

We decided to revert this change for now.

@rluetzner rluetzner mentioned this issue Jun 2, 2022