-
Notifications
You must be signed in to change notification settings - Fork 561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds capability to provision directories on the EFS dynamically #732
base: master
Are you sure you want to change the base?
Adds capability to provision directories on the EFS dynamically #732
Conversation
Hi @jonathanrainer. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@Ashley-wenyizha Hi, could I get an approved to test on here so I can start sorting out the E2E tests? |
@Ashley-wenyizha @wongma7 @jsafrane I truly believe a feature like this PR is critical for the EFS CSI driver to remain useful, and I would greatly appreciate your thoughts on this PR. With the current approach, people will often get themselves into a situation where they are suddenly unable to provision new PVCs (once they hit 120 EFS "Access Points" in an AWS Account) and then have no idea what is wrong, let alone be able to fix it without moving to another CSI driver. The |
/ok-to-test |
Great work. We are using the #640 for a while, but the 120 access points are a real blocker for us, so this looks like a better solution for our problem in the near future. We only have two problems, and one of them, as you said in the description are solved by the #640 and that is the
Using the #640 we are able to do this. The other problem we see, is that, since you are not using access points, all the files in the EFS are written as root user. Why not use a single access point per storage class to do all the work? Thanks for the hard work and hope this two PR's are merged soon. |
@jonathanrainer Thanks for this PR. It solves a big problem for us where we can't use APs because we don't use Amazon DNS. I took this and merged with #687 (fixing some merge conflicts along the way) and they both work great together. One thing, which may or may not be related to the combined PR on my end, but for directory provisioning, when the SC reclaimPolicy: delete the created directory does not get deleted. If I set basePath: /foo/bar when the PVC and PV are deleted, /foo/bar/pvc-unique-id directory still remains. |
Thanks for this @jonathanrainer! I've done a build of this and deployed to our cluster where I've force-updated the EFS storage class to use directory provisioning. Existing PVs all still mount correctly via access points and new PVs are being provisioned correctly using directories. |
Hello, any news on this? I think it's critical to merge it for this driver's future. |
@jonathanrainer can you please rebase the PR? |
@jsafrane Sorry I've kind of let this one get away from me a bit, I've just changed job but should be able to dedicate some time to this later in the week. Think it needs a rebase and the E2Es need looking at but it's nearly there. |
7a5e93e
to
f7664a7
Compare
8126091
to
26f4763
Compare
Now we have an AccessPoint and Directory provisioner they should live in different files with their own tests. We also need an OsClient to stop us having to run the tests as root which could cause issues in CI/CD. Sets the directory permissions to 777 by default and doesn't set an owner to fix the problem of how users would know what UID/GID to use. Further fixes a problem where if an unmount failed it would delete the contents of the EFS.
3a3e2c1
to
10ffc84
Compare
@jonathanrainer: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@Ashley-wenyizha Hi, have rebased this but I know now the E2Es are failing (as is everything else), I'm not planning to do work to fix this yet because I want to wait for the other PR to merge anyway as I think that might be quite a complicated rebase as it is. Once that's done I'll rebase and fix everything up at the same time and then it should be ready for review properly. Once that's done I'll post the one-pager as well and then it can all be reviewed together. Is that ok? |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Any update on when this feature might be available? |
@jonathanrainer I am sorry this PR takes more that one year, can you please rebase again? |
Hi @jonathanrainer. Any chance you will have time for above in the near future? Your work is much appreciate⭐ |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
@jonathanrainer would you be able to rebase this PR? |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
/remove-lifecycle rotten |
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
This PR was the result of a hackathon at my current company as well as some work outside to sure up the unit tests and E2E test the final product.
Is this a bug fix or adding new feature?
New feature, requested in #538 and #517 and possibly other issues, this comes up a lot as a feature people would like.
What is this PR about? / Why do we need it?
Since its creation the driver has supported Access Point Provisioning as its main means of dynamic provisioning. However this is problematic for a few reasons, it can causes issues with user permissions around deleting provisioned directories and there is a hard limit of 120 AccessPoints per EFS which for some use cases is very quickly depleted. Further it requires various AWS IAM Permissions that can be complicated to sort out and manage.
This PR allows a new provisioning mode called
efs-dir
so that instead of creating EFS access points, directories are created instead. This new provisioning mode supports all of the previous parameters (UID, GID, GID Ranges etc.) and so is very easy to transition to, or run alongside the access point provisioning style if preferred. At present the directories are named after the PersistentVolume that is created to ensure a unique directory name and not run into security problems but once #640 merges there's no reason why the same method wouldn't apply to this provisioning method as well with a little bit of extra work. Either this PR can be updated or #640 can be extended if this merges first.This is achieved by creating a new Interface called Provisioner that is implemented by an AccessPointProvisioner (the original method) and a DirectoryProvisioner (the new method) this also allows in future for different kinds of provisioning to occur, maybe FileSystemProvisioning for example.
What testing is done?
I've moved around some of the Unit Tests that used to relate to
controller.go
into new files that reference each of the provisioners. These all run 🟢 and I'm fairly happy I've covered most of the important cases. Overall I'm not 100% sure about the placement of the tests and the coverage but it definitely covers all the new code and the code that existed and replicates existing behaviour.I've also run this through on my own EKS cluster with the
deleteProvisionedDir
flag on and off and it performs exactly as I anticipated it would. Directories get provisioned, and then when the PVC gets deleted they either are deleted or remain. I've also written E2E tests to run in CI for creation and deletion but will need the PR to be approved before I can road test those.fixes #538