feature(local-disks): Add support for raid10 on instance storage #1549
Conversation
force-pushed from ab2e1d7 to 26731d7
My only concern here is the initial resync bit. The documentation implies that an initial resync is never really necessary, but also... is? I don't know enough about RAID to make a call one way or the other, but we need to be confident that the array will be healthy and performant in a production environment before we merge this.
RAID5 does need an initial resync per the documentation too, and it is not safe to skip it there; I would scope the raid5 case out here, though. What happens in an initial resync is going through each block on each disk of a mirror and writing a zero to ensure the disks are identical. Nothing more. This is actually even harmful for SSDs. The reason the initial resync isn't "strictly" needed is that when there's a write to the filesystem, the block is written to all the mirrored disks anyway, so there's no potential for inconsistent data on reads. Is there a performance concern if one disk is trimmed and one disk is not? I guess there could be, but we also have a documented[2] guarantee that the disks are fully trimmed. Some other corner cases could come up in recovery situations where we have a disk that has data and we try a recovery. On the other hand, that's a situation we should never end up in with AWS Instance Storage. Is there something more I could do to bring some confidence?
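For readers following along: the resync state being discussed here can be inspected from userspace. A minimal sketch, assuming an md array at `/dev/md0` (the array name is an assumption, not something this PR defines):

```bash
# Show all md arrays; while an initial resync is running, a progress
# line like "resync = 12.3% ... finish=87.4min" appears here.
cat /proc/mdstat

# Per-array detail, including the current sync action (idle/resync/recover).
mdadm --detail /dev/md0
cat /sys/block/md0/md/sync_action
```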
That's my concern; if someone is counting on this redundancy and it doesn't work when they need it to, that's pretty terrible. That being said, this will likely always be opt-in. Can you add some notes to the doc that, for now, this option is "experimental" or "use at your own risk" until we have some feedback in the wild?
This doesn't affect redundancy, because there's no way to replace a single SSD on the instance. The whole instance will be replaced at once, and then all the disks are clean ones. This just helps with graceful termination, so that workloads can be migrated away before the instance is replaced.
Yes, makes sense, will do.
Ah, I see! Thanks. I think we're good to go after the doc changes. 👍
force-pushed from 26731d7 to 2f1bcc4
Would that do? https://github.com/awslabs/amazon-eks-ami/pull/1549/files#diff-da030e6a3883ea1c1765967f4609e0e675451db33557df79c65c2a07b02c8e6fR390
@cartermckinnon would we be happy with the doc changes mentioned above?
This LGTM. Can you rebase to address the conflicts? This PR got caught in the lurch of our branch rename.
force-pushed from 2f1bcc4 to ea76763
This removes the wait block for raid resync for two reasons:

1) raid0 does not have redundancy and therefore no initial resync[1]
2) with raid10, the resync time for 4x 1.9TB disks takes from tens of minutes to multiple hours, depending on the sysctl params `dev.raid.speed_limit_min` and `dev.raid.speed_limit_max` and the speed of the disks. Initial resync for raid10 is not strictly needed[1]

Filesystem creation: by default `mkfs.xfs` attempts to TRIM the drive. This can also take tens of minutes or hours, depending on the size of the drives. TRIM can be skipped, as instances are delivered with disks fully trimmed[2].

[1] https://raid.wiki.kernel.org/index.php/Initial_Array_Creation
[2] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ssd-instance-store.html#InstanceStoreTrimSupport
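For context, the throughput limits named in the commit message are global kernel sysctls. A minimal sketch of inspecting and raising them (the value shown is illustrative, not from this PR):

```bash
# Current kernel-wide RAID resync throughput bounds, in KiB/s per device.
sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max

# Raising the floor makes a pending resync finish sooner at the cost of
# competing with regular I/O; 200000 KiB/s is an arbitrary example value.
sudo sysctl -w dev.raid.speed_limit_min=200000
```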
force-pushed from ea76763 to 3276ef3
@cartermckinnon now rebased
Well.. this is embarrassing, but there was an ASG depending on my branch with the old path before the repo restructuring, so I think I have to create a new pull request, as the PR branch can't be changed as far as I can tell. Sorry about that 😅 Opened a new one: #1900
Description of changes:
I would like to be able to migrate workloads away from a node gracefully in the case of an instance storage drive failure. RAID10 would provide redundancy at the cost of disk space.
Adding support for creating raid10 in addition to raid0. This also removes the wait block for raid resync, for two reasons:

1. raid0 does not have redundancy and therefore no initial resync[1]
2. with raid10, the resync time for 4x 1.9TB disks takes from tens of minutes to multiple hours, depending on the sysctl params `dev.raid.speed_limit_min` and `dev.raid.speed_limit_max` and the speed of the disks. Initial resync for raid10 is not strictly needed[1]

Filesystem creation: by default `mkfs.xfs` attempts to TRIM the drive. This is also something that can take tens of minutes or hours, depending on the size of the drives. TRIM can be skipped, as instances are delivered with disks fully trimmed[2] (a sketch of both shortcuts follows below).

[1] https://raid.wiki.kernel.org/index.php/Initial_Array_Creation
[2] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ssd-instance-store.html#InstanceStoreTrimSupport
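To make the two shortcuts concrete, here is a hedged sketch of creating a raid10 array without the initial resync and formatting it without the mkfs-time TRIM. The device names are hypothetical, and `--assume-clean` is one way to skip the resync, not necessarily the exact mechanism this PR's script uses:

```bash
# Hypothetical instance-store NVMe devices.
DISKS="/dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1"

# Create a raid10 array; --assume-clean skips the initial resync. This
# leans on the guarantee that instance-store disks arrive fully trimmed
# (reading as zeroes), so all mirror halves are already identical[2].
mdadm --create /dev/md0 --level=10 --raid-devices=4 --assume-clean $DISKS

# -K tells mkfs.xfs not to discard (TRIM) blocks at mkfs time,
# relying on the same fully-trimmed guarantee[2].
mkfs.xfs -K /dev/md0
```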
Testing Done

On `m6id.metal`, with kernel defaults:

With increased resync limits:
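A hedged sketch of how such a timing comparison could be run (device names are assumptions; `mdadm --wait` blocks until any running resync finishes):

```bash
# Create the array *with* an initial resync, then time how long the
# resync takes under the current dev.raid.speed_limit_* settings.
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
  /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1
time mdadm --wait /dev/md0
```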