suspend_resume_single: clear pool errors on fail #17054

Open
wants to merge 1 commit into base: master

Member

@robn robn commented Feb 14, 2025

[Sponsors: Klara, Inc., Wasabi Technology, Inc.]

Motivation and Context

While working on something, I kept hitting failures in this test. And yes, I had bugs, but it sucked that instead of the test failing, the entire kernel would just hang.

The upshot is that if the timing is unfortunate, the pool can suspend just as the test is failing because the pool hadn't suspended. If we don't resume the pool at that point, we hang trying to destroy it.

Description

Clear errors (resuming the pool) before trying to destroy the pool and clean up.
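
For illustration only, here is a minimal sketch of the ordering described above. It is not the diff in this PR: the `cleanup_pool` helper and the `ZPOOL` variable are hypothetical names, and the real test's cleanup may do more (or different) work. The only point is that `zpool clear` runs before `zpool destroy`, so a pool that suspended late is resumed before we try to tear it down.

```shell
# Hypothetical helper, not taken from the test suite.
# ZPOOL is an assumed override so the command can be swapped out.
ZPOOL=${ZPOOL:-zpool}

cleanup_pool() {
    pool=$1

    # Clear errors first. On a suspended pool this attempts to resume it.
    # Failure is ignored: the pool may not be suspended, or may not exist.
    "$ZPOOL" clear "$pool" 2>/dev/null || true

    # Destroy only if the pool is still imported.
    if "$ZPOOL" list -H -o name "$pool" >/dev/null 2>&1; then
        "$ZPOOL" destroy -f "$pool"
    fi
}
```

Whether the clear actually sticks presumably depends on the state of the underlying device at that point; the sketch makes no attempt to restore it.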

How Has This Been Tested?

I used to hit it fairly often in ZTS runs; now I don't (once the attendant bugs were fixed).

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

If the timing is unfortunate, the pool can suspend just as we're failing
because it didn't suspend. If we don't resume the pool, we hang trying
to destroy it.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <[email protected]>
@robn robn mentioned this pull request Feb 18, 2025
13 tasks
Member

@amotin amotin left a comment

I have no objections, but I wonder what cases it is supposed to handle, especially if you don't recover the block device before it.
