Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unlock output paths when a derivation is already built #6469

Merged
merged 1 commit into from
Nov 16, 2023

Conversation

gbpdt
Copy link
Contributor

@gbpdt gbpdt commented Apr 29, 2022

Without this change, nix build processes will not drop the locks for derivation goals
which have already been built by another process when the current process gets
round to building them. This means the locks are held until the process
terminates.

If there are other nix build processes in a similar state, they will also try to
acquire the same locks when they try to build the same derivation, and so will
wait until the lock holder terminates (which might be a very long time if it has
a lot to build). In some pathological cases, those processes might be holding
their own locks on other derivations due to the same issue, and this can lead to
deadlock.

Resolves #6468

@@ -627,6 +627,7 @@ void DerivationGoal::tryToBuild()
if (buildMode != bmCheck && allValid) {
debug("skipping build of derivation '%s', someone beat us to it", worker.store.printStorePath(drvPath));
outputLocks.setDeletion(true);
outputLocks.unlock();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible this might be better placed in done, or perhaps buildDone could be used (I'm not too familiar with the code).

@edolstra
Copy link
Member

edolstra commented May 2, 2022

This means the locks are held until the process terminates.

I don't think this is the case in general. outputLocks will be released via RAII from the goal's destructor. However, this may take a long time for top-level goals.

@matthewbauer
Copy link
Member

Goals are only destructed at the end of Store::buildPaths, so it will wait for all sibling goals to finish too IIUC. It seems like we should unlock these paths any time AlreadyValid comes up.

@gbpdt
Copy link
Contributor Author

gbpdt commented May 3, 2022

Goals are only destructed at the end of Store::buildPaths, so it will wait for all sibling goals to finish too IIUC.

Yes that matches the behaviour we are seeing. We have a top-level goal with a fairly 'wide' graph of dependencies underneath it. We see already-built goals in the middle of the graph hang around for some time while their siblings are being built, sometimes long enough to cause deadlock with other nix build processes building a similar dependency graph.

It seems like we should unlock these paths any time AlreadyValid comes up.

I took a look through the other cases and I couldn't immediately see any others that actually take the locks, although I may well have missed something. It looks like the other parts that might take locks (e.g. repairing) create new derivation goals which should go through the same code paths.

@gbpdt
Copy link
Contributor Author

gbpdt commented May 13, 2022

We've been running this patch in our production Nix (2.3) builds for over a week now and have observed no deadlocks or other problems, so I'd advocate merging it.

@bjornfor
Copy link
Contributor

I've also seen deadlocks, with nix-2.7.0. I wasn't able to reproduce the issue outside of the production env, but hoping it's the same issue and that this fixes it.

@gbpdt
Copy link
Contributor Author

gbpdt commented Sep 16, 2022

@edolstra any way we could get this merged for a future Nix release?

@gbpdt gbpdt changed the title Proposal: Unlock output paths when a derivation is already built Unlock output paths when a derivation is already built Feb 1, 2023
@fricklerhandwerk fricklerhandwerk added store Issues and pull requests concerning the Nix store bug performance and removed bug labels Mar 3, 2023
@fricklerhandwerk
Copy link
Contributor

fricklerhandwerk commented Apr 28, 2023

Triaged in Nix team meeting:

@gbpdt please move the unlock to done, and we can merge it.

Side notes:

  • @edolstra: ideally we'd rearchitect and split Goals and GoalResults so that the Goal destructor releases resources quickly
  • @thufschmitt: that would be nice but involve a hairy refactoring, so it's not an immediate priority

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-04-28-nix-team-meeting-minutes-50/27698/1

@github-actions github-actions bot removed the store Issues and pull requests concerning the Nix store label Sep 9, 2023
Without this change, nix build processes will not drop the locks for derivation goals
which have already been built by another process when the current process gets
round to building them. This means the locks are held until the process
terminates.

If there are other nix build processes in a similar state, they will also try to
acquire the same locks when they try to build the same derivation, and so will
wait until the lock holder terminates (which might be a very long time if it has
a lot to build). In some pathological cases, those processes might be holding
their own locks on other derivations due to the same issue, and this can lead to
deadlock.

Resolves NixOS#6468
@gbpdt
Copy link
Contributor Author

gbpdt commented Sep 9, 2023

Triaged in Nix team meeting:

@gbpdt please move the unlock to done, and we can merge it.

Apologies @fricklerhandwerk, I missed this, thanks for taking a look! I've made the change (just moving to done, I didn't try to consolidate any other call sites of outputLocks.unlock()). PTAL

@thufschmitt thufschmitt merged commit 7ba4e07 into NixOS:master Nov 16, 2023
8 checks passed
tebowy pushed a commit to tebowy/nix that referenced this pull request Jul 11, 2024
Unlock output paths when a derivation is already built

(cherry picked from commit 7ba4e07)
Change-Id: I9de077679290d5141a610ac43d99d3a43acff87c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

nix build does not unlock when a derivation is already built
7 participants