add support for asynchronous building and pushing of profiles #271

PhilTaken · 2024-04-18T21:39:58Z

NB: I have opened this Pull Request as a draft since I intend to continue working on it by improving the logging output.
Since the builds are started concurrently, the build progress information is pretty much useless and flickers a lot.

Problem

The current implementation builds every single profile synchronously, including remote builds.
Since remote builds were introduced back in #175, they could be pipelined to deploy
new configurations in a more time-efficient manner.

Solution

Build configurations that support remote builds concurrently.

Sidenote: I have decided to continue building local builds in a synchronous manner because I have run into hardware deadlocks previously when trying to evaluate and/or build multiple systems at the same time.

I have tested this code by deploying to my personal infrastructure (https://github.com/philtaken/dotfiles) and it works as intended.

rvem · 2024-04-22T10:16:40Z

src/cli.rs

+    // await both the remote builds and the local builds to speed up deployment times
+    try_join!(
+        // remote builds can be run asynchronously since they do not affect the local machine
+        try_join_all(remote_builds.into_iter().map(|data| async {


From try_join_all docs:

If any future returns an error then all other futures will be canceled and an error will be returned immediately

IMO, it might be better to wait for all futures to be completed/failed instead of canceling everything on the first failure, thus the non-failed builds will complete despite the error in an unrelated build. What do you think?
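
To make the difference concrete, here is a minimal std-only sketch of the "wait for everything, then report" behaviour. It deliberately uses scoped threads instead of futures so it runs without an async runtime; in the async code the closest counterpart would presumably be `futures::future::join_all`, which yields every result instead of short-circuiting. `build` and `build_all` are hypothetical names, not functions from this PR.

```rust
use std::thread;

/// Hypothetical stand-in for one remote build: hosts named "bad*" fail.
fn build(host: &str) -> Result<String, String> {
    if host.starts_with("bad") {
        Err(format!("{host}: build failed"))
    } else {
        Ok(format!("{host}: built"))
    }
}

/// Run every build to completion, then split the outcomes into
/// (successes, failures). No build is cancelled because another one failed.
fn build_all(hosts: &[&str]) -> (Vec<String>, Vec<String>) {
    let results: Vec<Result<String, String>> = thread::scope(|s| {
        // Start all builds first so they run concurrently.
        let handles: Vec<_> = hosts
            .iter()
            .map(|&host| s.spawn(move || build(host)))
            .collect();
        // Joining in spawn order keeps the results in input order.
        handles
            .into_iter()
            .map(|h| h.join().expect("build thread panicked"))
            .collect()
    });

    let mut succeeded = Vec::new();
    let mut failed = Vec::new();
    for result in results {
        match result {
            Ok(msg) => succeeded.push(msg),
            Err(err) => failed.push(err),
        }
    }
    (succeeded, failed)
}
```

Calling `build_all(&["alpha", "bad-beta", "gamma"])` still finishes `alpha` and `gamma`, and the caller can decide afterwards whether the single failure should abort the deployment.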

> Since the builds are started concurrently, the build progress information is pretty much useless and flickers a lot.

I'm not sure how viable it is to implement, but perhaps it's possible to collect stdout and stderr for each future. If it is, we can collect and report detailed output synchronously once all remote builds are completed.
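
A rough sketch of that idea, assuming each build boils down to running an external command: buffer stdout and stderr per host with `std::process::Command::output`, run the commands concurrently, and only format the report once everything has finished. Threads stand in for futures here, and `Captured`, `run_captured`, `run_all` and `report` are hypothetical names.

```rust
use std::io;
use std::process::Command;
use std::thread;

/// Buffered output of one build command (hypothetical helper type).
struct Captured {
    host: String,
    success: bool,
    stdout: String,
    stderr: String,
}

/// Run `program args...` for `host`, buffering its output instead of
/// letting it write to the shared terminal.
fn run_captured(host: &str, program: &str, args: &[&str]) -> io::Result<Captured> {
    let out = Command::new(program).args(args).output()?;
    Ok(Captured {
        host: host.to_string(),
        success: out.status.success(),
        stdout: String::from_utf8_lossy(&out.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&out.stderr).into_owned(),
    })
}

/// Run all jobs concurrently and return their captured output in input order.
fn run_all(jobs: &[(&str, &str, Vec<&str>)]) -> Vec<io::Result<Captured>> {
    thread::scope(|s| {
        let handles: Vec<_> = jobs
            .iter()
            .map(|(host, program, args)| s.spawn(move || run_captured(host, program, args)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("capture thread panicked"))
            .collect()
    })
}

/// Render one block per host after all commands have completed.
fn report(results: &[io::Result<Captured>]) -> String {
    let mut text = String::new();
    for result in results {
        match result {
            Ok(c) => {
                let status = if c.success { "ok" } else { "FAILED" };
                text.push_str(&format!(
                    "== {} [{}] ==\n{}{}",
                    c.host, status, c.stdout, c.stderr
                ));
            }
            Err(e) => text.push_str(&format!("== could not start command: {e} ==\n")),
        }
    }
    text
}
```

The trade-off is that nothing is shown while the builds run, so this would suit a final summary better than live progress.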

> IMO, it might be better to wait for all futures to be completed/failed instead of canceling everything on the first failure, thus the non-failed builds will complete despite the error in an unrelated build. What do you think?

I'll see how easy it would be to implement that. If it is easy, it might be worth doing so that subsequent builds don't have to build as much.

> perhaps, it's possible to collect stdout and stderr for each future, if it is, we can collect and report detailed output synchronously once all remote builds are completed

I was considering that maybe one line from the bottom per build would be neat e.g.:

[previous output]
host 1: [build progress]
host 2: [build progress]

This will, however, require some more fine-grained control over the terminal output, i.e. raw output instead of cooked.
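
One possible way to sketch the "one line per build at the bottom" layout is with plain ANSI escape sequences: move the cursor back up over the previous status lines, erase each line, and rewrite it. This assumes an ANSI-capable terminal, a fixed set of hosts, and that nothing else writes to the terminal between frames. `render_frame` is a hypothetical helper.

```rust
/// Build the escape sequence that redraws one status line per host.
///
/// `previous_lines` is the number of status lines printed by the last
/// frame (0 for the very first frame).
fn render_frame(statuses: &[(&str, &str)], previous_lines: usize) -> String {
    let mut frame = String::new();
    if previous_lines > 0 {
        // CSI n A: move the cursor up n lines, back to the first status line.
        frame.push_str(&format!("\x1b[{previous_lines}A"));
    }
    for (host, progress) in statuses {
        // "\r" returns to column 0 and CSI 2 K erases the whole line
        // before the new text is written over it.
        frame.push_str(&format!("\r\x1b[2K{host}: {progress}\n"));
    }
    frame
}
```

Printing `render_frame(&statuses, 0)` once and then `render_frame(&statuses, statuses.len())` on every update keeps the earlier output above the status block untouched.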

> would be neat

Agree, but indeed sounds quite non-trivial

Oh, another somewhat conceptual concern. Profiles can potentially depend on each other (see the profilesOrder flake option), so perhaps it's worth doing the parallelization on a per-host basis instead of per-profile.
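
A minimal sketch of per-host parallelization, assuming the profiles arrive as already ordered (host, profile) pairs: group them by host, run the hosts concurrently, and build each host's profiles strictly in order, stopping at the first failure for that host. Threads stand in for futures, and `group_by_host` and `build_per_host` are hypothetical names.

```rust
use std::thread;

/// Group (host, profile) pairs by host, keeping hosts in first-seen order
/// and each host's profiles in their original sequence.
fn group_by_host(pairs: &[(&str, &str)]) -> Vec<(String, Vec<String>)> {
    let mut groups: Vec<(String, Vec<String>)> = Vec::new();
    for (host, profile) in pairs {
        let pos = groups.iter().position(|(h, _)| h.as_str() == *host);
        match pos {
            Some(i) => groups[i].1.push(profile.to_string()),
            None => groups.push((host.to_string(), vec![profile.to_string()])),
        }
    }
    groups
}

/// Build hosts in parallel, but the profiles of one host sequentially.
fn build_per_host<F>(
    groups: &[(String, Vec<String>)],
    build: F,
) -> Vec<(String, Result<(), String>)>
where
    F: Fn(&str, &str) -> Result<(), String> + Sync,
{
    let build = &build;
    thread::scope(|s| {
        let handles: Vec<_> = groups
            .iter()
            .map(|(host, profiles)| {
                s.spawn(move || {
                    // Stop at the first failure: later profiles of this
                    // host may depend on earlier ones.
                    let outcome = profiles.iter().try_for_each(|p| build(host, p));
                    (host.clone(), outcome)
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("build thread panicked"))
            .collect()
    })
}
```

A failure on one host leaves the other hosts unaffected, while the ordering within a host is preserved.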

PhilTaken · 2024-06-10T18:12:42Z

mentioning #46 for visibility

the current version works as expected, the only issue is the log flickering with multiple invocations of nix writing to stdout in parallel. I was considering utilising the raw-internal-json output similarly to how nix-output-monitor does it but that might be out of scope here 🤔

PhilTaken · 2024-06-10T18:13:32Z

activation is fully synchronous but that is usually the part that takes the least amount of time

rvem · 2024-06-14T10:51:22Z

src/cli.rs

+        async {
+            // run local builds synchronously to prevent hardware deadlocks
+            for data in &local_builds {
+                deploy::push::build_profile(data).await.unwrap();


I'm not a huge fan of using "partial functions"; isn't an unhandled panic from unwrap going to kill the main thread?
Also, AFAICS, at the moment we completely ignore non-remote build/push results.
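
As an illustration of the alternative, here is a small sketch that propagates the first local build failure to the caller with `?` instead of panicking. The error type and the `build` callback are hypothetical stand-ins; the real `deploy::push::build_profile` signature is not assumed here.

```rust
use std::fmt;

/// Hypothetical error type recording which profile failed and why.
#[derive(Debug, PartialEq)]
struct LocalBuildError {
    profile: String,
    reason: String,
}

impl fmt::Display for LocalBuildError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "local build of `{}` failed: {}", self.profile, self.reason)
    }
}

impl std::error::Error for LocalBuildError {}

/// Build profiles one after another and return how many were built.
/// The first failure is returned to the caller rather than unwrapped.
fn build_local_sequentially<F>(
    profiles: &[&str],
    mut build: F,
) -> Result<usize, LocalBuildError>
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut built = 0;
    for profile in profiles {
        build(profile).map_err(|reason| LocalBuildError {
            profile: profile.to_string(),
            reason,
        })?;
        built += 1;
    }
    Ok(built)
}
```

Because the function returns a `Result`, the caller can combine it with the remote build results instead of silently dropping it.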

rvem · 2024-06-14T10:51:51Z

The new approach seems good 👍

PhilTaken · 2024-09-13T06:07:31Z

There are still a lot of unwraps that need to be handled better. This PR is still far from finished, but progress has been made:

I added some progress indicators using the indicatif crate for each concurrent build 🥳

This is what they look like (imagine the spinner spinning and the text changing as nix builds stuff):

For remote and local builds, each host gets its own progress spinner that is reused for all of that host's profiles, since they are built in order anyway. This could be adjusted for multi-profile hosts since indicatif also supports nested tree-like progress bars.

I changed some data structures around to make working with them in async contexts easier.
The nested references are somewhat OK until you have to make them work in an async context where you need Send objects. Deep cloning by hand really isn't a great solution, and it really doesn't make any difference performance-wise to add some clones here and there.
Nobody is going to deploy to >10k hosts and if they would, cloning the deploy data a few times would be the least of their problems.
This also removes a bunch of generic lifetime specifiers which I am not sad about.
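
The trade-off can be sketched with a pair of hypothetical structs (these are not the PR's actual types): a borrowed view carries a lifetime parameter and cannot be moved into a detached task, while an owned clone is `Send + 'static`. `std::thread::spawn` is used below because it runs without a runtime; `tokio::spawn` places the same `Send + 'static` requirement on the future it is given.

```rust
use std::thread;

/// Borrowed view: cheap, but tied to the lifetime of the data it points at.
struct DeployRef<'a> {
    node_name: &'a str,
    profile_name: &'a str,
}

/// Owned counterpart: no lifetime parameter, so it is `Send + 'static`.
#[derive(Clone, Debug, PartialEq)]
struct DeployOwned {
    node_name: String,
    profile_name: String,
}

impl DeployRef<'_> {
    /// Pay for two small allocations to get rid of the borrow.
    fn to_owned_data(&self) -> DeployOwned {
        DeployOwned {
            node_name: self.node_name.to_string(),
            profile_name: self.profile_name.to_string(),
        }
    }
}

/// Move the owned data into a detached thread. This would not compile
/// with `DeployRef<'a>` unless `'a` were `'static`.
fn describe_in_background(data: DeployOwned) -> String {
    thread::spawn(move || format!("{}.{}", data.node_name, data.profile_name))
        .join()
        .expect("worker thread panicked")
}
```

The clones cost a few string allocations per profile, which is negligible next to evaluating and building a system.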

PhilTaken · 2024-09-18T11:50:16Z

With these two last commits I would declare this PR somewhat usable 🥳

If anyone wants to try it out with their own setups I would be very happy to hear how it works for you, I am always open for feedback of any kind

ManoftheSea · 2024-09-18T16:50:30Z

Hi all,
I tried this out, with my setup in ManoftheSea/SeaofDirac/trunk, on targets [littlecreek, crunchbits, technetium]. I saw that it ran the checks for all the systems, which meant building them locally, then proceeded with pushing profiles to the systems in parallel (which resulted in building them remotely, I believe). After all three were ready, they deployed in sequence, and as it was just an update, they all completed happily.

This works well for these three systems, which are a pair of VPSes and a local (to me) server; were I trying to deploy all the systems in my home network, I would prefer instead that my development machine (personal laptop) be able to push the profiles to a single server (technetium), so that it can then act as a substituter for e.g. aluminium and nickel. Maybe provide a "build-host" attribute per-host, with a default of the host itself? e.g. deploy.nickel.buildHost = "technetium"; deploy.littlecreek.buildHost = "littlecreek";

PhilTaken requested a review from rvem April 18, 2024 21:39

rvem reviewed Apr 22, 2024

PhilTaken force-pushed the phil/async-build-and-push branch from bb4a111 to 1cc6e35 on June 10, 2024 17:48

rvem reviewed Jun 14, 2024

PhilTaken force-pushed the phil/async-build-and-push branch from 52f5c53 to dd7ec8c on June 19, 2024 09:39

PhilTaken requested a review from rvem June 20, 2024 18:54

PhilTaken and others added 4 commits September 11, 2024 13:24

build and push profiles asynchronously

91ce286

build remote builds (remote_build = true) asynchronously to speed up the deployment process. local builds should not be run asynchronously to prevent running into hardware deadlocks

group remote builds by host

a70a339

continue building when a remote build fails and report errors afterwards

c00037e

add indicatif progress indicators for remote and local builds

42ae995

PhilTaken force-pushed the phil/async-build-and-push branch from dd7ec8c to 42ae995 on September 13, 2024 05:59

PhilTaken added 2 commits September 18, 2024 13:25

wrap the logger with a custom implementation

02b9681

the custom implementation handles indicatif's progressbars better so as to not leave orphaned progress bar fragments when logging while a progressbar is active

improved error handling for async deployments

7fa8db8

PhilTaken marked this pull request as ready for review September 30, 2024 11:27
