
Addition of generic pre build and post build hooks. #9892

Open
erikd opened this issue Apr 16, 2024 · 42 comments


@erikd
Member

erikd commented Apr 16, 2024

Describe the feature request
The ability to call shell scripts just before and just after each package is built.

At IOG, our use case for these is build caching, particularly in CI. Since these pre- and post-build hooks are just shell scripts, there are probably other uses for them.

Additional context
At IOG we have a number of very large Haskell projects with deep dependency trees that can take a long time to build in CI.

The obvious answer to long build times is caching of build products. A previous attempt at this caching was made, but that solution was not very satisfactory, because the cache was keyed on ${CPU}-${OS}-${GHC_VERSION}-${hash-of-dependencies}. The first three are obvious. The problem is hash-of-dependencies: if a single high-level dependency changes, there will be no cache hit and everything will be built from scratch. As it turns out, this is the most common scenario.

A better caching solution is one where the caching is done on individual dependencies rather than on all the dependencies as a huge blob. Caching individual dependencies means that when a high level dependency changes, there is a very high likelihood that all the lower level dependencies will still be found in the cache.

My initial implementation of this package-level caching was done as a simple wrapper around cabal that used rsync to fetch and save the cache over ssh to another machine. This proved highly effective and I was able to populate the cache from one machine and use it from another (both machines running Debian Linux).

However, @angerman came up with an even better solution that required adding the ability to run shell scripts before and after the build of each individual package. Using this feature (we have rough patches against cabal HEAD and version 3.10.3.0) we are able to use our own Amazon S3 storage for our cache. We do not propose to make this S3 storage public (obvious potential security problems) but any organization like ours or any individual could use their own S3 storage. I also have a working pair of pre and post build hooks that use ssh to a different machine as the storage backend.

The naive patch against HEAD (error handling could be improved, maybe the hook names could be made configurable) is:

diff --git a/cabal-install/src/Distribution/Client/ProjectBuilding/UnpackedPackage.hs b/cabal-install/src/Distribution/Client/ProjectBuilding/UnpackedPackage.hs
index 065334d5c..570c9b18c 100644
--- a/cabal-install/src/Distribution/Client/ProjectBuilding/UnpackedPackage.hs
+++ b/cabal-install/src/Distribution/Client/ProjectBuilding/UnpackedPackage.hs
@@ -678,7 +678,22 @@ buildAndInstallUnpackedPackage
           runConfigure
         PBBuildPhase{runBuild} -> do
           noticeProgress ProgressBuilding
-          runBuild
+          -- run preBuildHook. If it returns with 0, we assume the build was
+          -- successful. If not, run the build.
+          code <- rawSystemExitCode verbosity (Just srcdir) "preBuildHook" [
+              (unUnitId $ installedUnitId rpkg)
+            , (getSymbolicPath srcdir)
+            , (getSymbolicPath builddir)
+            ] `catchIO` (\_ -> return (ExitFailure 10))
+          when (code /= ExitSuccess) $ do
+            runBuild
+            -- not sure, if we want to care about a failed postBuildHook?
+            void $ rawSystemExitCode verbosity (Just srcdir) "postBuildHook" [
+                (unUnitId $ installedUnitId rpkg)
+              , (getSymbolicPath srcdir)
+              , (getSymbolicPath builddir)
+              ] `catchIO` (\_ -> return (ExitFailure 10))
+
         PBHaddockPhase{runHaddock} -> do
           noticeProgress ProgressHaddock
           runHaddock

An example of our preBuildHook script (kept in ~/.cabal/iog-hooks, S3 credentials pulled from s3-credentials.bash, the aws executable is from the awscli package) is as follows:

#!/usr/bin/env bash

# shellcheck disable=SC1091,SC2046
. $(dirname "${BASH_SOURCE[0]}")/s3-credentials.bash

CACHE_KEY="$1.tar.gz"

# Check if artifact exists in S3 (with endpoint and credentials)
aws s3 ls s3://"$CACHE_BUCKET"/"$CACHE_KEY" --endpoint-url "$AWS_ENDPOINT" > /dev/null 2>&1

# shellcheck disable=SC2181
if [ $? -eq 0 ]; then
  echo "S3 hit       $1"
  aws s3 cp s3://"$CACHE_BUCKET"/"$CACHE_KEY" - --endpoint-url "$AWS_ENDPOINT" | tar -xz
else
  echo "S3 miss      $1"
  exit 1
fi
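
A matching postBuildHook is not shown above; a minimal sketch, under the same assumptions (the same s3-credentials.bash, CACHE_BUCKET and AWS_ENDPOINT, and the hook receiving the unit id, source dir and build dir as arguments, as in the patch), might look like:

#!/usr/bin/env bash

# Sketch only: archive the build directory and upload it under the unit-id key.
# Exactly which paths need to be archived depends on the build layout.

# shellcheck disable=SC1091,SC2046
. $(dirname "${BASH_SOURCE[0]}")/s3-credentials.bash

CACHE_KEY="$1.tar.gz"
BUILD_DIR="$3"

echo "S3 upload    $1"
tar -czf - "$BUILD_DIR" | aws s3 cp - s3://"$CACHE_BUCKET"/"$CACHE_KEY" --endpoint-url "$AWS_ENDPOINT"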

To use the cache I run cabal as:

PATH=$PATH:$HOME/.cabal/iog-hooks cabal 
@erikd
Member Author

erikd commented Apr 16, 2024

Example output using the ssh storage backend:

...
Starting     criterion-1.6.3.0 (lib)
Building     criterion-1.6.3.0 (lib)
SSH miss     criterion-1.6.3.0-4239da68ccfe796a2fcfc4950e53b50c0b0b508391999faf9fb38d92e10aae86
SSH upload   criterion-1.6.3.0-4239da68ccfe796a2fcfc4950e53b50c0b0b508391999faf9fb38d92e10aae86
Installing   criterion-1.6.3.0 (lib)
Completed    criterion-1.6.3.0 (lib)

...

@fgaz
Member

fgaz commented Apr 16, 2024

Related: #7394

I think this would be a welcome feature, especially if it's written with the above issue in mind so it can be extended to other actions.

@erikd
Member Author

erikd commented Apr 16, 2024

I just did a test using ghc-9.6 on a machine with 12 cores and 48 GB of RAM (i.e. significantly more powerful than the expected CI build machines). The version of cabal being used is a patched 3.10.3.0.

With an empty S3 cache:

cabal clean
rm -rf ~/.cabal/store/ghc-9.6.4/
time PATH=$PATH:$HOME/.cabal/iog-hooks/ cabal build all

real    29m53.885s
user    105m12.177s
sys     11m5.029s

With a warm S3 cache (i.e. immediately after the above):

cabal clean
rm -rf ~/.cabal/store/ghc-9.6.4/
time PATH=$PATH:$HOME/.cabal/iog-hooks/ cabal build all


real    13m20.817s
user    37m24.021s
sys     6m58.173s

Without the S3 cache, but just using ~/.cabal/store/ghc-9.6.4/:

cabal clean
time cabal build all

real	4m0.510s
user	8m6.877s
sys	0m39.885s

@mpickering
Collaborator

Is there an advantage to doing this using a hook rather than populating the caches based on plan.json before starting to do any building?

It seems that an approach like cabal-cache's, based on plan.json, would be easier and wouldn't require modifying cabal-install.
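
For context, a plan.json-driven restore along those lines could be scripted roughly as below (a sketch: it assumes jq, the default dist-newstyle layout, and a hypothetical fetch-unit-from-cache helper that copies a cached unit into the store):

# Produce dist-newstyle/cache/plan.json without building anything.
cabal build --dry-run all

# Every unit cabal intends to build locally has "type": "configured".
jq -r '."install-plan"[] | select(.type == "configured") | .id' \
    dist-newstyle/cache/plan.json |
while read -r unit; do
  fetch-unit-from-cache "$unit" || true   # hypothetical helper; a miss is not fatal
done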

@newhoggy

newhoggy commented Apr 16, 2024

Does this issue propose to fix this?

Note that because builds are not reproducible in general (nor even necessarily ABI compatible) then it is essential that the loser abandon their build and use the one installed by the winner, so that subsequent packages are built against the exact package from the store rather than some morally equivalent package that may not be ABI compatible.

References:

@newhoggy

newhoggy commented Apr 16, 2024

There is another issue I've encountered: packages having absolute paths baked into them, which means they are non-transferable to other machines where the build location, store location, or some other part of the environment differs.

I have some examples of such packages:

➜  tmp aws s3 ls s3://cache.haskellworks.io/archive/v2/f58b96473c/ghc-9.8.2/
2024-03-01 21:17:12     104869 ansi-terminal-1.0.2-a01f3219a7a4a7a4eb77ff621a0ed6b6e450136b05407e626307f299190a71f4.tar.gz
2024-03-01 21:17:12     180356 ansi-terminal-types-0.11.5-6827843e420e0d028f9e48e791f67001efa3cb47d381e9b165cc8f607a821d78.tar.gz
2024-03-01 21:17:13     391194 colour-2.3.6-c937c1fae7c2f789054e5eb5c3a1da6e4af1f236fc621e02a544dc78739e0a37.tar.gz
2024-03-01 21:17:13     260291 concurrent-output-1.10.20-e5726e733cf6304131aaf02376ead2ea69cba2bd2247c84253034ffb886d3da1.tar.gz
2024-03-01 21:17:14     763100 happy-1.20.1.1-e-happy-7b1696a89248290667aee02ca6d97d8d7bc03d6bc936a5b79e3d972cf6149914.tar.gz
2024-03-01 21:17:15    1842811 hedgehog-1.4-f3dd734526516a61afde47d069e4ff9e6fb5b254943565727396b95e10a9d87a.tar.gz
2024-03-01 21:17:14     490535 hedgehog-extras-0.6.1.0-5b53412e9dce31c1bf0c3c1946a5f3a121c09d3f879ee270bbeb28933031de3d.tar.gz
2024-03-04 20:56:49     490469 hedgehog-extras-0.6.1.0-a717c8aeedaa81cf162a2936a8af86a26f5e25dfcf03b8c4768137b7dffbcce2.tar.gz
2024-03-01 21:17:15     704096 hsc2hs-0.68.10-e-hsc2hs-1022064a9b8047005f5a964e5ffc29aa050879f17b2d8b0353edd710087bd83b.tar.gz
2024-03-01 21:17:16     672138 optparse-applicative-0.18.1.0-29c379e4e141e0fc10805ddbd48a772bb236d5ed446942f8f177b5fa43bec33b.tar.gz
2024-03-01 21:17:16     291745 pretty-show-1.10-767af4c07eab91ce69eba890ed7ef3435f397b0b0db4843a5ff7993e73e3af84.tar.gz
2024-03-01 21:17:16     127976 prettyprinter-ansi-terminal-1.1.3-87e22b7729218bd826726f9ba761abe4444cfdcc37607e1594238643531220cd.tar.gz
2024-03-01 21:17:17     752879 tasty-1.5-f5a2ece4f2fd1a09fbad0832fdd84bf0497cee2ac510cd1c50565175637cf596.tar.gz
2024-03-01 21:17:17     141137 tasty-discover-5.0.0-5fcc9e3560b4cb8470b7c13d067c5390dc0a7993801da102d8092fdc5dc2b61c.tar.gz
2024-03-01 21:17:17     612782 tasty-discover-5.0.0-e-tasty-discover-bd4ef76688b231ecc5ef6f0d99866cf056e55532494977897dc649f679ed4368.tar.gz
2024-03-01 21:17:17      84577 tasty-hedgehog-1.4.0.2-343fcb9e5a2dae3fc4bb0ea138d1ed3e745ab9fbc7dba38e113eaceb9b922636.tar.gz

cabal-cache currently attempts to detect such cases and mitigates the problem by not transferring packages that would break for this reason.

@newhoggy

Is there an advantage to doing this using a hook rather than populating the caches based on plan.json before starting to do any building?

Brainstorming a bit:

I can see this potentially evolving to the point where, in CI or even on development machines, we don't build any of the dependencies because we can point to a caching build server. If we reach this point, the advantage is that for a large project, components that depend on already-cached dependencies can start building earlier whilst the remote build server is busy populating the cache for as-yet-unbuilt dependencies.

It could be a paid-for cloud service as well; maybe something the Haskell Foundation could run?

The advantage of this would be minimal idle time on the build client.

@angerman
Collaborator

There are a few advantages to hooks that you do not get from reconstructing the store outside of cabal. As @erikd showed, the patch is currently only ~20 lines of extra code in cabal's codebase, and it permits all kinds of uses for the hooks. You could run some time profiling with them, or anything else you'd like to do around builds.

  • By being at the location where it is, and effectively substituting GHC's build, you can omit the _Paths modules from the reconstruction and have GHC build the Paths modules for you.
  • It's not trying to reconstruct the store, and it leaves all the store logic (layout, ...) up to cabal, including .conf generation, copying and installing. (Most of the logic of cabal is retained in cabal, and the hooks do not have to concern themselves with any of that.)

ABI is a major issue. Luckily, we are getting close to making it irrelevant for us.

@angerman
Collaborator

One last note: these are generic hooks. Caching may be one use for them. But hooks into cabal have been requested (and discussed outside of issue trackers) multiple times.

@newhoggy

Can you elaborate on the _Paths modules? Are they still going to be included in the packages but patched by ghc after the fact? Or will they be gone from the packages?

@newhoggy

I wanted to understand whether pre/post build hooks are the only ones we need. For example, do we need a no-build hook for when the package was already built?

@angerman
Collaborator

@newhoggy caching is clearly only one instance of what could be done with these hooks. (And we likely want cabal to have more hooks!).

Can you elaborate on the _Paths modules? Are they still going to be included in the packages but ghc patches after the fact? Or will they be gone from the packages?

You can simply not back up any _Paths modules in the tarball you build, with --exclude='*_Paths.*' or similar.

When you restore the cached objects into the build folder, ghc will just build the modules that are missing. After that, cabal continues with copy/register/install.
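
Concretely, the exclude could be applied when the post-build hook creates its tarball, roughly like this (a sketch; BUILD_DIR stands for the build directory argument the hook receives, and the exact globs depend on how the generated Paths_<pkg> files are named):

# Leave the generated Paths_<pkg> sources/objects out of the cache tarball so
# that ghc rebuilds them after the cached objects are restored.
tar --exclude='Paths_*' --exclude='*_Paths.*' -czf "$1.tar.gz" "$BUILD_DIR"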

I wanted to understand if pre/post build hooks are the only ones we need. For example do we need a no build hook because the package was already built?

For this PoC, the pre/post hooks seem to be enough. However, you'd likely want many more hooks in cabal, hooking into different phases, for experimentation and extension.

@hasufell
Member

hasufell commented Apr 16, 2024

Related: #7394

I think this would be a welcome feature, especially if it's written with the above issue in mind so it can be extended to other actions.

Exactly. This has been requested/discussed since the dawn of time.

Shell hooks are as old as Unix and are the other major design pattern after "pipes". I'm not sure it even warrants a discussion... and we surely shouldn't be discussing specific use cases; there are infinitely many.

The question is more about:

  • where exactly do we want to insert hooks
  • what information should be exposed to the hooks

I think the last time it was discussed, it wasn't really clear where to insert hooks, because cabal has no clean architecture that strictly resembles the configure/build/install phases that traditional package managers have.

So this would be the only concern I see: which parts are stable enough architecturally that they won't break hooks in the future. Or do we need a refactor first?

@geekosaur
Collaborator

Another place I'd consider a cabal hook is SCM support (cf. #9883) — ideally we'd just drop a script somewhere rather than needing to add it into cabal-install and make a release. (Also goes for the discussion currently going on on Matrix about possibly having cabal sdist generate source-repository stanzas, although that one would go in the opposite direction.)

@michaelpj
Collaborator

One obvious concern with this is portability, right? At the moment most of the stuff that cabal does is limited to a known set of things that cabal handles and can be tested. Users can't currently inject random non-portable stuff, but these kinds of hooks would make it very easy to do that.

(Of course you can do this today with build-type: Custom or TH or something, but it's perhaps a bit trickier?)

@hasufell
Member

hasufell commented Apr 16, 2024

One obvious concern with this is portability, right?

I'm not sure I understand this statement.

User hooks are not in scope to be tested by cabal. It's irrelevant what users do with them. If a user manages to write a package that only works with some user hooks supplied, then that's simply a broken package.


What @geekosaur means, IMO, is a set of shell scripts that can act as pluggable SCM implementations. Cabal could e.g. ship them (or inline them) and allow users to override them if they want. This is commonly how source-based distro package managers are implemented. There's a clear separation between what is shipped and what the user wrote/configures.

But this is somewhat digressing from the original proposal. A user shell hook is just a shell script that's executed. It is not a replacement for any internal cabal phase.

@michaelpj
Collaborator

I meant that it's easier for users to accidentally make packages that are non-portable, since now they have to be sure those shell scripts will run in all situations where a downstream user is compiling their package.

@hasufell
Member

I meant that it's easier for users to accidentally make packages that are non-portable

I don't see how. Can you give an example? The shell hook doesn't change any internal data structures of cabal. The worst it could do is generate files that are needed for compilation, in which case it is a broken package.

I don't see why users would think shell hooks are the right place to do this.

@geekosaur
Collaborator

Because users will find a way to abuse anything you give them? </sysadmin>

@angerman
Collaborator

I meant that it's easier for users to accidentally make packages that are non-portable, since now they have to be sure those shell scripts will run in all situations where a downstream user is compiling their package.

If someone is relying on some random hooks existing on a downstream user's machine for their package to work, they are arguably doing it wrong. You do not ship hooks with your package.

Hooks are end user customizations, nothing that package authors should consider. This is not BuildHooks.

As an emacs package author you also do not rely on, or test all possible combinations of, hooks an end user might put into their config.

Adding hooks to your cabal install means you (the end user) get a modified cabal. That's on you. Always.

@erikd
Member Author

erikd commented Apr 17, 2024

Is there an advantage to doing this using a hook rather than populating the caches based on plan.json before starting to do any building?

Yes. The hook version is 30% slower than populating the cache based on plan.json and using rsync to move files from a remote cache machine. The cabal hook version is about 40% faster when populating an empty remote cache.

@erikd
Member Author

erikd commented Apr 17, 2024

Does this issue propose to fix this?

I believe so, yes. The cache uses <package-name>-<package-version>-<config-hash> as the cache key. For a given <package-name>-<package-version>, the <config-hash> will be different for each distinct set of cabal flags, CPU, OS, etc.

@gbaz
Collaborator

gbaz commented Apr 17, 2024

Not suggesting this per se, but would it be possible to "fake" this particular set of hooks by wrapping ghc itself?

@angerman
Collaborator

Not suggesting this per-se, but would it be possible to "fake" this particular set of hooks by wrapping ghc itself?

Yes. Conceptually you could. Though it's harder to say why you were called if you wrap GHC. You could try to use some heuristics to determine the caller; hooks are more explicit in that scenario.
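
To illustrate the heuristic problem: a wrapper has to guess the purpose of each ghc invocation from its flags, along the lines of the sketch below (an illustration only, assuming cabal passes -this-unit-id when building a library component; try-cache is a hypothetical helper):

#!/usr/bin/env bash
# Illustrative ghc wrapper, not a working cache: it only shows the guesswork
# needed to decide why ghc was invoked.
REAL_GHC="${REAL_GHC:-ghc}"

unit_id=""
prev=""
for arg in "$@"; do
  [ "$prev" = "-this-unit-id" ] && unit_id="$arg"
  prev="$arg"
done

if [ -n "$unit_id" ] && try-cache "$unit_id"; then
  exit 0   # pretend the build happened; fragile, for the reasons above
fi

exec "$REAL_GHC" "$@"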

@erikd
Member Author

erikd commented Apr 17, 2024

Not suggesting this per-se, but would it be possible to "fake" this particular set of hooks by wrapping ghc itself?

The problem with wrapping GHC is as follows. Consider two projects, one where you do want to use hooks and one where you don't. How does a wrapped GHC tell the difference? With the hooks version the difference is inherent.

@geekosaur
Collaborator

I find myself wondering how that would interact with other things that wrap ghc (such as HLS) as well.

@gbaz
Collaborator

gbaz commented Apr 17, 2024

The problem with wrapping GHC is as follows. Consider two projects, one where you do want to use hooks and one where you don't. How does a wrapped GHC tell the difference? With the hooks version the difference is inherent.

Well, you pass the ghc per project with the --with-ghc flag, which can go into a project file, just like a build hook can, I imagine.

I'm open to the hooks concept in general, but it seems like in this use case the "real" ask is "a ghc with high-grade distributed caching as part of the build process" so it feels more natural to me to 'fake' that directly.

@erikd
Member Author

erikd commented Apr 17, 2024

Well you pass the ghc per-project with the --with-ghc flag, which can go into a project file, just like a build hook can, I imagine.

The --with-ghc option in the project file would not work for us because all our projects (about 8 of them) build with multiple versions of ghc from 8.10.7 through to 9.8.1. We even build in CI with at least two versions of GHC.

@angerman
Collaborator

The bigger issue with --with-ghc is that it just globally gives the compiler to cabal. If you end up being called from cabal, you still don't know why: it could be for building, or for something else (cabal also likes to interrogate its tools).

@andreabedini
Collaborator

IMHO the best approach would be to better separate the planning and building phases, so that you could swap the building part with your own (or part of it, or add hooks, or whatever).
That said, this is not a new idea and I think recent refactors have made it easier to implement.

I think there are some considerations to make [1] but we should not bikeshed it to death.
@erikd I understand you have a PoC already; could you open a PR (even a draft) so we can discuss the details?

[1] e.g. I assume this is build as in ./Setup.hs build; compiling Setup.hs is a separate thing.

@erikd
Member Author

erikd commented Apr 17, 2024

@andreabedini Yes, currently kicking the tyres on the PoC. The complete cabal patch (against HEAD) is in the original post. I will raise it as a PR.

@erikd
Member Author

erikd commented Apr 17, 2024

PR is #9899. It is only a draft and still needs documentation, a changelog entry, etc.

@hasufell
Member

My main interest in hooks is interfacing with GHCup.

Which phase of cabal is early enough that:

  • we already know what GHC to use for the build
  • we haven't attempted solving or any configuration steps yet

This would allow users to emulate stack's behavior, where running cabal build may download the right GHC version and possibly HLS. It would also potentially allow removing lots of code from the vscode-haskell extension, which hacks around the lack of this in other ways.

@erikd
Member Author

erikd commented Apr 18, 2024

Which phase of cabal is early enough that:

These specific hooks I am proposing are not a solution to the problem you want to solve.

@erikd
Member Author

erikd commented May 2, 2024

I have been playing around with hooks located in ~/.cabal/hooks/ for a couple of weeks and, quite frankly, they are a pain in the neck.

Hooks should be project-specific and not global for the user. For instance, if I have some work projects and some personal projects, I may want to have different hooks for each, or have hooks for one and not for the other.

To me it makes more sense for cabal to look in a cabalHooks directory (located alongside the cabal.project file), although @angerman would prefer the hook directory to be named .cabal/hooks, like the way git hooks are located in .git/hooks/.
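
Under that project-local scheme the layout would presumably look something like this (hook names as in the patch above; the directory name is still under discussion):

my-project/
  cabal.project
  cabalHooks/          <- or .cabal/hooks/, per @angerman's preference
    preBuildHook
    postBuildHook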

@erikd
Member Author

erikd commented May 6, 2024

After using this for a couple of days, the project-local version is VASTLY better than the user-global version.

I have updated the PR: #9899

@ulysses4ever
Collaborator

@erikd given your last message on the PR #9899, do you think this issue should be closed as well?

@angerman
Collaborator

I've had some discussion with @erikd around security. I'll let him know he should leave a comment here.

@ulysses4ever
Collaborator

Thanks @angerman!

@erikd
Member Author

erikd commented Jan 16, 2025

@angerman and I have had some discussion about how to make hooks secure. We think it can be made secure enough that I consider it acceptable.

My current idea is that we add a file ~/.cabal/hook-security to the user's home directory. This file would look something like:

trust-always:
  /home/user/work/project/cabalHooks/first.sh

trust-sha256:
  /home/user/work/project/cabalHooks/second.sh <sha256>

This hook-security file should live in the user's home directory and not be part of a git checkout (and hence not be updated by a git pull/checkout operation). Hook files not listed in this file would not be trusted and would not be run.
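
As an illustration of the check (not the actual cabal implementation, which would be in Haskell; the file format above is still just a proposal), the decision could look like:

#!/usr/bin/env bash
# Sketch: only run a hook if ~/.cabal/hook-security lists it, either bare
# (trust-always) or together with its current sha256 (trust-sha256).
hook="$1"; shift
security="$HOME/.cabal/hook-security"
hash="$(sha256sum "$hook" | cut -d' ' -f1)"

# Note: this simple version matches whole lines anywhere in the file and
# does not track which section they appear under.
if grep -qxF "  $hook" "$security" || grep -qxF "  $hook $hash" "$security"; then
  exec "$hook" "$@"
fi

echo "Refusing to run untrusted hook: $hook" >&2
exit 1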

@angerman
Collaborator

Just one UX note on this, which got lost:

Hook files not listed in this file would not be trusted and not run.

Cabal would inform the user about an existing hook in the repo (or a changed hash of the hook) and, if we are in an interactive invocation, ask whether the user wants to trust it (y/N).

There is an untrusted hook: path/to/hook.sh. Do you want to trust it (y)es, (N)o, or (s)how? (y/N/s)

show would just "cat" the hook to stdout and then ask the question again.

@erikd
Member Author

erikd commented Jan 17, 2025

There is an untrusted hook: path/to/hook.sh. Do you want to trust it (y)es, (N)o, or (s)how? (y/N/s)

show would just "cat" the hook to stdout and then ask the question again.

cabal should probably avoid being interactive in cases like being run in CI. C has a function, isatty(), to detect whether a program is running interactively.
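
For a shell-level sketch of that check (cabal itself would do the equivalent in Haskell, e.g. via System.IO's hIsTerminalDevice; $hook below is just a placeholder for the hook path):

# Skip the interactive trust prompt when stdin is not a terminal, e.g. in CI.
if [ -t 0 ]; then
  read -r -p "There is an untrusted hook: $hook. Trust it? (y/N/s) " answer
else
  answer=N   # non-interactive: default to not trusting the hook
fi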
