Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed approach for build steps in deps which are not in make node #1236

Open
mhdawson opened this issue Feb 22, 2024 · 10 comments
Open

Proposed approach for build steps in deps which are not in make node #1236

mhdawson opened this issue Feb 22, 2024 · 10 comments

Comments

@mhdawson
Copy link
Member

mhdawson commented Feb 22, 2024

Basic principles

  1. Pulling specific versions of tools to to build dependencies from npm is ok
    npm guarantees immutability, and guarantees to give you the same thing for a given version

  2. Code that gets pulled into Node.js deps

  • Ideally source code is built/transformed in Node.js build (make node)

  • Should be able to build from canonical source

    • Project should have copy of canonical source, GitHub is not immutable, and no guarantee that is is available later
    • Project should be able to float patch on dep if needed
      • Should be able to reproduce same result
        • Need to know and have access to specific versions of tools
      • No dynamic downloads
  • Couple of categories that are currently worrying in terms of these principles

    • WASM blobs
    • Minified JavaScript
    • TypeScript

Proposed approach for build steps/transforms that cannot done in make node

  • Source for transform/build steps outside of make node is is location/repo under the control of the Node.js project
    • Could be:
      • deps subdirectory
      • fork of repo
      • copy of tarball
      • etc.
    • updater will store copy of canonical source in one of the above, build will be done from versioned copies under control of project.
  • All build/transform steps run in Docker container
  • Docker container contains all tools, etc., needed other than npm dependencies
    • We store copy of container in location under control of project
    • We store copy of dockerfile used to build container along with container
  • Build step run by updater uses known container, known canonical source

Next Steps:

  • Figure out how we would store containers
  • Figure out how we would store tarballs
  • Look at undici and outline how we would implement proposed approach above
@mhdawson
Copy link
Member Author

From discussion in last session related to auditing build deps with @marco-ippolito, and earlier discussion with @richardlau as well.

@mhdawson
Copy link
Member Author

@nodejs/security-wg, this has been on the agenda a few times, we assume there are no concerns/objections if we don't hear by next week and move to the next step of PR's the the basic principles into Node.js doc to get broader feedback.

@mhdawson
Copy link
Member Author

mhdawson commented May 2, 2024

Looks like we should be able to use ghcr.io to store container images. https://github.com/features/packages, https://docs.github.com/en/packages/learn-github-packages/introduction-to-github-packages

@mhdawson
Copy link
Member Author

mhdawson commented May 3, 2024

This repo shows some experimentation on createing a repo to generate wasm builder images and storing them in ghcr.io

https://github.com/mhdawson/node-wasm-build/

There is a GitHub action to build and publish. Together with the Dockerfile it tries to capture/keep all key data about what went into the build.

@mhdawson
Copy link
Member Author

mhdawson commented May 3, 2024

Next step will be to look at using the image created to build undici, llhttp instead of building the docker container on the fly as part of the builds for undici, lltttp

@marco-ippolito
Copy link
Member

I agree with the approach, we would need to add to the source code on node a folder with the dockerfile (I assume its going to be one that covers all the dependencies but there could be more) and use it in the github action where we do the dependency update

Copy link
Contributor

This issue has been inactive for 90 days. It will be closed in 14 days unless there is further activity or the stale label is taken off.

@github-actions github-actions bot added the stale label Aug 22, 2024
@mhdawson mhdawson removed the stale label Aug 22, 2024
@mhdawson
Copy link
Member Author

Sorry for not getting to this sooner. Still plan to work on it.

@mhdawson
Copy link
Member Author

Spent some time on this over the last few days.

Newer repo that builds/releases container - https://github.com/mhdawson/wasm-builder

  • uses release-please to manage releases
  • builds multi arch docker contain to ghcr for each release that supports both x86 and arm

Fork/Branch of undici which uses the container to build, should work on arm based macs with docker, but I don't have one to try.

Next I want to look at how the wasm is built for the module lexer to see how/if the same container can be used

@mhdawson
Copy link
Member Author

Ok, I have now built the three dependencies that directly use wasm and a branch with the changes that were need for each of them.

The approach is based on what was being used in undici as it was the most consistent/reproduciable to start in my opinion.

The changes for each of them:

For undici and cjs-module-lexar the WASM built seems to be the same as before.
For amaro the WASM is different, I believe because the sytem installed rust is a lower version than the latest that was being used previously.

The common container is built/published by - https://github.com/mhdawson/wasm-builder which we would move under the Node.js organization if we move forward with this approach.

The advantages include:

  • persistent container so we can rebuild with exactly the same environment/versions of tools if needed at a later time
  • metadata for versions of tools used captured and available (in metadata directory in the container) and captured in a consistent way across dependencies. (and earlier attempt to do this in the undici repo was broken by later updates)
  • hopefully will reduce the different versions/tools that are used across the dependencies to build WASM for a given Node.js release
  • deps can pin versions used based on hash of container they pull

Possible disadvantage

  • Container contains superset of tools need to build any one dep. Right now thinking that is a reasonable tradeoff to only have to build a single wasm-builder container.

One thing to note is that the approach (inherited from what undici was doing) relies on the system packages in alpine for most of the components. The advantage is that you should get a set of tools that work together and they are easy to install. The disadvantage is that you cannot necessarily get the very latest version of a component only the versions which were built for the version of alpine on which the base container is installed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants