From 7413f0592f62eb0f1fa5b9add72664a9bf9961bd Mon Sep 17 00:00:00 2001 From: Sambhav Kothari Date: Thu, 15 Apr 2021 22:37:03 +0100 Subject: [PATCH 1/5] Add a build-write flag for layers Signed-off-by: Sambhav Kothari --- text/0000-build-write-flag.md | 116 ++++++++++++++++++++++++++++++++++ 1 file changed, 116 insertions(+) create mode 100644 text/0000-build-write-flag.md diff --git a/text/0000-build-write-flag.md b/text/0000-build-write-flag.md new file mode 100644 index 000000000..8248f82ce --- /dev/null +++ b/text/0000-build-write-flag.md @@ -0,0 +1,116 @@ +# Meta +[meta]: #meta +- Name: Write flag for build layers +- Start Date: 2021-04-15 +- Author(s): [@samj1912](https://github.com/samj1912) +- RFC Pull Request: (leave blank) +- CNB Pull Request: (leave blank) +- CNB Issue: (leave blank) +- Supersedes: (put "N/A" unless this replaces an existing RFC, then link to that RFC) + +# Summary +[summary]: #summary + +This RFC proposes the introduction of a `build-write` flag in addition to the `build` flag. The idea behind this flag is to expose build layers that are both read-write (if `build-write` is set to `true`) or build layers that are read-only (if `build-write` is set to `false`) for subsequent buildpacks. + +# Definitions +[definitions]: #definitions + +Build layers: Layers from a specific buildpack that are available to subsequent buildpacks during the build process. +layerize: The process of converting a layer directory to a tarball which will be used to create the final application image during the export phase. + +# Motivation +[motivation]: #motivation + +## Why should we do this? + +Currently the spec forbids layer modifictions by buildpacks that don't contribute a specific layer. This is not enforced by the lifecycle and leads to cases where subsequent buildpacks can unknowningly modify layer content as a side-effect of executing binaries provided by a build layer. Such unexpected modifications are also really hard to debug and pin-point the case as it can be really difficult to figure out which commands from subsequent buildpacks are causing them. + +On the other hand there are valid use cases for build layers that can be used as a collaborative workspace for multiple buildpacks and need to be cached in subsequent runs. + +## What use cases does it support? + +- Allowing buildpacks to explicitly indicate how the layers provided by them should be used and having this be guaranteed by the lifecycle. + +### Example use case for read-only build layers + +A build process consists of 2 buildpacks - a python buildpack and a pip buildpack + +1. Buildpack #1 provides a python binary that can be used by subsequent buildpacks but is also used in the app image +2. Buildpack #2 uses the python binary to invoke pip to provide further dependencies. + +Unbeknownst to the buildpack author, python has a side-effect on execution where if it's installation directory is writable, it creates new cache objects or updates existing files. As a result, execution of buildpack #2 causes the content of the layers provided by buildpack #1 to change, causing the final layer imageID to change and hampers layer reproducibility and new layers have to be pushed out to the registry. + +This could easily be fixed if buildpack #1 only exposed its build layers as read-only to subsequent buildpacks or if it discarded the modifications made by subsequent buildpacks essentially making it "read-only" in the context of the build process. + +### Example use case for read and write build layers + +There are various use cases when you may want a common workspace that multiple buildpacks can collaborate on during the build process and this workspace needs to be cached, hence it can be a temporary directory. + +In such a case a buildpack could provide a layer that is marked as `build-writable` i.e. subsequent buildpacks can modify its contents and the final state of the layer at the end of the build process is what is exported out. + +## What is the expected outcome? + +The spec and lifecycle is modified to support the above use cases. + +# What it is +[what-it-is]: #what-it-is + +New layer types that dictate when the given layer directories are layerized. For layers marked as `build-write` to be `false`, they should be layerized immediately after the buildpack that provided the layer is finished with its build process. This way any modification made by the future buildpacks would be ignore while still making this layer available to future buildpacks. + +For layers that are marked as `build-write` as true, they should be layerized at the end of the entire build phase of the lifecycle so any changes made by subsequent buildpacks are present in the final app image. + +# How it Works +[how-it-works]: #how-it-works + +The lifecycle will have to layerize the layers marked `build-write` as `false` during the `build` phase of the lifecycle instead of the `export` phase. + +# Drawbacks +[drawbacks]: #drawbacks + +Possibly mixing the concerns of the `build` and the `export` phases. + +# Alternatives +[alternatives]: #alternatives + +- What other designs have been considered? + +We could change the the layer so that it is owned by root or a different user between the build steps of different buildpacks so that subsequent buildpacks cannot modify the layers created by other buildpacks. This would require elevated privileges during the build phase of the lifecycle. + + +- Why is this proposal the best? + +This proposal allows you to achieve the above use cases without changing the expected UID/GID of the layers and without elevated permissions during the build phase. + +- What is the impact of not doing this? + +We could have unknown layer modifications by buildpacks which is currently a spec violation. + +# Prior Art +[prior-art]: #prior-art + +TODO + +# Unresolved Questions +[unresolved-questions]: #unresolved-questions + +The impact this would have on the performance of the export step. The exact details of which parts of the lifecycle build and export phases that need to be changed. + +# Spec. Changes (OPTIONAL) +[spec-changes]: #spec-changes + +Layer content meteadata TOML - s + +```toml +[types] + launch = false + build = false + # New flag + # If build is set to false, this flag will have no effect. + # The default is false. + build-write = false + cache = false + +[metadata] +# buildpack-specific data +``` From 55481973041cbe934ce656ba7e68d35fa48d604b Mon Sep 17 00:00:00 2001 From: Sambhav Kothari Date: Wed, 5 May 2021 16:50:15 +0100 Subject: [PATCH 2/5] Add examples for build-write flag and add a caveat Signed-off-by: Sambhav Kothari --- text/0000-build-write-flag.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/text/0000-build-write-flag.md b/text/0000-build-write-flag.md index 8248f82ce..66379e716 100644 --- a/text/0000-build-write-flag.md +++ b/text/0000-build-write-flag.md @@ -16,8 +16,8 @@ This RFC proposes the introduction of a `build-write` flag in addition to the ` # Definitions [definitions]: #definitions -Build layers: Layers from a specific buildpack that are available to subsequent buildpacks during the build process. -layerize: The process of converting a layer directory to a tarball which will be used to create the final application image during the export phase. +- Build layers: Layers from a specific buildpack that are available to subsequent buildpacks during the build process. +- layerize: The process of converting a layer directory to a tarball which will be used to create the final application image during the export phase. # Motivation [motivation]: #motivation @@ -49,6 +49,10 @@ There are various use cases when you may want a common workspace that multiple b In such a case a buildpack could provide a layer that is marked as `build-writable` i.e. subsequent buildpacks can modify its contents and the final state of the layer at the end of the build process is what is exported out. +One example might be an Android SDK buildpack that sets up ANDROID_SDK_ROOT (pointing to a layer), which a Gradle buildpack may later write into when gradle runs. + +Another example can be a CCache buildpack that sets up `CCACHE_DIR` (pointing to a layer), which a CMake buildpack may use for its build cache. + ## What is the expected outcome? The spec and lifecycle is modified to support the above use cases. @@ -68,7 +72,8 @@ The lifecycle will have to layerize the layers marked `build-write` as `false` d # Drawbacks [drawbacks]: #drawbacks -Possibly mixing the concerns of the `build` and the `export` phases. +- Possibly mixing the concerns of the `build` and the `export` phases. +- Added complexity of more flags for users. # Alternatives [alternatives]: #alternatives From e4b78ec10258bbba933c9ca0161ffc4a87036fb8 Mon Sep 17 00:00:00 2001 From: Sambhav Kothari Date: Wed, 19 May 2021 18:31:19 +0100 Subject: [PATCH 3/5] Move to a read only layers RFC Signed-off-by: Sambhav Kothari --- ...flag.md => 0000-read-only-build-layers.md} | 41 ++++--------------- 1 file changed, 9 insertions(+), 32 deletions(-) rename text/{0000-build-write-flag.md => 0000-read-only-build-layers.md} (63%) diff --git a/text/0000-build-write-flag.md b/text/0000-read-only-build-layers.md similarity index 63% rename from text/0000-build-write-flag.md rename to text/0000-read-only-build-layers.md index 66379e716..b828f69eb 100644 --- a/text/0000-build-write-flag.md +++ b/text/0000-read-only-build-layers.md @@ -1,6 +1,6 @@ # Meta [meta]: #meta -- Name: Write flag for build layers +- Name: Make build layers read only - Start Date: 2021-04-15 - Author(s): [@samj1912](https://github.com/samj1912) - RFC Pull Request: (leave blank) @@ -11,7 +11,9 @@ # Summary [summary]: #summary -This RFC proposes the introduction of a `build-write` flag in addition to the `build` flag. The idea behind this flag is to expose build layers that are both read-write (if `build-write` is set to `true`) or build layers that are read-only (if `build-write` is set to `false`) for subsequent buildpacks. +This RFC proposes that build layers should be read-only for subsequent buildpacks. + +> NOTE: This RFC should be implemented as an atomic change alongside RFC https://github.com/buildpacks/rfcs/pull/163 # Definitions [definitions]: #definitions @@ -43,16 +45,6 @@ Unbeknownst to the buildpack author, python has a side-effect on execution where This could easily be fixed if buildpack #1 only exposed its build layers as read-only to subsequent buildpacks or if it discarded the modifications made by subsequent buildpacks essentially making it "read-only" in the context of the build process. -### Example use case for read and write build layers - -There are various use cases when you may want a common workspace that multiple buildpacks can collaborate on during the build process and this workspace needs to be cached, hence it can be a temporary directory. - -In such a case a buildpack could provide a layer that is marked as `build-writable` i.e. subsequent buildpacks can modify its contents and the final state of the layer at the end of the build process is what is exported out. - -One example might be an Android SDK buildpack that sets up ANDROID_SDK_ROOT (pointing to a layer), which a Gradle buildpack may later write into when gradle runs. - -Another example can be a CCache buildpack that sets up `CCACHE_DIR` (pointing to a layer), which a CMake buildpack may use for its build cache. - ## What is the expected outcome? The spec and lifecycle is modified to support the above use cases. @@ -60,20 +52,17 @@ The spec and lifecycle is modified to support the above use cases. # What it is [what-it-is]: #what-it-is -New layer types that dictate when the given layer directories are layerized. For layers marked as `build-write` to be `false`, they should be layerized immediately after the buildpack that provided the layer is finished with its build process. This way any modification made by the future buildpacks would be ignore while still making this layer available to future buildpacks. - -For layers that are marked as `build-write` as true, they should be layerized at the end of the entire build phase of the lifecycle so any changes made by subsequent buildpacks are present in the final app image. +Updates to the lifecycle so that `build` layers created by buildpacks should be layerized immediately after the buildpack that provided the layer is finished with its build process. This way any modification made by the future buildpacks would be ignore while still making this layer available to future buildpacks. # How it Works [how-it-works]: #how-it-works -The lifecycle will have to layerize the layers marked `build-write` as `false` during the `build` phase of the lifecycle instead of the `export` phase. +The lifecycle will have to layerize the layers marked `build` as `true` during the `build` phase of the lifecycle instead of the `export` phase. # Drawbacks [drawbacks]: #drawbacks - Possibly mixing the concerns of the `build` and the `export` phases. -- Added complexity of more flags for users. # Alternatives [alternatives]: #alternatives @@ -101,21 +90,9 @@ TODO The impact this would have on the performance of the export step. The exact details of which parts of the lifecycle build and export phases that need to be changed. +How to handle the cases when users want a common build workspace that is cached. This will be covered in a subsequent RFC and that RFC and this RFC should be implemented as an "atomic" change to avoid existing use cases from being broken without a proper migration path. + # Spec. Changes (OPTIONAL) [spec-changes]: #spec-changes -Layer content meteadata TOML - s - -```toml -[types] - launch = false - build = false - # New flag - # If build is set to false, this flag will have no effect. - # The default is false. - build-write = false - cache = false - -[metadata] -# buildpack-specific data -``` +None. \ No newline at end of file From 041b1a51bf7ffbdf568a5787088b8f089b4d8ea2 Mon Sep 17 00:00:00 2001 From: Sambhav Kothari Date: Thu, 1 Jul 2021 08:09:59 +0100 Subject: [PATCH 4/5] Clarify output location for exported layers Signed-off-by: Sambhav Kothari --- text/0000-read-only-build-layers.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-read-only-build-layers.md b/text/0000-read-only-build-layers.md index b828f69eb..a08a9a99d 100644 --- a/text/0000-read-only-build-layers.md +++ b/text/0000-read-only-build-layers.md @@ -59,6 +59,8 @@ Updates to the lifecycle so that `build` layers created by buildpacks should be The lifecycle will have to layerize the layers marked `build` as `true` during the `build` phase of the lifecycle instead of the `export` phase. +The layerized output from the `build` phase of lifecycle could be stored at `//@exported` (which should not clash with any buildpack IDs) where ``. The exporter will just read from this to construct the final image. + # Drawbacks [drawbacks]: #drawbacks @@ -92,7 +94,7 @@ The impact this would have on the performance of the export step. The exact deta How to handle the cases when users want a common build workspace that is cached. This will be covered in a subsequent RFC and that RFC and this RFC should be implemented as an "atomic" change to avoid existing use cases from being broken without a proper migration path. -# Spec. Changes (OPTIONAL) +# Spec. Changes [spec-changes]: #spec-changes None. \ No newline at end of file From 6cbf750deda7498efc5d21904e34b7f97ab838e7 Mon Sep 17 00:00:00 2001 From: Sambhav Kothari Date: Thu, 22 Jul 2021 12:38:45 +0100 Subject: [PATCH 5/5] Add rfc feedback and add alternative proposal for different buildpack UIDs Signed-off-by: Sambhav Kothari --- text/0000-read-only-build-layers.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/0000-read-only-build-layers.md b/text/0000-read-only-build-layers.md index a08a9a99d..36b1b133f 100644 --- a/text/0000-read-only-build-layers.md +++ b/text/0000-read-only-build-layers.md @@ -57,10 +57,13 @@ Updates to the lifecycle so that `build` layers created by buildpacks should be # How it Works [how-it-works]: #how-it-works -The lifecycle will have to layerize the layers marked `build` as `true` during the `build` phase of the lifecycle instead of the `export` phase. +The lifecycle will have to layerize all the necessary buildpack layers (either `launch = true` or `cache = true`) during the `build` phase of the lifecycle instead of the `export` phase. The layerized output from the `build` phase of lifecycle could be stored at `//@exported` (which should not clash with any buildpack IDs) where ``. The exporter will just read from this to construct the final image. +For example the output could look like `/@exported//.tar` and `/@exported//.diffId` + + # Drawbacks [drawbacks]: #drawbacks @@ -73,6 +76,7 @@ The layerized output from the `build` phase of lifecycle could be stored at `/