buildah build: use the same overlay for the context directory for the whole build #5975

Open
wants to merge 5 commits into
base: main

nalind
Member

@nalind nalind commented Feb 5, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

Change how we handle bind mounts of the build context directory: instead of using a different overlay upper directory for each read-write mount into it, use a single one that lasts for the duration of the full build.

When we fixed CVE-2024-11218, we broke workflows which wrote content to the build context in one stage and then attempted to use archive or layout locations in the build context in a subsequent stage. This should get them working again. Because that content was often addressed using relative path names which are no longer correct for the written location, we try to compensate by rewriting some types of references and rejecting some others outright.

How to verify it

New integration test! And some updated ones, too!

Which issue(s) this PR fixes:

Proposed fix for #5952.

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Changes "written" to the build context directory during `buildah build` `RUN --mount=type=bind` instructions are no longer discarded between instructions. They are now discarded when the build completes, whether or not it completes successfully.

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress kind/feature Categorizes issue or PR as related to a new feature. labels Feb 5, 2025
Contributor

openshift-ci bot commented Feb 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nalind

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@cgwalters cgwalters left a comment

Thanks so much for working on this! I just skimmed the code, will try testing it soon

@@ -63,6 +63,13 @@ func MountWithOptions(contentDir, source, dest string, opts *Options) (mount spe
if err := os.Chown(upperDir, int(stat.Uid), int(stat.Gid)); err != nil {
return mount, err
}
times := []syscall.Timeval{

Why out of curiosity? Is this helping ensure reproducibility? We aren't serializing anything related to the timestamps of this directory to e.g. a tar stream are we?

Member Author

Without it, two different runs would appear to have different timestamps on the top level of the build context directory, breaking one of the tests added by #5691.

// indicates whether we did, in fact, mount an overlay; a cleanup function
// which should be called when the location is no longer needed (on success);
// and a non-nil fatal error if any of that failed.
func platformSetupContextDirectoryOverlay(store storage.Store, options *define.BuildOptions) (string, string, string, bool, func(), error) {

Return values are a bit unwieldy, maybe a struct instead? Although I guess the single caller unpacks them anyways, so doesn't matter. Yeah, one caller, so makes sense as is.

pkg/cli/build.go Outdated
@@ -113,7 +113,7 @@ func GenBuildOptions(c *cobra.Command, inputArgs []string, iopts BuildOptions) (
if c.Flag("build-context").Changed {
for _, contextString := range iopts.BuildContext {
av := strings.SplitN(contextString, "=", 2)
-if len(av) > 1 {
+if len(av) > 1 && av[0] != "" && av[1] != "" {

This is basically so we skip over -v : or something? This could use a comment; maybe we could even have a const DefaultBuildContextKey = "" and then reference it here?

Member Author

Pushing the check on av[1] down into GetAdditionalBuildContext() makes it a little easier to follow, added a comment here and put more checks in there.
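As a rough sketch of the kind of check being discussed, the snippet below splits a `--build-context` argument on the first `=` and rejects an empty name or value. The function name `parseBuildContext` and the error wording are hypothetical; how the real checks are divided between `GenBuildOptions()` and `GetAdditionalBuildContext()` is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBuildContext splits "name=value" on the first "=" and insists that
// both halves are non-empty. Hypothetical helper, for illustration only.
func parseBuildContext(arg string) (string, string, error) {
	av := strings.SplitN(arg, "=", 2)
	if len(av) != 2 {
		return "", "", fmt.Errorf("build context %q is not in name=value form", arg)
	}
	if av[0] == "" {
		return "", "", fmt.Errorf("build context %q has an empty name", arg)
	}
	if av[1] == "" {
		return "", "", fmt.Errorf("build context %q has an empty value", arg)
	}
	return av[0], av[1], nil
}

func main() {
	for _, arg := range []string{"extra=./dir", "=./dir", "extra=", "extra"} {
		name, value, err := parseBuildContext(arg)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted: name=%q value=%q\n", name, value)
	}
}
```

Because the split is limited to two fields, a value that itself contains `=` is kept intact.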

@@ -673,7 +673,7 @@ symlink(subdir)"
_prefetch busybox
run_buildah 125 build -t testbud3 $WITH_POLICY_JSON $BUDFILES/dockerignore3
expect_output --substring 'building.*"COPY test1.txt /upload/test1.txt".*no such file or directory'
-expect_output --substring $(realpath "$BUDFILES/dockerignore3/.dockerignore")
+expect_output --substring 'filtered out using /[^ ]*/.dockerignore'

It's not obvious to me how this test change is related...did the build there try to modify the context directory?

Member Author

The name of the directory which contains the .dockerignore file and which is included in the error message is no longer the one passed on the command line, so the exact string comparison began failing. It was either change the test to not care about the specific path, or lie about the file's location to get the test to pass as it was, so I changed the test.

@nalind nalind force-pushed the overlay-build-context branch 2 times, most recently from 1e2d39c to 82ee991 Compare February 6, 2025 21:11
@@ -916,6 +929,58 @@ func (s *StageExecutor) UnrecognizedInstruction(step *imagebuilder.Step) error {
return errors.New(err)
}

// do our best to ensure that image specifiers that include a transport that
// uses path names are scoped to the build context directory

It'd be secure and reliable to get a directory fd for the context overlay mount, and then use openat(..., RESOLVE_IN_ROOT) to access the file, then...the ugly part is that the containers-storage APIs want path strings still, but we could pass e.g. oci:/proc/self/fd/N right? I've done that elsewhere I think.

Member Author

Proper safety for such references is a much larger patch.
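To make the "best effort" scoping concrete, here is a minimal sketch of rewriting a path-based image reference so that it lands under the context directory. It is purely lexical, so it does not defend against symlinks inside the context, which is the gap the `openat(..., RESOLVE_IN_ROOT)` suggestion above is about. The function name `scopeReference` and the transport list are assumptions for illustration; they are not taken from this PR, and references with a trailing `:tag` are not handled.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// pathTransports lists transports whose references name filesystem paths.
// Assumption: an illustrative subset, not the list the PR actually uses.
var pathTransports = map[string]bool{
	"oci":            true,
	"oci-archive":    true,
	"docker-archive": true,
	"dir":            true,
}

// scopeReference rewrites "transport:path" so that path is resolved relative
// to contextDir, and rejects paths that would climb out of it. Absolute paths
// are treated as relative to contextDir. References using other transports,
// or no transport at all, are returned unchanged.
func scopeReference(contextDir, ref string) (string, error) {
	transport, rest, found := strings.Cut(ref, ":")
	if !found || !pathTransports[transport] {
		return ref, nil
	}
	// Join cleans its result, so ".." components are resolved lexically.
	joined := filepath.Join(contextDir, rest)
	rel, err := filepath.Rel(contextDir, joined)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("reference %q escapes the build context", ref)
	}
	return transport + ":" + joined, nil
}

func main() {
	for _, ref := range []string{"oci:./out.oci", "oci:../secret", "docker://busybox"} {
		scoped, err := scopeReference("/ctx", ref)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println(ref, "->", scoped)
	}
}
```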

@cgwalters

Trying to test this, but unrelated to the contents of this PR, I get the same problem trying git main:

$ ./bin/buildah from busybox
busybox-working-container
$ ./bin/buildah run busybox-working-container echo true
error running container: did not get container start message from parent: EOF
Error: setup network: pasta failed with exit code 1:
Couldn't open network namespace /proc/126984/ns/net: Permission denied
DEBU[0000] Running ["/usr/bin/crun" "create" "--bundle" "/var/tmp/buildah1984752272" "--pid-file" "/var/tmp/buildah1984752272/pid" "--no-new-keyring" "buildah-buildah1984752272"] 
DEBU[0000] waiting for parent start message             
DEBU[0000] pasta arguments: --config-net --dns-forward 169.254.1.1 -t none -u none -T none -U none --no-map-gw --quiet --netns /proc/126381/ns/net --map-guest-addr 169.254.1.2 
DEBU[0000] "/var/tmp/buildah1984752272/mnt/buildah-bind-target-10" is apparently not really mounted, skipping 
DEBU[0000] "/var/tmp/buildah1984752272/mnt/rootfs" is apparently not really mounted, skipping 
DEBU[0000] "/var/tmp/buildah1984752272/mnt" is apparently not really mounted, skipping 
error running container: did not get container start message from parent: EOF
DEBU[0000] Error building at step {Env:[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] Command:run Args:[touch /blah] Flags:[] Attrs:map[] Message:RUN touch /blah Heredocs:[] Original:RUN touch /blah}: setup network: pasta failed with exit code 1:
Couldn't open network namespace /proc/126381/ns/net: Permission denied 
Error: building at STEP "RUN touch /blah": setup network: pasta failed with exit code 1:
Couldn't open network namespace /proc/126381/ns/net: Permission denied

Is there something I need to do here to sync up the buildah binary with the default network config? Odd...

@cgwalters

Never mind: right after I posted, my "wait, these are weird denials" mental flag tripped, and of course it's SELinux; a chcon --reference /usr/bin/buildah bin/buildah fixes it.

@cgwalters

OK so this works for me:

FROM registry.access.redhat.com/ubi9/ubi:latest as builder
RUN --mount=type=bind,src=.,rw,dst=/buildcontext \
  dnf -y install skopeo && skopeo copy docker://busybox oci:/buildcontext/out.oci

FROM oci:./out.oci
# Need to reference builder here to force ordering.
RUN --mount=type=bind,from=builder,src=.,target=/var/tmp true

So that's already a lot cleaner, thanks!


There is an important note here that this does break compatibility with the previous dockerfiles, I get:

Error: building at STEP "RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,bind-propagation=shared dnf -y install skopeo && skopeo copy docker://busybox oci:/buildcontext/out.oci": resolving mountpoints for container "7f7333013ba83fbe05c78aad0ef4edd17663317f4daa775e549ac11d28412a90": rw: must not provide an argument for option

Looks like changing rw=true to just rw fixes things. Not sure if that's intentional or not? It does look like there's no mention of rw=true in the upstream Dockerfile syntax.


Now that I dig in a bit more, one unfortunate thing here is that the need to use the RUN to force ordering has meant that this whole time we've been subject to all the "injected content" bugs like #4242 and #5950.

When I unpack and inspect the resulting image, we have this final layer with

drwxr-xr-x 0/0               0 2025-02-06 16:44 etc/
-rwx------ 0/0               0 2025-02-06 16:44 etc/hostname
-rwx------ 0/0               0 2025-02-06 16:44 etc/hosts
-rwx------ 0/0               0 2025-02-06 16:44 etc/resolv.conf
drwxr-xr-x 0/0               0 2025-02-06 16:44 proc/
drwxr-xr-x 0/0               0 2025-02-06 16:44 run/
drwxr-xr-x 0/0               0 2025-02-06 16:44 sys/
drwxr-xr-x 0/0               0 2025-02-06 16:44 var/
drwxr-xr-t 0/0               0 2025-02-06 16:44 var/tmp/

with floating timestamps. That said, in this case it actually works to do buildah build --timestamp=<pinned>, and that won't also change all the timestamps on the inherited layers. So that's a good mitigation.

But the sad thing is that these things just keep piling on. We now have a new one of these in Konflux, which injects content_sets and currently makes no attempt to generate timestamp-reproducible data either. (It uses buildah too, so it inherits that, but we still need to ensure that the timestamps of the files we actually write, as opposed to synthetic container runtime artifacts, are canonicalized too.)

So in the end, I think we'll definitely try to

@nalind
Member Author

nalind commented Feb 6, 2025

> There is an important note here that this does break compatibility with the previous dockerfiles, I get:
>
> Error: building at STEP "RUN --mount=type=bind,rw=true,src=.,dst=/buildcontext,bind-propagation=shared dnf -y install skopeo && skopeo copy docker://busybox oci:/buildcontext/out.oci": resolving mountpoints for container "7f7333013ba83fbe05c78aad0ef4edd17663317f4daa775e549ac11d28412a90": rw: must not provide an argument for option
>
> Looks like changing rw=true to just rw fixes things. Not sure if that's intentional or not? It does look like there's no mention of rw=true in the upstream Dockerfile syntax.

Not being strict about the arguments for --mount contributed to CVE-2024-9407. I added more checks in that area in #5925, and they would have landed in v1.39.0.
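A strict check of this kind might look something like the following sketch, which rejects `key=value` for options that are meant to be bare flags. The set of flag options (`rw`, `ro`) and the function names are assumptions for illustration, and this is not the parser from #5925; only the error wording mirrors the message quoted above.

```go
package main

import (
	"fmt"
	"strings"
)

// flagOptions are options that must be given without an argument.
// Assumption: an illustrative subset, not buildah's actual list.
var flagOptions = map[string]bool{"rw": true, "ro": true}

// checkMountOptions walks a comma-separated --mount specification and
// returns an error for the first flag option that carries an argument.
func checkMountOptions(spec string) error {
	for _, opt := range strings.Split(spec, ",") {
		key, _, hasArg := strings.Cut(opt, "=")
		if flagOptions[key] && hasArg {
			return fmt.Errorf("%s: must not provide an argument for option", key)
		}
	}
	return nil
}

func main() {
	for _, spec := range []string{
		"type=bind,src=.,rw,dst=/buildcontext",
		"type=bind,rw=true,src=.,dst=/buildcontext",
	} {
		if err := checkMountOptions(spec); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", spec)
	}
}
```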

@nalind nalind force-pushed the overlay-build-context branch 3 times, most recently from 63f0cda to 5b05378 Compare February 13, 2025 19:26
@nalind nalind marked this pull request as ready for review February 13, 2025 20:17
In addition to setting the (usually recently-created) upper directory's
ownership and permissions to match those of the lower (the location of
which wasn't known when the upper was created), set the timestamps to
match, too.

Signed-off-by: Nalin Dahyabhai <[email protected]>
When chown()ing the upper directory to match the lower directory, if the
ownership of the lower directory is the overflow UID:GID, ignore EINVAL.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Mount a read-write overlay directory over the build context directory to
restore the ability to use it as a covert cache of sorts during the
lifetime of the build, but in a way that still ensures that we don't
modify the real build context directory.

N.B.: builds where FROM in one stage referenced a relative path which
had been written to a bind-mounted default build context directory by an
earlier stage broke when we started making those bind mounts into
overlays to prevent/discard modifications to that directory, and while
this extends the lifetime of that overlay so that it's consistent
throughout the build, those relative path names are still going to point
to the wrong location.

Since we need to determine SELinux labeling before mounting the overlay,
go ahead and calculate the labels to use before creating the first
builder, and remove the logic that had whichever stage thought it was
the first one set them in its parent object for use by other stages, in
what was probably a racy way.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Try to limit which image transports we accept in stages, and scope the
ones that use path names to the context directory.  At some point
anything that isn't an image ID or pullable spec should start being
rejected.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Add a missing "not" to an error message.

Signed-off-by: Nalin Dahyabhai <[email protected]>
@nalind nalind force-pushed the overlay-build-context branch from 5b05378 to d349c14 Compare February 18, 2025 22:42
Labels
approved kind/feature Categorizes issue or PR as related to a new feature.
2 participants