Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add OCI functions for signing #79

Merged
merged 1 commit into from
Oct 9, 2024

Conversation

LaurentGoderre
Copy link
Member

@LaurentGoderre LaurentGoderre commented Sep 12, 2024

This creates jq function for validating different OCI content into modular pieces that can be reused. This makes the build YAML file more concise and easy to read.

@whalelines whalelines changed the title Add OCI functions fort signing Add OCI functions for signing Sep 12, 2024
Copy link
Collaborator

@whalelines whalelines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question about the naming

oci.jq Outdated Show resolved Hide resolved
oci.jq Show resolved Hide resolved
whalelines
whalelines previously approved these changes Sep 16, 2024
Copy link
Collaborator

@whalelines whalelines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@tianon tianon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments/notes - I've tried to leave them in per-function batches so they're hopefully easier to digest / discuss. I'm certain I've missed some of the detail on the justifications, so please do ask clarifying questions anywhere I've been unclear! 😅 ❤️

oci.jq Outdated Show resolved Hide resolved
oci.jq Outdated Show resolved Hide resolved
oci.jq Show resolved Hide resolved
oci.jq Show resolved Hide resolved
oci.jq Outdated Show resolved Hide resolved
oci.jq Outdated
Comment on lines 130 to 139
def image_digest($os; $arch):
if length != 1 then
error("unexpected image index document count: " + (length | tostring))
else .[0] end
| validate_index_type
| .manifests[]
| select(.platform.os == $os and .platform.architecture == $arch)
| .digest
;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the use case for this image_digest function? It seems like it's dual purpose "validate the underlying image index" and "extract the (platform specific) image manifest digest" ?

I think in the case of buildkit/buildx output, we can be both more specific and more prescriptive here. 🤔

ie, I'm thinking we can validate that the index itself contains at least 1 but at most 2 entries, that the second one specifically is our expected provenance object (via annotations / platform), and then we can safely assume the first must be the image (perhaps with some basic media type / size validation). 🤔

(those are all assumptions we have, but that aren't currently validated anywhere and should be OK if they're broken, but if we're assuming them anyhow, it's valuable to validate them, especially if we're parsing the file anyhow)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ Re-up on this question thread (the function changed so it doesn't show up in the diff, but still applies) ❤️

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the scale of the change for signing, I personally think that would be premature optimization and can easily be refined in a subsequent PR without affecting the function signature.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LaurentGoderre can you expand on how adding more sanity checks would be a premature optimization? After the SBOM incident, it seems prudent to assert on as much as we know.

Copy link
Member Author

@LaurentGoderre LaurentGoderre Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, how about:

def image_digest($os; $arch):
    if length != 1 then
        error("unexpected image index document count: \(length)")
    else .[0] end
    | validate_oci_index
    .manifests
    | (
        map(
            select(.annotations["vnd.docker.reference.type"] == "attestation-manifest")
        )
        | if length != 1 then
            error("attestation manifest not found")
        else .[] end
        | .annotations["vnd.docker.reference.digest"]
        | if . == null then
            error("image reference for attestation manifest not found")
        else . end
    ) as $digest
    | map(
        select(.digest == $digest)
    )
    | if length == 0 then
        error("image digest not found in index")
    else .[] end
    | if .platform.architecture != $arch then
        error("architecture does not match expected architecture \($arch)")
    else . end
    | if .platform.os != $os then
        error("os does not match expected os \($os)")
    else . end
    | $digest
;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isn't the order also an assumption?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed anymore since we are doing it in bash

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the order is an assumption, which is why I'm also suggesting we validate that -- it's an assumption in our mental model of the output, if nothing else, and validating that the actual output still matches our mental model of the output is extremely valuable, especially if it's inexpensive. I'd rather have a failing validation (even that makes failing builds) with an easy fix than builds which quietly succeed but don't match our assumptions for how builds are structured because that typically leads to bigger problems further down the stack that are harder to rewind than fixing the failing validation (our recent SBOM issue, for example).

The rest of the system is built on our assumed metal model of the output of these builds, and if the shape (or order) of that output changes, it's valuable for us to have an explicit moment to stop and breathe while we determine whether it's OK to allow the new shape or whether it will cause problems elsewhere and needs a broader change to accommodate properly.

For example, if a new version of buildkit/buildx decides to start using referrers instead of manifests embedded in the index, that will probably exhibit in the OCI output in a way we aren't prepared to handle (at the very least, two entries in index.json), so it's really useful for us to stop that from propagating outwards until we've evaluated how it affects the rest of our system. So in short, having more validation like this helps increase our confidence in upgrading things like the buildkit version.

A good hyper-recent real example of the type of failure/validation I'm talking about is the conditional I chose to use in https://github.com/docker-library/busybox/blob/e41a2394306665fd94abf117076c9ae93a259742/Dockerfile-builder.template#L258 (docker-library/busybox#207) -- the problem only exhibits in 1.37.0 currently, so this could've instead simply compared against 1.37.0 explicitly, but that means that if it isn't fixed in 1.37.1 or 1.38.0, we will not notice until we've gotten all the way to builds failing across all architectures again, so instead it assumes that every version from here forward will have that bug, and if the fix is applied, we'll have a patch that fails to apply, a single (amd64 and thus GitHub CI) build failing, and an obvious place to update it with our new knowledge.

I'm honestly a little bit concerned by "we are doing it in bash" because the gist of my previous message above is "doing what?" (trying to get at the root of the problem you're trying to solve so we can determine whether what you had was fine or whether there's perhaps a better solution to it, and if it's a JSON-shaped problem, jq is likely the appropriate solution, but I need to understand the problem we're solving before I can accurately review the proposed solution to it -- it's very possible that the problem you're solving for has enough rough edges that the issues I see with this proposed solution are the lesser evil, but without having that discussion like I'm trying to do here, it's impossible for me to know for sure ❤️)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are looking for the digest of the manifest that has the layers. By doing it in bash we can skip the traversing of the hierarchy and find the right one directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bash solution works for oci import just as well as as standard builds.

@whalelines
Copy link
Collaborator

Does "doing it in bash" mean it will be part of a different PR?

@LaurentGoderre
Copy link
Member Author

LaurentGoderre commented Oct 2, 2024

@whalelines @tianon the bash method for finding the image digest is in the draft PR for the next step here:
https://github.com/docker-library/meta/pull/41/files#diff-5c3fa597431eda03ac3339ae6bf7f05e1a50d6fc7333679ec38e21b337cb6721R222-R233

Instead of trying to find the manifest that refers to the image, it looks for the OCI media type of the image and gets the digest via the path.

@yosifkit
Copy link
Member

yosifkit commented Oct 3, 2024

Just one minor bit of feedback is that the indentation should be consistently tabs instead of a mix across the file.

@LaurentGoderre
Copy link
Member Author

@yosifkit fixed

@tianon
Copy link
Member

tianon commented Oct 9, 2024

I had a few comments, but they were almost all minor whitespace suggestions, so I've just applied them in the force push above (https://github.com/docker-library/meta-scripts/compare/7e417acb3a6d5c4577e31b848f0e8471a3312193..a1186bdd958a77bb567a214aa777c5483671647b, https://github.com/docker-library/meta-scripts/compare/7e417acb3a6d5c4577e31b848f0e8471a3312193..a1186bdd958a77bb567a214aa777c5483671647b?w=1).

That also includes some adjustment to validate_oci_layout_index so that it outputs the original index again as-is like the other methods (which I think is a property it had before, and a property you've worked to keep in the other methods, so I did it there too even though it's not critical just for consistency).

@tianon
Copy link
Member

tianon commented Oct 9, 2024

It would be nice if we could think of an easy way to test these here without having too many new artifacts committed, but they're a little bit outside of what we can easily test with our existing framework/data, so I'm OK with leaving that out for now. 👍

@tianon
Copy link
Member

tianon commented Oct 9, 2024

(oh, I also fixed the "fort" typo that still persisted in the commit message 😄)

@tianon tianon merged commit 2084a63 into docker-library:main Oct 9, 2024
1 check passed
@tianon
Copy link
Member

tianon commented Oct 9, 2024

Your patience and persistence are appreciated. 👍

@tianon
Copy link
Member

tianon commented Oct 9, 2024

the bash method for finding the image digest is in the draft PR for the next step here: https://github.com/docker-library/meta/pull/41/files#diff-5c3fa597431eda03ac3339ae6bf7f05e1a50d6fc7333679ec38e21b337cb6721R222-R233

Instead of trying to find the manifest that refers to the image, it looks for the OCI media type of the image and gets the digest via the path.

I'm very concerned by this implementation (effectively scanning a directory and pseudo-parsing JSON via grep/regex), but I guess we'll cross that bridge in a later PR. 🙈

@LaurentGoderre LaurentGoderre deleted the oci-jq-fn branch October 9, 2024 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants