Add OCI functions for signing #79

LaurentGoderre · 2024-09-12T18:46:54Z

This creates jq function for validating different OCI content into modular pieces that can be reused. This makes the build YAML file more concise and easy to read.

whalelines

A question about the naming

oci.jq

whalelines

LGTM

tianon

A few comments/notes - I've tried to leave them in per-function batches so they're hopefully easier to digest / discuss. I'm certain I've missed some of the detail on the justifications, so please do ask clarifying questions anywhere I've been unclear! 😅 ❤️

oci.jq

tianon · 2024-09-16T22:10:57Z

oci.jq

+def image_digest($os; $arch):
+  if length != 1 then
+    error("unexpected image index document count: " + (length | tostring))
+  else .[0] end
+  | validate_index_type
+  | .manifests[]
+  | select(.platform.os == $os and .platform.architecture == $arch)
+  | .digest
+;


What's the use case for this image_digest function? It seems like it's dual purpose "validate the underlying image index" and "extract the (platform specific) image manifest digest" ?

I think in the case of buildkit/buildx output, we can be both more specific and more prescriptive here. 🤔

ie, I'm thinking we can validate that the index itself contains at least 1 but at most 2 entries, that the second one specifically is our expected provenance object (via annotations / platform), and then we can safely assume the first must be the image (perhaps with some basic media type / size validation). 🤔

(those are all assumptions we have, but that aren't currently validated anywhere and should be OK if they're broken, but if we're assuming them anyhow, it's valuable to validate them, especially if we're parsing the file anyhow)

^^ Re-up on this question thread (the function changed so it doesn't show up in the diff, but still applies) ❤️

Given the scale of the change for signing, I personally think that would be premature optimization and can easily be refined in a subsequent PR without affecting the function signature.

@LaurentGoderre can you expand on how adding more sanity checks would be a premature optimization? After the SBOM incident, it seems prudent to assert on as much as we know.

ok, how about:

def image_digest($os; $arch): if length != 1 then error("unexpected image index document count: \(length)") else .[0] end | validate_oci_index .manifests | ( map( select(.annotations["vnd.docker.reference.type"] == "attestation-manifest") ) | if length != 1 then error("attestation manifest not found") else .[] end | .annotations["vnd.docker.reference.digest"] | if . == null then error("image reference for attestation manifest not found") else . end ) as $digest | map( select(.digest == $digest) ) | if length == 0 then error("image digest not found in index") else .[] end | if .platform.architecture != $arch then error("architecture does not match expected architecture \($arch)") else . end | if .platform.os != $os then error("os does not match expected os \($os)") else . end | $digest ;

But isn't the order also an assumption?

Not needed anymore since we are doing it in bash

Yes, the order is an assumption, which is why I'm also suggesting we validate that -- it's an assumption in our mental model of the output, if nothing else, and validating that the actual output still matches our mental model of the output is extremely valuable, especially if it's inexpensive. I'd rather have a failing validation (even that makes failing builds) with an easy fix than builds which quietly succeed but don't match our assumptions for how builds are structured because that typically leads to bigger problems further down the stack that are harder to rewind than fixing the failing validation (our recent SBOM issue, for example).

The rest of the system is built on our assumed metal model of the output of these builds, and if the shape (or order) of that output changes, it's valuable for us to have an explicit moment to stop and breathe while we determine whether it's OK to allow the new shape or whether it will cause problems elsewhere and needs a broader change to accommodate properly.

For example, if a new version of buildkit/buildx decides to start using referrers instead of manifests embedded in the index, that will probably exhibit in the OCI output in a way we aren't prepared to handle (at the very least, two entries in index.json), so it's really useful for us to stop that from propagating outwards until we've evaluated how it affects the rest of our system. So in short, having more validation like this helps increase our confidence in upgrading things like the buildkit version.

A good hyper-recent real example of the type of failure/validation I'm talking about is the conditional I chose to use in https://github.com/docker-library/busybox/blob/e41a2394306665fd94abf117076c9ae93a259742/Dockerfile-builder.template#L258 (docker-library/busybox#207) -- the problem only exhibits in 1.37.0 currently, so this could've instead simply compared against 1.37.0 explicitly, but that means that if it isn't fixed in 1.37.1 or 1.38.0, we will not notice until we've gotten all the way to builds failing across all architectures again, so instead it assumes that every version from here forward will have that bug, and if the fix is applied, we'll have a patch that fails to apply, a single (amd64 and thus GitHub CI) build failing, and an obvious place to update it with our new knowledge.

I'm honestly a little bit concerned by "we are doing it in bash" because the gist of my previous message above is "doing what?" (trying to get at the root of the problem you're trying to solve so we can determine whether what you had was fine or whether there's perhaps a better solution to it, and if it's a JSON-shaped problem, jq is likely the appropriate solution, but I need to understand the problem we're solving before I can accurately review the proposed solution to it -- it's very possible that the problem you're solving for has enough rough edges that the issues I see with this proposed solution are the lesser evil, but without having that discussion like I'm trying to do here, it's impossible for me to know for sure ❤️)

we are looking for the digest of the manifest that has the layers. By doing it in bash we can skip the traversing of the hierarchy and find the right one directly.

The bash solution works for oci import just as well as as standard builds.

whalelines · 2024-10-02T21:43:21Z

Does "doing it in bash" mean it will be part of a different PR?

LaurentGoderre · 2024-10-02T22:16:29Z

@whalelines @tianon the bash method for finding the image digest is in the draft PR for the next step here:
https://github.com/docker-library/meta/pull/41/files#diff-5c3fa597431eda03ac3339ae6bf7f05e1a50d6fc7333679ec38e21b337cb6721R222-R233

Instead of trying to find the manifest that refers to the image, it looks for the OCI media type of the image and gets the digest via the path.

yosifkit · 2024-10-03T23:02:28Z

Just one minor bit of feedback is that the indentation should be consistently tabs instead of a mix across the file.

LaurentGoderre · 2024-10-04T16:44:07Z

@yosifkit fixed

tianon · 2024-10-09T17:12:35Z

I had a few comments, but they were almost all minor whitespace suggestions, so I've just applied them in the force push above (https://github.com/docker-library/meta-scripts/compare/7e417acb3a6d5c4577e31b848f0e8471a3312193..a1186bdd958a77bb567a214aa777c5483671647b, https://github.com/docker-library/meta-scripts/compare/7e417acb3a6d5c4577e31b848f0e8471a3312193..a1186bdd958a77bb567a214aa777c5483671647b?w=1).

That also includes some adjustment to validate_oci_layout_index so that it outputs the original index again as-is like the other methods (which I think is a property it had before, and a property you've worked to keep in the other methods, so I did it there too even though it's not critical just for consistency).

tianon · 2024-10-09T17:14:09Z

It would be nice if we could think of an easy way to test these here without having too many new artifacts committed, but they're a little bit outside of what we can easily test with our existing framework/data, so I'm OK with leaving that out for now. 👍

tianon · 2024-10-09T17:14:38Z

(oh, I also fixed the "fort" typo that still persisted in the commit message 😄)

tianon · 2024-10-09T17:14:57Z

Your patience and persistence are appreciated. 👍

tianon · 2024-10-09T17:15:48Z

the bash method for finding the image digest is in the draft PR for the next step here: https://github.com/docker-library/meta/pull/41/files#diff-5c3fa597431eda03ac3339ae6bf7f05e1a50d6fc7333679ec38e21b337cb6721R222-R233

Instead of trying to find the manifest that refers to the image, it looks for the OCI media type of the image and gets the digest via the path.

I'm very concerned by this implementation (effectively scanning a directory and pseudo-parsing JSON via grep/regex), but I guess we'll cross that bridge in a later PR. 🙈

LaurentGoderre requested review from tianon and yosifkit as code owners September 12, 2024 18:46

whalelines changed the title ~~Add OCI functions fort signing~~ Add OCI functions for signing Sep 12, 2024

LaurentGoderre force-pushed the oci-jq-fn branch from e0e9e8a to 1e53a68 Compare September 12, 2024 19:04

whalelines reviewed Sep 13, 2024

View reviewed changes

oci.jq Outdated Show resolved Hide resolved

LaurentGoderre force-pushed the oci-jq-fn branch from 1e53a68 to 9813283 Compare September 13, 2024 15:41

LaurentGoderre commented Sep 13, 2024

View reviewed changes

oci.jq Show resolved Hide resolved

whalelines previously approved these changes Sep 16, 2024

View reviewed changes

tianon reviewed Sep 16, 2024

View reviewed changes

LaurentGoderre dismissed whalelines’s stale review via b2d753c September 17, 2024 15:06

LaurentGoderre force-pushed the oci-jq-fn branch from 9813283 to b2d753c Compare September 17, 2024 15:06

LaurentGoderre mentioned this pull request Sep 23, 2024

Isolated sign docker-library/meta#41

Draft

LaurentGoderre force-pushed the oci-jq-fn branch 2 times, most recently from b804206 to a723aed Compare October 1, 2024 18:13

LaurentGoderre force-pushed the oci-jq-fn branch from a723aed to 7e417ac Compare October 4, 2024 16:43

Add OCI functions for signing validation

a1186bd

tianon force-pushed the oci-jq-fn branch from 7e417ac to a1186bd Compare October 9, 2024 17:11

tianon approved these changes Oct 9, 2024

View reviewed changes

tianon merged commit 2084a63 into docker-library:main Oct 9, 2024
1 check passed

LaurentGoderre deleted the oci-jq-fn branch October 9, 2024 17:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add OCI functions for signing #79

Add OCI functions for signing #79

LaurentGoderre commented Sep 12, 2024 •

edited

Loading

whalelines left a comment

whalelines left a comment

tianon left a comment

tianon Sep 16, 2024

tianon Sep 20, 2024

LaurentGoderre Sep 23, 2024

whalelines Sep 24, 2024

LaurentGoderre Sep 25, 2024 •

edited

Loading

LaurentGoderre Oct 1, 2024

LaurentGoderre Oct 1, 2024

tianon Oct 1, 2024

LaurentGoderre Oct 2, 2024

LaurentGoderre Oct 2, 2024

whalelines commented Oct 2, 2024

LaurentGoderre commented Oct 2, 2024 •

edited

Loading

yosifkit commented Oct 3, 2024

LaurentGoderre commented Oct 4, 2024

tianon commented Oct 9, 2024

tianon commented Oct 9, 2024

tianon commented Oct 9, 2024

tianon commented Oct 9, 2024

tianon commented Oct 9, 2024

Add OCI functions for signing #79

Add OCI functions for signing #79

Conversation

LaurentGoderre commented Sep 12, 2024 • edited Loading

whalelines left a comment

Choose a reason for hiding this comment

whalelines left a comment

Choose a reason for hiding this comment

tianon left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

LaurentGoderre Sep 25, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

whalelines commented Oct 2, 2024

LaurentGoderre commented Oct 2, 2024 • edited Loading

yosifkit commented Oct 3, 2024

LaurentGoderre commented Oct 4, 2024

tianon commented Oct 9, 2024

tianon commented Oct 9, 2024

tianon commented Oct 9, 2024

tianon commented Oct 9, 2024

tianon commented Oct 9, 2024

LaurentGoderre commented Sep 12, 2024 •

edited

Loading

LaurentGoderre Sep 25, 2024 •

edited

Loading

LaurentGoderre commented Oct 2, 2024 •

edited

Loading