Add zstd:chunked compression to built images #1264
Comments
Idea 1: It should be possible to modify the image index generation to add this functionality (optionally?) if desired. I added functionality last week to expose all image manifests linked to Chains: https://github.com/konflux-ci/build-definitions/pull/1204/files. We will need to be sensitive to the Tekton results limitations: some users were hitting the limit for the buildah task when we referenced 3-4 base images in the BASE_IMAGE_DIGESTS. That isn't a direct comparison, because the quota for results was split across multiple task steps for buildah. Since this change has already been made to the image index generation, its behavior would be consistent if any other approach is taken instead of, or in addition to, the first idea.

Idea 2: This could potentially work. If Chains is able to generate provenance for recursive images as well as sign all image manifests, then we could even add these multiple images within a common Image Index, merging all Image Indexes in multi-arch situations. Exposing multiple digests from buildah tasks would be hard because we currently assume one artifact is produced per pipeline. We would likely need a solution for that assumption before this approach can be implemented.

Idea 3: This is likely more feasible than number 2 because we wouldn't be returning multiple references from a single task. The output from this task would need to be represented in the image index generation, so skipping the task would mean there are no results, and the image index generation would therefore be skipped since not all of its inputs are present. This may be possible to work around with some form of matrix build which runs for each desired compression output (see the sketch below). Then you would always have some results to be consumed by the image index task.
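To illustrate the matrix workaround: a minimal sketch, assuming a hypothetical recompress-image Task and assuming the build task exposes IMAGE_URL/IMAGE_DIGEST results. All names and values here are illustrative, not the real build-definitions ones.

```yaml
# Sketch of a matrixed pipeline task that runs once per desired compression
# format, so the image-index task always has results to consume.
# Task and parameter names are hypothetical.
- name: recompress
  taskRef:
    name: recompress-image          # hypothetical Task
  matrix:
    params:
      - name: COMPRESSION_FORMAT
        value:
          - gzip
          - zstd:chunked
  params:
    - name: SOURCE_IMAGE
      value: $(tasks.build-container.results.IMAGE_URL)@$(tasks.build-container.results.IMAGE_DIGEST)
```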
What would this look like in the registry? We want an Image Manifest for each platform and each layer compression type. For example:
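As a purely hypothetical illustration (repository and digests invented), a two-arch image would end up with four Image Manifests, one per platform and compression type:

```yaml
# Hypothetical illustration only; repository and digests are placeholders.
- platform: linux/amd64
  compression: gzip
  manifest: quay.io/example/app@sha256:aaaa...
- platform: linux/amd64
  compression: zstd:chunked
  manifest: quay.io/example/app@sha256:bbbb...
- platform: linux/arm64
  compression: gzip
  manifest: quay.io/example/app@sha256:cccc...
- platform: linux/arm64
  compression: zstd:chunked
  manifest: quay.io/example/app@sha256:dddd...
```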
Then we need to bring those all together with an Image Index (aka Manifest List). For that, I don't believe we can have multiple entries for the same platform. The registry might allow you to create it, but it would be ambiguous to clients which Image Manifest to pick. Needs testing.

The safest approach IMO is to provide an Image Index per compression type. In this case, we would have one Image Index that references the Image Manifests using gzip compression layers, and another Image Index for zstd:chunked compression. We may want to consider making this configurable for users: let them decide which compression they want to use, one or both.

One approach could be to modify the buildah Task to handle this. It may produce multiple Image Manifests depending on which compression formats are requested. The Task that builds the Image Index can operate in a similar fashion, where it could have a parameter for the gzip Image Manifests and another for the zstd:chunked ones (see the sketch below).

We need to think about how the Image Manifests and Image Indexes are tagged in the registry if both compression types are in use. Use a convention? Leave it up to the user? There's also the question of which image ref becomes the one that gets promoted to the application snapshot.
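A minimal sketch of what that parameter split could look like, with invented parameter and result names (not the real build-definitions interface):

```yaml
# Hypothetical parameter/result shape for an image-index Task.
params:
  - name: GZIP_MANIFESTS
    type: array      # digests of the gzip-layer Image Manifests
  - name: ZSTD_CHUNKED_MANIFESTS
    type: array      # digests of the zstd:chunked Image Manifests
    default: []      # empty => only a gzip Image Index is produced
results:
  - name: IMAGE_URL
  - name: IMAGE_DIGEST
```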
@rhatdan tells me that clients that do support zstd:chunked will prefer the zstd:chunked manifests if they can find them. The older pre-2023 docker client will select whatever manifest it finds first for its arch, so we need to ensure that the gzip compression option comes first in the list for each platform.
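For reference, a YAML-rendered excerpt of what one platform's Image Index entries might look like with the gzip entry listed first. Digests and sizes are placeholders, and the zstd annotation key shown is my understanding of what containers/image uses and should be verified:

```yaml
# Abbreviated, YAML-rendered view of an OCI Image Index (really JSON).
manifests:
  - mediaType: application/vnd.oci.image.manifest.v1+json
    digest: sha256:aaaa...        # linux/amd64, gzip layers, listed first
    size: 1234
    platform: {architecture: amd64, os: linux}
  - mediaType: application/vnd.oci.image.manifest.v1+json
    digest: sha256:bbbb...        # linux/amd64, zstd:chunked layers
    size: 1234
    platform: {architecture: amd64, os: linux}
    annotations:
      io.github.containers.compression.zstd: "true"   # verify exact key
```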
ack. I couldn't quite get that working myself. Let's operate under the assumption that a single Image Index is sufficient.

Idea 1
As long as the new digests appear somewhere in a type-hint result of a TaskRun or the PipelineRun, Chains should sign them and add them to the SLSA Provenance. To clarify, Chains only ever generates a single SLSA Provenance for a PipelineRun; it adds all of the image references to that one provenance (see the result sketch at the end of this comment).

Idea 2

This seems similar to Idea 1, but the conversion is done by the buildah Task itself rather than at Image Index push time. I think between the two, I like this one better because it's more isolated: the conversion happens when the image is built. (Also, I couldn't get the Idea 1 approach to work.)

I don't necessarily think that emitting multiple image references from a single buildah Task is a problem. AFAIK, Konflux only cares about the image reference on the PipelineRun result. As long as that is still a single ref, i.e. an Image Index, then it doesn't really matter how many image refs the TaskRuns emit.

NOTE: With any of these approaches there are likely some Pipeline changes required from users. We should try to make things as backwards-compatible as possible.

Idea 3

A dedicated Task for this seems like overkill. What if someone only wants images with one of the compression formats?
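Regarding the type-hint results mentioned under Idea 1: Chains treats an IMAGES result as a type hint and will sign every reference listed in it, so a buildah TaskRun could expose both compression variants roughly like this (repository and digests are placeholders):

```yaml
# Sketch only: a TaskRun result value that Chains picks up as a type hint.
results:
  - name: IMAGES
    value: quay.io/example/app@sha256:aaaa...,quay.io/example/app@sha256:bbbb...
```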
Idea 2

Ah, right. The single reference is for the pipeline result, not a task result. The previous issue that we encountered with generating an image index at build time was resolved by #1147 (i.e. changing it to only return the image manifest).
Making this change might only require adding the new image references to the image index generation.
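A rough sketch of that wiring, with hypothetical task, parameter, and result names (the real pipeline and task definitions will differ): the per-arch buildah tasks expose their extra digests, and the image-index task receives all of them.

```yaml
# Hypothetical Pipeline wiring; every name here is illustrative.
- name: build-image-index
  taskRef:
    name: build-image-index
  params:
    - name: IMAGES
      value:
        - $(tasks.build-container-amd64.results.IMAGES)
        - $(tasks.build-container-arm64.results.IMAGES)
```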
There's a long arc of changes in container tooling to support zstd:chunked compression, which offers better compression and better download capabilities in clients. However, old versions of docker (pre-2023) don't support it, which has held up adoption.

The request here is to add zstd:chunked compression in addition to the normal gzip compression for images when we push an OCI image index, so that new clients can take advantage of it and old clients don't break.

To illustrate, for an ordinary multi-arch image index you would have 4 entries, one for each arch. The idea here is to have 8 entries: one for each arch with gzip compression, and one for each arch with zstd:chunked compression.
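For context on how the two variants differ: the gzip manifest's layers use the +gzip OCI layer media type, while the zstd:chunked manifest's layers use +zstd (plus, as far as I understand, extra annotations carrying the chunked metadata). Roughly, with placeholder digests:

```yaml
# Abbreviated, YAML-rendered layer entries from the two Image Manifests.
gzip_manifest:
  layers:
    - mediaType: application/vnd.oci.image.layer.v1.tar+gzip
      digest: sha256:1111...
      size: 1234
zstd_chunked_manifest:
  layers:
    - mediaType: application/vnd.oci.image.layer.v1.tar+zstd
      digest: sha256:2222...
      size: 1234
```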
Idea number 1: I looked, and buildah manifest push has a --add-compression zstd:chunked option which (I think) will automatically add the extra image manifests on push.

However, I think we might have issues with image provenance. In a Konflux multi-arch build pipeline, we build each arch separately in a buildah task. Those produce image manifests, and their digests are recorded and provenance information is associated with them. We then produce the image index, record its digest, and record stuff there too.
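A minimal sketch of that approach, assuming a bash step and a hypothetical IMAGE_INDEX variable; the --add-compression flag is the one mentioned above:

```yaml
# Sketch only: how the index-push step might change. Image and variable
# names are placeholders, not the real build-image-index task.
steps:
  - name: push-image-index
    image: quay.io/example/buildah:latest     # placeholder builder image
    script: |
      #!/bin/bash
      set -euo pipefail
      # Push the index and ask buildah to also generate and push
      # zstd:chunked copies of each referenced image manifest.
      buildah manifest push --all \
        --add-compression zstd:chunked \
        "$IMAGE_INDEX" \
        "docker://$IMAGE_INDEX"
```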
When we validate an image at release time, we ensure that we can find provenance for both the image index as well as all of the referenced image manifests.
If we "magically" add an extra four manifests when we create the image index, then our policy check will fail unless their digests and pullspecs are somehow also exposed, so that chains can attest to their origin.
Idea number 2: Instead, perhaps our buildah tasks should be modified to first push the image with gzip compression, and then call buildah push again with a --compression-format zstd:chunked option to generate the second image manifest then, earlier in the process. Our buildah task would need to expose both digests... hm.
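A minimal sketch of that double push, assuming a bash step with placeholder image and variable names; --digestfile captures each pushed manifest's digest:

```yaml
# Sketch only: push the same image twice, once per compression format,
# and record both digests so they can be surfaced as task results.
steps:
  - name: push
    image: quay.io/example/buildah:latest     # placeholder builder image
    script: |
      #!/bin/bash
      set -euo pipefail
      # Today's behavior: push the gzip-compressed image manifest.
      buildah push --digestfile /tmp/digest-gzip \
        "$IMAGE" "docker://$IMAGE"
      # Push again, re-compressing the layers as zstd:chunked. The tag now
      # points at this manifest, but both manifests remain pullable by digest,
      # which is what the image index needs.
      buildah push --digestfile /tmp/digest-zstd-chunked \
        --compression-format zstd:chunked \
        "$IMAGE" "docker://$IMAGE"
      # Both digests would then be written to task results for Chains.
```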
Idea number 3: Perhaps we should add a new intermediary task that accepts the digest of the image-manifest-producing buildah task as a param, pulls it, and then re-pushes it with zstd:chunked compression. This new task would expose the pullspec and digest of the zstd:chunked manifest as a result, which would induce Chains to record provenance for it. The task that builds the image index would then take 8 inputs instead of 4. All 8 would be explicitly added to the image index.
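A minimal sketch of such an intermediary Task, with invented names throughout; IMAGE_URL/IMAGE_DIGEST are used as the results because Chains treats that pair as a type hint:

```yaml
# Sketch only: a hypothetical Task that re-pushes an existing manifest with
# zstd:chunked compression and reports the new pullspec and digest.
apiVersion: tekton.dev/v1
kind: Task
metadata:
  name: recompress-zstd-chunked          # hypothetical name
spec:
  params:
    - name: SOURCE_IMAGE                 # pullspec@digest of the gzip manifest
  results:
    - name: IMAGE_URL                    # pullspec of the zstd:chunked manifest
    - name: IMAGE_DIGEST                 # digest of the zstd:chunked manifest
  steps:
    - name: recompress
      image: quay.io/example/buildah:latest   # placeholder builder image
      script: |
        #!/bin/bash
        set -euo pipefail
        DEST="quay.io/example/app:zstd-chunked"   # placeholder target ref
        buildah pull "$(params.SOURCE_IMAGE)"
        buildah push --digestfile /tmp/digest \
          --compression-format zstd:chunked \
          "$(params.SOURCE_IMAGE)" "docker://${DEST}"
        echo -n "${DEST}" > "$(results.IMAGE_URL.path)"
        echo -n "$(cat /tmp/digest)" > "$(results.IMAGE_DIGEST.path)"
```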