Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFE: image cache resource #27

Open
dtantsur opened this issue Apr 2, 2024 · 6 comments
Open

RFE: image cache resource #27

dtantsur opened this issue Apr 2, 2024 · 6 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Indicates an issue is ready to be actively worked on.

Comments

@dtantsur
Copy link
Member

dtantsur commented Apr 2, 2024

Currently, the operator runs IPA downloader as an init container. This has a few problems:

  • It's not trivial to opt out of using it (OpenShift, for instance, does not use IPA downloader)
  • Downloading IPA takes time which delays starting Ironic (although IPA is not actually required on start-up)
  • It's not possible to cache IPA for longer time or reuse it between separate Ironic deployments.
  • Agent download parameters complicate the already large Ironic CRD
  • In the future, we may want to provide an ability to also cache or even build instance images

Let us have a new CRD IronicImageCache that will download IPA in an init container and serve it via httpd.

type IronicImageCacheSpec struct {
	// AgentBranch is the branch of IPA to download. The main branch is used by default.
	// +optional
	AgentBranch string `json:"agentBranch,omitempty"`

	// AgentDownloadURL is the base URL from which IPA should be downloaded.
	// The default value should be good for most users.
	// +optional
	AgentDownloadURL string `json:"agentDownloadURL,omitempty"`

	// AgentDownloaderImage is the image to be used at pod initialization to download the IPA ramdisk.
	// +kubebuilder:default=quay.io/metal3-io/ironic-ipa-downloader
	// +optional
	AgentDownloaderImage string `json:"agentDownloaderImage,omitempty"`

        // AgentMinimumModificationTime is the time after which the IPA image must be produced.
        // If not provided, a hardcoded version-specific default is used.
	// +optional
        AgentMinimumModificationTime *metav1.Time `json:"agentMinimumModificationTime,omitempty"`
}

In the initial version, upgrades will rely on AgentMinimumModificationTime. Every release of the operator, we will update both the Ironic image version and this date. This version will be passed down to IPA downloader as an environment variable, causing it to re-run on changes. Users may update this value to force an upgrade.

@metal3-io-bot metal3-io-bot added the needs-triage Indicates an issue lacks a `triage/foo` label and requires one. label Apr 2, 2024
@Rozzii
Copy link
Member

Rozzii commented Apr 10, 2024

I would like to go into a different direction, I understand the need for caching and that the current ipa-downloader init container approach is cumbersome.

But I think the caching can be done just fine in the K8s's internal OCI registry or in any OCI registry that is used anyhow for by the K8s instance. I would just make the IPA downloader able to handle OCI artifacts.

When it comes to building an IPA image on the fly, that IMO should be a different topic then the caching, IPA is usually built by OpenstackDIB that is using chroot based process to build the image and I am not sure how reliably can we do it in a containerized environment. But in any case I would really like to separate the caching/storing IPA and building IPA topics.

@dtantsur
Copy link
Member Author

But I think the caching can be done just fine in the K8s's internal OCI registry or in any OCI registry that is used anyhow for by the K8s instance. I would just make the IPA downloader able to handle OCI artifacts.

It feels like an effort that is significantly harder than is here. You don't just need to reach IPA downloader to handle OCI images and registries, you also need to teach Ironic how to fetch an IPA image from such a registry.

When it comes to building an IPA image on the fly, that IMO should be a different topic then the caching

Yeah, I'm not suggesting it here.

@Rozzii
Copy link
Member

Rozzii commented Apr 24, 2024

/triage accepted
We need a design proposal!!!!
I still think OCI registry based caching is the way.

@metal3-io-bot metal3-io-bot added triage/accepted Indicates an issue is ready to be actively worked on. and removed needs-triage Indicates an issue lacks a `triage/foo` label and requires one. labels Apr 24, 2024
@Rozzii Rozzii moved this from Backlog to MISC on hold / blocked in Metal3 - Roadmap Jun 28, 2024
@metal3-io-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@metal3-io-bot metal3-io-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 23, 2024
@Rozzii
Copy link
Member

Rozzii commented Jul 24, 2024

/remove-lifecycle stale

@metal3-io-bot metal3-io-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 24, 2024
@dtantsur
Copy link
Member Author

/lifecycle frozen

@metal3-io-bot metal3-io-bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Indicates an issue is ready to be actively worked on.
Projects
Status: MISC on hold / blocked
Development

No branches or pull requests

3 participants