Add option to write Kubernetes resource YAMLs to disk #79

Open
camertron opened this issue Dec 19, 2021 · 3 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@camertron (Member)

Whether you're using GitOps, want to version your k8s resources in source control, or just want to save them to a directory, Kuby should support emitting them via the CLI. Perhaps something like kuby resources -o /path/to/output_dir? There are a few strategies we could support as well:

  1. Emit a single large YAML file containing all the resources.
  2. Emit one resource per file. Filenames would need to include the namespace and name of the resource, e.g. foo-production.foo-deployment.yml or some such (see the sketch after this list).
  3. Emit a directory per namespace.
  4. Emit a directory per resource type, e.g. all deployments in their own folder, etc.
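
Roughly what strategy 2 could look like, assuming the output of kuby resources arrives as a single multi-document YAML string. The helper name and file layout below are just for illustration, not an existing Kuby API:

```ruby
require 'yaml'
require 'fileutils'

# Hypothetical sketch: split a multi-document YAML string (e.g. the output of
# `kuby resources`) into one file per resource, named <namespace>.<kind>-<name>.yml.
def write_resources(yaml_text, output_dir)
  FileUtils.mkdir_p(output_dir)

  YAML.load_stream(yaml_text).compact.each do |doc|
    meta = doc.fetch('metadata', {})
    namespace = meta['namespace'] || 'cluster'
    filename = "#{namespace}.#{doc['kind'].downcase}-#{meta['name']}.yml"
    File.write(File.join(output_dir, filename), YAML.dump(doc))
  end
end

# Illustrative usage, feeding in whatever `kuby resources` prints:
# write_resources(`kuby resources`, 'manifests')
```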
camertron added the enhancement and good first issue labels on Dec 19, 2021
@kingdonb (Contributor)

I made some progress on this use case for myself, but I haven't been motivated to upstream anything yet...

https://github.com/kingdonb/kuby_test

https://github.com/kingdonb/kuby_test/blob/42e0871fb886a13637232d060fc2efa70938512f/builder.sh#L7

I settled on this builder.sh script for now. It first emits the Dockerfiles to disk, then runs a somewhat more involved Ruby script, since I needed to do some YAML parsing and manipulation of the output from kuby resources.

https://github.com/kingdonb/kuby_test/blob/42e0871fb886a13637232d060fc2efa70938512f/builder.rb#L6

First of all, I wanted to strip out any secrets and throw them away. Since I'll likely be using a CI process to automatically commit the updated files back to git when they change, I don't want any unencrypted secrets committed to git.
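
The filtering itself is simple; roughly this (a sketch of the idea, not my actual builder.rb):

```ruby
require 'yaml'

# Rough sketch: drop Secret documents from the rendered output before
# anything gets committed to git.
docs = YAML.load_stream(STDIN.read).compact
kept = docs.reject { |doc| doc['kind'] == 'Secret' }

print kept.map { |doc| YAML.dump(doc) }.join

# Illustrative usage:  kuby resources | ruby strip_secrets.rb > rendered.yml
```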

(Later, on a separate project, I decided to save the secrets, but in a different file excluded from git, so they can be handled by an admin as needed. Another option would be to exclude the secret from the git commit until it has been encrypted, then verify the encryption and move it to a secure location. Because the CI process would have to decrypt the secret in order to know whether a newly generated copy has changed, I did not pursue this for now...)

I also wasn't able to figure out how to properly configure kuby.rb so that my environment did not emit any cluster resources like ClusterIssuer. I'm sure it would have been easier to remove them at the source by excluding the plugin (?), but I couldn't figure that out as quickly as I could do this: https://github.com/kingdonb/scrob-web/blob/6b8da3accfdb78a8e26a3ce97a3103860c602d3d/builder.rb#L37-L44 (since I was already in there parsing YAML and excluding secrets, it was easier to just do the same thing again for ClusterIssuers).

In GitOps, tenants frequently do not have cluster-admin outside of a namespace scope, and they generally cannot write any cluster-wide resources. I haven't looked deeply into kuby-core or kube-dsl to know if it's straightforward to tell cluster-wide resources apart from others, but we might want to have a --tenantize option or something similar so that cluster resources are excluded and assumed to be taken care of out-of-band by an administrator, or in another tenant that's not so restricted.

@camertron (Member, Author)

I made some progress on this use case for myself

That script of yours is really neat :) It's a good example of how to separate resources; others might find it useful.

First of all, I wanted to strip out any secrets and throw them away. Since I'll likely be using a CI process to automatically commit the updated files back to git when they change, I don't want any unencrypted secrets committed to git.

One thing that will be challenging when implementing this feature is knowing what use cases people have, e.g. removing secrets. I don't personally have a need to write resources to disk, so it will be important to find out what people actually need.

On the topic of secrets, how does GitOps encrypt them (or does it)? Are there other industry standards out there? This is an area I know nothing about.

Because the CI process would have to decrypt the secret in order to know whether a newly generated copy has changed, I did not pursue this for now

Hmm, interesting. What if we hashed the contents and exposed the digest as an annotation on the Secret resource? I'm also wondering if some form of PKI could be used here to sign secrets so they can be verified without being decrypted.
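
Something like this is what I'm picturing, purely as a sketch (the annotation key is made up):

```ruby
require 'digest'
require 'json'

# Illustrative only: hash a Secret's data and expose the digest as an annotation
# so CI can tell whether the secret changed without decrypting anything.
# (A bare hash of a low-entropy value can be brute-forced, so an HMAC with a
# separate key may be safer.)
def annotate_digest(secret)
  data = secret.fetch('data', {}).sort.to_h
  digest = Digest::SHA256.hexdigest(JSON.generate(data))

  secret['metadata'] ||= {}
  secret['metadata']['annotations'] ||= {}
  secret['metadata']['annotations']['example.kuby.io/data-digest'] = digest
  secret
end
```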

I also wasn't able to figure out how to properly configure kuby.rb so that my environment did not emit any cluster resources like ClusterIssuer.

Ah, there isn't a way to do that at the moment. The deployer code in kuby-core treats resources without a namespace as cluster resources; perhaps your script could adopt a similar approach?
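
Roughly what that heuristic looks like applied to rendered docs (a sketch, not the actual deployer code):

```ruby
require 'yaml'

# Sketch only: treat anything without metadata.namespace as cluster-scoped.
docs = YAML.load_stream(STDIN.read).compact
namespaced, cluster_scoped = docs.partition { |doc| doc.dig('metadata', 'namespace') }

# A --tenantize-style mode would emit only the namespaced docs:
print namespaced.map { |doc| YAML.dump(doc) }.join
```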

I haven't looked deeply into kuby-core or kube-dsl to know if it's straightforward to tell cluster-wide resources apart from others, but we might want to have a --tenantize option or something similar

Hmm, I could see including something like that, or perhaps providing a --namespace flag that only emits resources in the given namespace.

@kingdonb (Contributor) commented Feb 1, 2022

how does GitOps encrypt them (or does it)?

The standard for Flux is SOPS, though you can use any solution (like the sealed-secrets controller). Different solutions behave differently, but the general pattern is that either (1) a private key kept on the cluster, or (2) a KMS key granted for use on the cluster, is used to encrypt the data fields in a Secret (or the entire Secret, including metadata) before it is stored in git.

The encrypted file is stored in the repo and decrypted on demand. Different solutions approach the question of "where should decryption be allowed" differently. For example, sealed-secrets can only be decrypted into the namespace they were originally encrypted for, unless you disable that behavior in the controller. SOPS recognizes that anyone with the key can decrypt the data, so it doesn't offer this feature, I guess because it's an artificial limitation that only protects you as long as your keys are strictly access-controlled: if the key is compromised, it's game over, and namespaces won't protect you.

But also, SOPS is currently unmaintained. By comparison, sealed-secrets has had some releases, but it has also gone long stretches without releases in recent history, so I still find it hard to recommend it over SOPS (which I personally like better).

So "what is the industry standard" is a tough question to answer definitively, because of the support situation of tools like this and other issues. I will say SOPS is the standard in spite of the issues for now... others are welcome to disagree.

What if we hashed the contents and exposed the digest as an annotation on the Secret resource?

That sounds like scaffolding I would expect SOPS to provide; maybe it's already been done and I'm just unaware.

I'm very leery of providing guidance around security tooling because I am not a security professional. In the CNCF project I work on, we've employed auditors to help us confirm our security posture, and going through that experience has made me more aware of how much I don't know. Anyway, the point being: just because I don't see how a one-way hash of the data could be used to compromise the integrity of the encryption doesn't mean it can't be. I wouldn't be the one to suggest it.

You can also solve the problem procedurally: if you only rotate secrets through an intentional process, you know when they have changed because you're changing them, and you don't need to rely on a diff to tell you. I put secrets into a separate directory, separate git branch, or a separate repo entirely so they are isolated, not only for security reasons but also to separate signal from noise. That way keys can be rotated every hour if you like, and it won't show up as noise in the repo.

Then there are also solutions like Vault CSI and external-secrets operator which keep secrets outside of the cluster. I haven't used any of those, but it's possible they are even more popular than the solutions which I have used.

I am of the opinion that secrets should be rotated frequently, and as a pragmatist I understand that means it must be done automatically, so I do want to have this conversation. But that's about as deep as my strongly held opinions go for now, other than to say that I am still one of those people who think secrets should be handled separately and with white gloves.

treats resources without a namespace as cluster resources

I was thinking, since we may have access to the CRD, we can read the spec to find out:

https://github.com/jetstack/cert-manager/blob/b5fbabdc6f7ea7302b01b6bee22b3659eecc2a75/deploy/crds/crd-clusterissuers.yaml#L21
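
Reading the scope from a CRD manifest on disk could be as simple as the following (the path is just a placeholder):

```ruby
require 'yaml'

# Sketch: a CRD declares its scope under spec.scope, so if the manifest is
# available on disk we can check it directly.
crd = YAML.safe_load(File.read('vendor/cert-manager/crd-clusterissuers.yaml'))
cluster_scoped = crd.dig('spec', 'scope') == 'Cluster'
puts "#{crd.dig('spec', 'names', 'kind')}: cluster-scoped=#{cluster_scoped}"
```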

In the GitOps model though, there is no guarantee we (the CI process) will have access to the cluster at build time, or permission to read CRDs.

But in Kuby (where cert-manager is installed through a plugin) we do at least have those manifests on disk, I think, so we can read them even if they might not always match what is on the cluster in a scenario like the one I have imagined for tenants.

Maybe what's needed is general support for runtime "middlewares" or postprocessors that run on the output of Kuby, much as my script is doing, but as a supported part of the build pipeline; compare Helm's post-renderer, which allows Helm users (and Flux users) to run Kustomize patches or other postrenderers on the output. Of course I'm not thinking of anything quite that formal; simply calling a block with a reference to the rendered docs, parsed from YAML as I have done, would fill the need. (Or users could just as well do this outside of Kuby, as I have done.)
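
Something like this is all I mean; the hook name is completely made up and nothing like it exists in Kuby today:

```ruby
# Completely hypothetical hook, just to make the shape concrete. The idea is
# that Kuby would yield the parsed, rendered docs to a user-supplied block
# before writing or applying them.
Kuby.define('my-app') do
  environment(:production) do
    # ... docker/kubernetes config as usual ...

    # Hypothetical hook name:
    postprocess_resources do |docs|
      docs.reject { |doc| doc['kind'] == 'ClusterIssuer' }
    end
  end
end
```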

Thanks for ideating with me.
