Kubearmor fetches wrong pod name/namespace to apply whitelist policy on long running GKE clusters (BPF-LSM enforcement) #1780
Comments
Hey @gusfcarvalho, I want to try reproducing this issue. Can you tell me about the scale of the cluster, such as the number of nodes and the number of pods per node, so that we can replicate this scenario? It is not reproducible in our normal test clusters. If at all possible, could you redact sensitive information and share the logs from the kubearmor pods? Thanks!
We see this issue on a cluster with 8-9 nodes, running about 60 pods across 10 namespaces. We also see it on bigger clusters. The main pain point is that it works on a 'fresh' pod; it takes a few days or so for the issue to kick in (so I would not expect it to show up in any e2e tests).
Sure! This issue recurs so often that the logs are actually easy to fetch. Do you need logs from ...? From what I can see, the pod logs only contain several of:
The only information that I can see performing incorrectly is with ...
This issue still persists. Any updates? Feel free to DM me on Kubernetes Slack at @gusfcarvalho.
Hello! 😄 Any updates on this?
I'm having a similar issue. I have a policy with a selector matching specific labels within a specific namespace. Two days later, I began to see relay alert logs pointing to policy violations in a different namespace, and on containers with no labels matching the selector. I'm running kubearmor ...
I deleted all the pods in the KubeArmor namespace, and some hours later, it was already misapplying policies. A very simple policy:
After recreating the pods, all bpf-containerd pods log this:
However, a few hours later, I get these logs:
I have no other policy named ...
It looks like the bug is right here: KubeArmor/KubeArmor/core/kubeUpdate.go, line 1016 at commit 7e7b1c3.
It adds the policy to the returned list if this check returns true, but that function doesn't check whether the passed policy is a cluster policy, and when `matchExpressions` is empty it matches anyway. Then again, I may be missing something.
Actually, from what I can understand from ... I'm trying to test that change, but I'm having issues with the build using the ...
I got past the build issues, created an image with the proposed fix, and deployed it internally. I'll report back if that solves my scenario.
Regular policies are mistakenly modified and applied at cluster level over time.

The `if` condition in `GetSecurityPolicies(..)` returns true if `matchClusterSecurityPolicyRule(..)` evaluates to `true`. That function doesn't check whether the passed policy is a cluster policy, and when `matchExpressions` is empty it ends up adding one namespace (whichever comes back first in the k8s client response and hasn't been added yet) to the `NamespaceList` of every existing policy (cluster or not); it then returns `true`, and the policy is added to the `GetSecurityPolicies(..)` response. Over time, as `matchClusterSecurityPolicyRule(..)` is called, the `NamespaceList` of each policy keeps growing, one namespace at a time, which explains why it can take a while for this behavior to show up. Cluster policies are already built with `NamespaceList` properly initialized, so the function `matchClusterSecurityPolicyRule(..)` is simply removed, letting the `|| kl.ContainsElement(..)` already present in the `if` condition handle cluster policy matching.

Fixes: kubearmor#1780
Signed-off-by: Carlos Rodriguez-Fernandez <[email protected]>
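To make the failure mode easier to follow, here is a minimal, self-contained Go sketch of the behavior described in that commit message. It is not the actual KubeArmor source: the type, field, and function names are simplified assumptions, and the real logic lives in `KubeArmor/core/kubeUpdate.go`.

```go
// Simplified sketch of the bug described above; NOT the real KubeArmor code.
// The real helper walks namespaces returned by the k8s client; this sketch
// just uses the pod's namespace, which produces the same growth over time.
package main

import "fmt"

type MatchExpression struct {
	Key    string
	Values []string
}

type SecurityPolicy struct {
	Name             string
	Namespace        string            // namespace a regular policy was created in
	NamespaceList    []string          // only meant to be populated for cluster policies
	MatchExpressions []MatchExpression // cluster policy namespace selectors
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// matchClusterSecurityPolicyRule mimics the buggy helper: with empty
// MatchExpressions it appends a namespace to NamespaceList and reports a
// match, even for policies that are not cluster policies.
func matchClusterSecurityPolicyRule(policy *SecurityPolicy, podNamespace string) bool {
	if len(policy.MatchExpressions) == 0 {
		if !contains(policy.NamespaceList, podNamespace) {
			policy.NamespaceList = append(policy.NamespaceList, podNamespace)
		}
		return true
	}
	return false
}

// getSecurityPolicies mimics only the cluster-matching part of the selection
// loop (the usual namespace/label matching for regular policies is omitted):
// a policy is returned for a pod if the cluster-rule helper matches or the
// pod's namespace is already in the policy's NamespaceList.
func getSecurityPolicies(policies []*SecurityPolicy, podNamespace string) []*SecurityPolicy {
	matched := []*SecurityPolicy{}
	for _, p := range policies {
		if matchClusterSecurityPolicyRule(p, podNamespace) || contains(p.NamespaceList, podNamespace) {
			matched = append(matched, p)
		}
	}
	return matched
}

func main() {
	// A regular (namespaced) policy intended only for protected-namespace.
	regular := &SecurityPolicy{Name: "whitelist-binary", Namespace: "protected-namespace"}

	// As pods from other namespaces are evaluated over time...
	for _, ns := range []string{"protected-namespace", "unprotected-namespace", "kube-system"} {
		got := getSecurityPolicies([]*SecurityPolicy{regular}, ns)
		fmt.Printf("pod in %q matched %d policies, NamespaceList=%v\n", ns, len(got), regular.NamespaceList)
	}
}
```

Running the sketch shows the namespaced policy matching pods in every namespace while its `NamespaceList` grows one entry at a time, which lines up with the "works on a fresh deployment, breaks days later" symptom reported above.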
Hey @carlosrodfern, thanks for the detailed explanation here. You are correct, there is a bug in this check. However, we would not recommend removing the check, but rather adding a condition in ... BTW, on a side note, I don't think this was the bug causing the problem for @gusfcarvalho, because he is using ...
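For illustration only, here is a hypothetical sketch of guarding the check instead of removing it, reusing the simplified types from the sketch above. The `isClusterPolicy` predicate is purely an assumption; the thread does not say what the actual condition should key off.

```go
// Hypothetical guard, NOT the actual KubeArmor fix: only let the cluster-rule
// helper run for cluster policies, so a regular namespaced policy can no
// longer grow its NamespaceList. isClusterPolicy is an assumed predicate.
func getSecurityPoliciesGuarded(policies []*SecurityPolicy, podNamespace string,
	isClusterPolicy func(*SecurityPolicy) bool) []*SecurityPolicy {
	matched := []*SecurityPolicy{}
	for _, p := range policies {
		if (isClusterPolicy(p) && matchClusterSecurityPolicyRule(p, podNamespace)) ||
			contains(p.NamespaceList, podNamespace) {
			matched = append(matched, p)
		}
	}
	return matched
}
```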
Thank you @Prateeknandle for looking into this. I'll be creating a separate issue and correcting the PR. |
Bug Report
General Information
Issue
This is a consistent issue when running kubearmor on any long-lived cluster. We have a set of policies in `protected-namespace` where we whitelist only a few pods based on labels:

Config
Symptom
A pod from `unprotected-namespace` cannot run due to not being able to run `correct-binary`: permission denied.
Extra information
With this configuration, after some time with kubearmor running on the cluster, kubearmor starts to deny applications in `unprotected-namespaces` as well, even though there are no `KubearmorPolicy` objects in those namespaces. From karmor logs, I could see:
This is recurring on any cluster where we have the kubearmor operator running. As a workaround, if we `kubectl delete pods --all -n <kubearmor-namespace>`, the system goes back to running as expected (until a few days or weeks later, when the issue starts again).

Versions
Kubearmor: `v1.3.4`
Cluster: GKE `1.27.13-gke.1000000`
Expected behavior
I would expect kubearmor to consistently fetch the correct pod names and namespaces :).