"Preparing ingresses" - Issue syncing the code-redirect-2 ingress preventing startup #22667
Hi @guydog28, could you share the following for your user namespace? Feel free to redact parts of URLs/other information if it is considered sensitive.
I've started facing the same issue on minikube when starting a workspace the second time:
I wasn't able to reproduce this issue on minikube v1.27.0 (ingress controller v1.2.1). To test, I started and restarted the Go workspace a few times.
When I bump into the issue next time, I will provide additional info.
This continues to be a problem for us. Even though I disabled the admission controller, for some workspaces the operator is constantly trying to update the ingresses in a never-ending loop. This includes the 3 default endpoints and 3 of ours (we have many more that don't seem to be an issue). See the patterns in the logs when running
Similar issues in the operator logs:
We have a large team using this and it is really causing a problem. It seems like maybe two different things are trying to modify the workspace object? Is this possible?
I would like to add one more detail. Our cluster shuts down (nodes only) at 8pm every night to conserve resources during off-hours. Is there any way there could be stale data stored somewhere that needs to be cleaned up and is causing these issues?
You could try checking the managedFields on the ingress. Apart from that, are there somehow two che-operators running in the cluster? What is the update it's stuck on (i.e. what field is being changed repeatedly)?
managedFields isn't there. But I do see that it is "owned" by a devworkspacerouting object. The operators currently being used are: namespace: eclipse-che, operator: che-operator (7.79.0) [the same issue occurred with 7.77 and 7.75].
Actually, here are the
here is for the
and the
I don't think it is repeatedly changing a field; it is repeatedly deleting and creating ingresses.
Notice how the DNS name for the same ingress keeps changing.
OOOOK, I think I figured out what is happening here. There are no errors anywhere that provide any helpful information, not even in the
Our Che OIDC configuration uses email instead of username for the Che username. So while a user might have a short username in Keycloak, the Che username they end up with is their full email address, which is much longer.
It seems like we need to remap Che to use Keycloak usernames instead of email addresses. But this is going to make everyone start from scratch since
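To make the length problem concrete: Kubernetes restricts each DNS label in a hostname to 63 characters, so an email-derived username can easily push a generated endpoint hostname over that limit where a short Keycloak username would not. The snippet below is only an illustration; the naming pattern, helper, and example values are hypothetical and not taken from the che-operator source.

```go
package main

import "fmt"

// Kubernetes (RFC 1123) limits each DNS label to 63 characters.
const maxDNSLabelLength = 63

// buildEndpointLabel is a hypothetical helper showing how a hostname label
// might be assembled from a username-derived prefix plus container and
// endpoint names. Illustrative only, not the che-operator's code.
func buildEndpointLabel(username, container, endpoint string) string {
	return fmt.Sprintf("%s-%s-%s", username, container, endpoint)
}

func main() {
	// A short Keycloak-style username stays well under the limit...
	short := buildEndpointLabel("jdoe", "tools-13132", "code-redirect-2")
	// ...while an email-derived username can easily exceed it.
	long := buildEndpointLabel("firstname-lastname-companyname-example-com", "tools-13132", "code-redirect-2")

	for _, label := range []string{short, long} {
		fmt.Printf("len=%2d ok=%-5v %s\n", len(label), len(label) <= maxDNSLabelLength, label)
	}
}
```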
We are in the middle of a deployment push, and we have a few newer devs who haven't been able to launch a workspace due to this. Is there a way to modify a DevWorkspace name after creation, say changing
Since DevWorkspaces are just Kubernetes objects, it's not possible to change their name (name + namespace is their unique identifier). As a workaround, you could use devfiles with short names in their metadata.name field.
I've created #22774 for now; if we're generating invalid ingresses it's definitely a bug in the operator.
@amisevsk What recourse do I have for the long usernames? I don't see where in the CheCluster resource I can select a different field from the OAuth provider to be the username. Right now, I think Che needs the name and email to configure the user's gitconfig, and by default uses the email for the username, but how do I get Che to use the user's Keycloak username for its own username instead of the email? This would get me running while we wait for #22774 to be completed and built.
On the OAuth question, I'm not sure -- I haven't looked at that section in a while and mostly work on the operators and editors. @vinokurig Any suggestions here?
I looked briefly into how the Che Operator is generating hostnames, and Che is explicitly checking that generated hostnames are valid (i.e. less than 63 characters) and using a different scheme if they are not. I also tested it by creating a devfile that would have hostnames over that limit, and the operator switched to using the shorter naming scheme.
I'm really confused as to how you're seeing this behavior 🤔.
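For readers following along, here is a rough sketch of the kind of length check and fallback scheme described above. The function and names are illustrative assumptions, not the operator's actual code; the fallback name is modeled on the workspacea9a0497336ae43ff-10 style names seen later in this thread.

```go
package main

import "fmt"

const maxDNSLabelLength = 63

// endpointHostname sketches the behavior described above: keep the
// descriptive hostname when it fits in a DNS label, otherwise fall back to a
// short "workspace<id>-<n>" style name. Illustrative only; not the
// che-operator's actual implementation.
func endpointHostname(descriptive, workspaceID string, index int) string {
	if len(descriptive) <= maxDNSLabelLength {
		return descriptive
	}
	return fmt.Sprintf("workspace%s-%d", workspaceID, index)
}

func main() {
	wsID := "a9a0497336ae43ff"
	// Fits within 63 characters, so the descriptive name is kept.
	fmt.Println(endpointHostname("tools-13132-code-redirect-2", wsID, 10))
	// Too long, so the short numbered fallback is used instead.
	fmt.Println(endpointHostname(
		"firstname-lastname-companyname-example-com-tools-13132-code-redirect-2", wsID, 11))
}
```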
Interesting, so as you can see from the logs here #22667 (comment), it does seem to do this, since the naming structure is different. However, these are the ingresses that are constantly being re-synced and re-generated with different numbers at the end by the operator, so something is off in that path through the code. Looking at these 6 lines from the logs (from a kubectl get ingresses -w watch), you can see it kept updating the DNS names for the same ingresses with a different number at the end. Here code-redirect-1, -2, -3 respectively became workspacea9a0497336ae43ff-10, -11, -12, then immediately changed to -7, -8, -9, and that continues with other numbers as it constantly resyncs the ingresses.
Ah, that's the piece I was missing -- I missed that in your log somehow. Turns out this is coming from Go iterating through maps randomly, so when we go through too-long endpoint names, we run into an issue where the number suffix is more-or-less random (here). I've updated the created issue. Still doesn't help us fix your problem in the here-and-now, though.
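For context on the Go behavior mentioned here: ranging over a map in Go uses a deliberately randomized iteration order, so any numeric suffix assigned during that iteration can differ between reconcile loops. Below is a minimal sketch of the problem and one way to make the assignment deterministic by sorting the keys first; this is an assumption about the general shape of a fix, not the actual patch in eclipse-che/che-operator#1801.

```go
package main

import (
	"fmt"
	"sort"
)

func main() {
	// Endpoints whose descriptive hostnames are too long and therefore need a
	// numbered fallback name (endpoint names are illustrative).
	endpoints := map[string]struct{}{
		"code-redirect-1": {},
		"code-redirect-2": {},
		"code-redirect-3": {},
	}

	// Ranging over a map directly yields a randomized order on every run, so
	// the suffix each endpoint receives can change between reconciliations,
	// which is exactly the ingress churn described in this thread.
	i := 0
	for name := range endpoints {
		fmt.Printf("unstable: %-16s -> workspace<id>-%d\n", name, i)
		i++
	}

	// Sorting the keys first makes the suffix assignment deterministic, so
	// repeated reconciles generate the same hostnames.
	names := make([]string, 0, len(endpoints))
	for name := range endpoints {
		names = append(names, name)
	}
	sort.Strings(names)
	for i, name := range names {
		fmt.Printf("stable:   %-16s -> workspace<id>-%d\n", name, i)
	}
}
```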
Ah yeah. Would be great to get that in a PR for 7.80 though! It would really improve the stability of our cluster and allow us to re-enable our ingress-nginx admission controller. This is causing the Kubernetes API to get hammered all the time trying to sync these up.
I believe 7.80 has been branched already, but I opened a PR to hopefully fix it ASAP in the meantime: eclipse-che/che-operator#1801
cc: @ibuziuk in case this is something we want to pull into a bugfix
Do we know when the 7.81 release is planned for? And will the merged PR be a part of it?
I believe the 7.81 release branch of Che is planned for mid-week next week, and this fix should be part of that release. I think we decided not to backport it and issue a bugfix since the next minor release was a little over a week away.
@guydog28 Hello, could you please confirm that the issue is fixed in 7.81.0 and can be closed?
Describe the bug
Using vanilla Kubernetes with ingress-nginx and a custom devfile that has a tools container using the quay.io/devfile/universal-developer-image image, there are consistent issues with the operator syncing the ingress that, at least in my logs, shows as {workspacepodname}-{containername}-13132-code-redirect-2. The ingress-nginx install comes with a validating admission webhook by default that rejects this ingress, and because the error keeps repeating, the workspace keeps retrying "Preparing ingresses" and does not progress. Here is the error I see repeated in the logs for the che-operator:
The only way I could get past this was to delete the admission webhook to see if I could get my workspace to start. The workspace does start now, but I see these errors in the che-operator logs instead, and it causes a slower workspace startup, with "Preparing ingresses" flashing over and over before it eventually progresses to start:
What would cause this? I would prefer not to have to delete this admission webhook for the whole cluster just to get our Che workspaces up and running.
Che version
7.75@latest
Steps to reproduce
Expected behavior
No issues syncing ingresses
Runtime
Kubernetes (vanilla)
Screenshots
No response
Installation method
OperatorHub
Environment
Linux, Amazon
Eclipse Che Logs
No response
Additional context
No response