- OpenShift
- OC CLI
- Start shell in pod
- Login
- export output as template (openshift templates)
- rsync to pod
- port-forward
- Get OC API token
- nodes show-labels
- Apply/process template
- oc --as
- explain api
- api-resources
- field-selector
- curl spec.host
- Get clusterversion
- Get a specific pod log
- Create taints
- Apply taints on pod
- Taint cluster repo
- Get node ip
- Login oc internal container repo
- podman login file
- View OCP root credentials
- simple go-template
- Namespace/project
- Access
- Give admin access
- Allow root pod for service account
- Change service account for running deploymentconfig
- User manage all projects
- Create user from scratch OCP3
- Remove ability for authenticated users to create projects
- List role bindings
- Role applicability
- OC default roles
- Security Context Constraints (SCCs)
- View user access
- View "policys"/rolebindings
- Delete kubeadmin
- rotate service account
- Check SA monitor agent can list pods
- Applications
- Create app using cli multiple options
- Create app using standard repo with a bunch of variables
- To create an application based on an image from a private registry
- To create an application based on source code stored in a Git repository
- To create an application based on source code stored in a Git repository and referring to an image stream
- Delete app
- Scale application
- Autoscaling HorizontalPodAutoscaler (HPA)
- Restart pod
- Set nodeSelector
- Set nodeSelector way 2
- Cancel failed rollout
- Redeploy application
- Get image name from pod
- Get all non-quay images in the OCP cluster
- Create app using cli multiple options
- Ansible
- Openshift-install
- Network
- Certificates docker
- Diagnostics
- Storage
- Admin tasks
- Template
- Quotas
- Limits
- ClusterResources
- Openshift upgrades
- Debug
- installation debugging
- Jenkins pipeline
- OSCP4 specific
- Operators
- Get clusteroperators
- Openshift 4 course overview
- openshift-marketplace
- Machine config
- Day2 stuff
- Image pruning
- Upgrade paths
- Podman
- cli
- OC CLI
oc rsh <pod>
oc login https://master.lab.example.com -u developer -p redhat
oc export svc,dc docker-registry --as-template=docker-registry
This feature has been deprecated. If you want to strip the same metadata and status fields that the export command did, you can do the following:
cat filter.jq
del(.. | select(. == "" or . == null or . == "None")) |
walk(if type == "object" then del(.status,.annotations,.creationTimestamp,.generation,.selfLink,.uid,.resourceVersion) else . end) |
del(.. | select(. == {}))
# Run the above filter on your yaml output. Note that yq is a wrapper around jq.
kubectl get <resource> -o=yaml | yq --yaml-output "$(cat filter.jq)"
For more info see PR
OCP 4: This is also deprecated; I don't know how to do it in the future.
oc get --export svc -o yaml
oc rsync <pod>:<pod_dir> <local_dir> -c <container>
oc port-forward <pod> 3306:3306
TOKEN=$(oc whoami -t)
View the labels of nodes
oc get nodes --show-labels
oc process -f templates/build.yml | oc apply -f-
If you want to perform a "sudo"-style command or run as another user (run-as), perform the following:
oc --as=system:serviceaccount:python-example-build:tekton get imagestreams -n python-example-dev
Get detailed information about the k8s API.
oc explain pod.spec.containers
You can also add --recursive to get all the info below it.
oc explain machineset.spec --recursive
This will list all available API objects in your cluster. Use explain to see how to configure them.
oc api-resources
Is it needed? grep does a good job, but field selectors are probably useful when writing scripts.
oc get pods --field-selector status.phase=Running
Get the hostname of a route
export ROUTE=$(oc get route bluegreen -o jsonpath='{.spec.host}')
curl $ROUTE/version
oc get clusterversion
oc logs -n istio-operator $(oc -n istio-operator get pods -l name=istio-operator --output=jsonpath={.items..metadata.name})
Taint a node manually to do tests, don't forget to update the machineset after.
oc adm taint node infra-1a-t2vsp infra=reserved:NoSchedule
oc adm taint node infra-1a-t2vsp infra=reserved:NoExecute
spec:
nodePlacement:
nodeSelector:
matchLabels:
node-role.kubernetes.io/infra: ""
tolerations:
- effect: NoSchedule
key: infra
value: reserved
- effect: NoExecute
key: infra
value: reserved
For the ingresscontroller
oc patch ingresscontroller default -n openshift-ingress-operator --type=merge --patch='{"spec":{"nodePlacement": {"nodeSelector": {"matchLabels": {"node-role.kubernetes.io/infra": ""}},"tolerations": [{"effect":"NoSchedule","key": "infra","value": "reserved"},{"effect":"NoExecute","key": "infra","value": "reserved"}]}}}'
oc patch config cluster --type=merge --patch='{"spec":{"nodeSelector": {"node-role.kubernetes.io/infra": ""},"tolerations": [{"effect":"NoSchedule","key": "infra","value": "reserved"},{"effect":"NoExecute","key": "infra","value": "reserved"}]}}'
If you want to check the IP of the node where your pod is running, you can perform:
oc get pods -o wide
Check the name of the node where you are running and perform:
oc get nodes -o wide
And match that name with the node's IP. Or you can write:
$ oc get pods $(oc get pod -l app=node-ssh -o jsonpath='{.items[0].metadata.name}') -o jsonpath='{.status.hostIP}'
sudo podman login -u dosentmatter -p $(oc whoami -t) external-route-url-for-internal-docker-repo
The default route for OCP4 docker registry is: default-route-openshift-image-registry.apps.
Podman doesn't store its login files in $HOME/.docker/config.json. Instead it stores them in $XDG_RUNTIME_DIR/containers/auth.json
Where can I find my cloud credentials? Like AWS, Azure, vSphere, OpenStack etc.
oc get secrets -n kube-system | grep cred
You will see aws-creds if you run on AWS.
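To actually read them, an AWS example (the key names below are what I have seen in the aws-creds secret and might differ in your cluster):
oc get secret aws-creds -n kube-system -o jsonpath='{.data.aws_access_key_id}' | base64 -d
oc get secret aws-creds -n kube-system -o jsonpath='{.data.aws_secret_access_key}' | base64 -d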
Sometimes we are in environments that don't have jq by default... Windows... Then another good option is go-template.
oc get secret prod-db-secret -o go-template --template='{{.data.username}}'
oc projects
oc project default
Projects can have a separate name, display name, and description:
The mandatory name is a unique identifier for the project and is most visible when using the CLI tools or API. The maximum name length is 63 characters.
The optional display name is how the project is displayed in the web console (defaults to name).
The optional description can be a more detailed description of the project and is also visible in the web console.
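For example, a quick sketch (myproject is just a placeholder):
oc new-project myproject --display-name="My Project" --description="Playground for testing stuff"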
The following components apply to projects:
Objects : Pods, services, replication controllers, and more.
Policies : Rules that determine which actions users can or cannot perform on objects.
Constraints : Quotas for each kind of object that can be limited.
The command below is not possible in OpenShift 3.9; the --admin flag is not available.
oc adm new-project resourcemanagement --admin=andrew --node-selector='node-role.kubernetes.io/compute=true'
oc adm policy add-role-to-user admin <user_name> -n <project_name>
kubectl get secret sg-token-7rclm --template={{.data.token}} |base64 --decode
oc whoami
You have to log in on the master node as root to be able to perform this command.
ssh root@master
oc whoami
# Give access to admin user
oc adm policy add-cluster-role-to-user cluster-admin admin
oc create serviceaccount useroot
oc create role useroot --verb=use --resource=securitycontextconstraints --resource-name=anyuid
oc create rolebinding useroot --role=useroot --serviceaccount=<namespace>:<sa>
Earlier I was taught that this was an okay way to assign an SCC to a user. There is a bug when upgrading from 4.3.8 -> 4.3.10 (or thereabouts) that checks whether the SCC is completely untouched. Even if you just assign a user to an SCC you still "modify" it.
Don't do it; I only keep this here to remember how I did it and to share how not to do it!
oc adm policy add-scc-to-user anyuid -z useroot
oc patch dc/nginx --patch '{"spec":{"template":{"spec":{"serviceAccountName": "useroot"}}}}'
This was good enough when asked to do the following: "Allow the developer user access to this project. Allow the developer user to create new applications in this project"
oc policy add-role-to-user edit developer -n todoapp
A basic way to create users if you are using the htpasswd solution.
ssh root@master
oc create user demo-user
htpasswd /etc/origin/openshift-passwd demo-user
oc policy add-role-to-user edit demo-user
oc adm policy remove-cluster-role-from-group self-provisioner system:authenticated system:authenticated:oauth
oc describe clusterPolicyBindings :default
oc describe policyBindings :default
Command | Description |
---|---|
oc adm policy who-can verb resource | Indicates which users can perform an action on a resource. |
oc adm policy add-role-to-user role username | Binds a given role to specified users. |
oc adm policy remove-role-from-user role username | Removes a given role from specified users. |
oc adm policy remove-user username | Removes specified users and all of their roles. |
oc adm policy add-role-to-group role groupname | Binds a given role to specified groups. |
oc adm policy remove-role-from-group role groupname | Removes a given role from specified groups. |
oc adm policy remove-group groupname | Removes specified groups and all of their roles. |
Default Roles | Description |
---|---|
edit | Users in the role can create, change and delete common application resources from the project, such as services and deployment configurations. But cannot act on management resources such as limit ranges and quotas, and cannot manage access permissions to the project. |
basic-user | Users in the role have read access to the project. |
self-provisioner | Users in the role can create new projects. This is a cluster role, not a project role. |
admin | Users in the role can manage all resources in a project, including granting access to other users to the project. |
SCCs control the actions a pod can perform and what resources it can access.
oc get scc
oc describe scc scc_name
Having issues with this one... oc policy who-can * *
Instead of oc adm policy get rolebinding, the syntax is: oc get clusterrolebinding.rbac. This is because RBAC came after RHOCP (Red Hat OpenShift Container Platform)...
oc describe clusterrolebinding.rbac self-provisioner
NOTE: Don't do this before you have created some other admin user.
oc delete secret kubeadmin -n kube-system
It's not possible to create the secret again to create a new password.
To rotate a service account token, delete the token secret:
$ oc delete secret SECRET -n NAMESPACE
The deleted token is immediately disabled in the cluster API.
Pods using the deleted secret need to be restarted.
External services need updated credentials.
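A sketch of how I would do the whole dance (all names are placeholders; oc rollout restart needs a reasonably new oc):
# Find the token secrets bound to the service account
oc get sa <sa_name> -n <namespace> -o jsonpath='{.secrets[*].name}'
# Delete the token secret to rotate it
oc delete secret <token_secret_name> -n <namespace>
# Bounce the workloads that use the service account
oc rollout restart deployment/<deployment_name> -n <namespace>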
oc get pods -n monitored-project \
  --as=system:serviceaccount:monitor:monitor-agent \
  --as-group=system:serviceaccounts:monitor
oc new-app mysql MYSQL_USER=user MYSQL_PASSWORD=pass MYSQL_DATABASE=testdb -l db=mysql
oc new-app --docker-image=myregistry.com/mycompany/myapp --name=myapp
oc new-app https://github.com/openshift/ruby-hello-world --name=ruby-hello
To create an application based on source code stored in a Git repository and referring to an image stream
oc new-app https://mygitrepo/php-hello -i php:7.0 --name=php-hello
There is no delete-new-app command, so use the label of the application that you created:
oc delete all -l app=node-hello
oc scale --replicas=5 dc myapp
oc autoscale dc/myapp --min 1 --max 10 --cpu-percent=80
Can be done on dc level as well.
oc scale deployment kube-state-metrics --replicas=0
Set a nodeSelector on a pod so it only runs on specific nodes.
oc export dc/version -o yaml > version-dc.yml
Add the following under the second spec:, above containers: (mind the indentation):
nodeSelector:
  region: apps
Random example
apiVersion: v1
kind: Pod
metadata:
name: nginx
labels:
env: test
spec:
containers:
- name: nginx
image: nginx
imagePullPolicy: IfNotPresent
nodeSelector:
disktype: ssd
Apply the changes
oc replace -f version-dc.yml
oc edit dc/version
Perform the same changes as above.
If s2i isn't rolling out as it should and you want to delete it:
oc rollout cancel dc/<name>
oc rollout latest dc/<name>
oc get pod cakephp-ex-1-2vdtk -o jsonpath='{.spec.containers[*].image}'
or
oc get pod cakephp-ex-1-2vdtk -o jsonpath='{..image}'
Checkout: https://kubernetes.io/docs/tasks/access-application-cluster/list-all-running-container-images/
kubectl get pods --all-namespaces -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' |
sort |grep -v quay
sudo yum install atomic-openshift-utils
Look in ansible.cfg to see what inventory files it points to. Prepare the nodes by installing all needed packages, making sure you have some dedicated docker storage.
Set labels on your nodes, example: [nodes] node1.lab.example.com openshift_node_labels="{'region':'infra', 'node-role.kubernetes.io/compute':'true'}"
Under [OSEv3:vars] you define the OpenShift release and image tag; it should look like this:
openshift_deployment_type=openshift-enterprise openshift_release=v3.9 openshift_image_tag=v3.9.14
NOTE THE STUPID ASS "v"
From where you have an ansible.cfg file, perform the following:
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml
From where you have an ansible.cfg file, perform the following:
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
After running deploy_cluster.yml
- Log in to the web console and verify that your username and password work
- NOTE you won't be able to do anything since you are a normal user
- Give admin root access to cluster
- oc get nodes = ready?
- Verify that the docker-registry and router = ready?
- oc get pods -n default
- Verify s2i
- Create a new project using your normal deploy user
- oc new-app --name=hello php:5.6~http://services.lab.example.com/php-helloworld
This will deploy the metrics components as pods in the openshift-infra project.
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-metrics/config.yml -e openshift_metrics_install_metrics=True
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-metrics/config.yml -e openshift_metrics_install_metrics=False
This is OCP 4 specific.
In short, follow the instructions in the documentation. When I do a more advanced installation I will try to put in the time to document it.
Here is a link for AWS if you don't want to google.
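The basic flow I use is something like this (a sketch, assuming you already have the installer binary and a pull secret; <install_dir> is a placeholder):
openshift-install create install-config --dir=<install_dir>
openshift-install create cluster --dir=<install_dir> --log-level=info
# And when you are done with the cluster
openshift-install destroy cluster --dir=<install_dir>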
<installation_folder>/.openshift_install.log
oc get csr
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
Edge Termination
With edge termination, TLS termination occurs at the router, before the traffic gets routed to the pods. TLS certificates are served by the router, so they must be configured into the route, otherwise the router’s default certificate is used for TLS termination. Because TLS is terminated at the router, connections from the router to the endpoints over the internal network are not encrypted.
Pass-through Termination
With pass-through termination, encrypted traffic is sent straight to the destination pod without the router providing TLS termination. No key or certificate is required. The destination pod is responsible for serving certificates for the traffic at the endpoint. This is currently the only method that can support requiring client certificates (also known as two-way authentication).
Re-encryption Termination
Re-encryption is a variation on edge termination, where the router terminates TLS with a certificate, then re-encrypts its connection to the endpoint, which might have a different certificate. Therefore the full path of the connection is encrypted, even over the internal network. The router uses health checks to determine the authenticity of the host.
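For reference, the create commands for the other two termination types look roughly like this (service name, hostname and cert files are placeholders):
oc create route passthrough --service=frontend --hostname=www.example.com
oc create route reencrypt --service=frontend --cert=tls.crt --key=tls.key --dest-ca-cert=destca.crt --hostname=www.example.com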
oc expose svc/hello --hostname=hello.apps.lab.example.com
A wildcard policy allows a user to define a route that covers all hosts within a domain. A route can specify a wildcard policy as part of its configuration using the wildcardPolicy field. The OpenShift router has support for wildcard routes, which are enabled by setting the ROUTER_ALLOW_WILDCARD_ROUTES environment variable to true .
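A sketch of what such a route could look like (hostname and service name are made up):
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: wildcard-route
spec:
  host: wildcard.apps.example.com
  wildcardPolicy: Subdomain
  to:
    kind: Service
    name: frontend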
openssl genrsa -out example.key 2048
openssl req -new -key example.key -out example.csr -subj "/C=US/ST=CA/L=Los Angeles/O=Example/OU=IT/CN=test.example.com"
openssl x509 -req -days 366 -in example.csr -signkey example.key -out example.crt
oc create route edge --service=test --hostname=test.example.com --key=example.key --cert=example.crt
Openshift 3
oc env pod --list
Openshift 4
oc set env pod/ocp-probe-1-4bx8x --list
curl <user>:<password>@<router_IP>:<STATS_PORT>
Had some issues with getting an external route... In the end I rsh'd into the pod and used localhost to test.
If your docker registry requires a login you can perform the following:
scp -q root@master.lab.example.com:/etc/origin/master/registry.crt .
sudo cp registry.crt /etc/pki/ca-trust/source/anchors/docker-registry-default.apps.lab.example.com.crt
sudo update-ca-trust
sudo systemctl restart docker
docker-registry-cli registry.lab.example.com search metrics-cassandra ssl
If you get a tar file to import into docker (don't ask me why you would ever do this in 2019), use the command:
docker load -i phpmyadmin-latest.tar
DO NOT USE docker import
You can get a funny error that looks something like: "Error response from daemon: No command specified." For more information look at: https://serverfault.com/questions/757210/no-command-specified-from-re-imported-docker-image-container/797619
If a k8s object is stuck deleting, the most common reason is finalizers. Andrew Block has written a great explanation of how to debug it.
Below you can find a simple patch that will remove the finalizers from a secret. Don't use it unless you know what you are doing.
oc patch secret test-secret -n finalizer-example -p '{"metadata":{"finalizers":[]}}' --type=merge
Gathers logs from the host and docker:
sosreport -k docker.all=on -k docker.logs=on
OCP4
oc adm must-gather
oc get events -n default
oc get event --sort-by='.metadata.creationTimestamp'
Remember that systemd is running everything from Kubernetes to etcd.
rpm -qa |grep atomic
journalctl -u atomic-openshift-master-api.service
Good to use before oc upgrades
oc adm diagnostics
Due to reasons, your cloud provider SA can be changed. To see what it currently is, look at:
oc get controllerconfig machine-config-controller -o yaml
Your customer might ask: what data does Insights gather from us?
INSIGHTS_OPERATOR_POD=$(oc get pods --namespace=openshift-insights -o custom-columns=:metadata.name --no-headers --field-selector=status.phase=Running)
oc cp openshift-insights/$INSIGHTS_OPERATOR_POD:/var/lib/insights-operator ./insights-data
In my case my worker nodes don't get the latest cri-o image installed on it after an upgrade. So time for some debugging:
# Is all pools okay?
oc get machineconfigpool
# In my case it's in a degraded state.
# Look at the worker machineconfigpool, look under status and see what's wrong
oc get machineconfigpool worker -o yaml
# Look at the specific machineconfig that doesn't work.
oc get machineconfig rendered-worker-4ec48b44c2322a10cbe7cbd6ee819203 -oyaml
oc project openshift-machine-config-operator
# Find the pod that has the issue (in my case all of the workers, so I start with one)
oc get pods -o wide
# logs
oc logs machine-config-daemon-7zb6v -c machine-config-daemon
I got error:
E0507 14:02:38.944558 1033717 writer.go:135] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-4ec48b44c2322a10cbe7cbd6ee819203
The following article seems to solve my issue
To view the config of the Kubernetes API server, where you can see things like feature gates:
oc get kubeapiserver cluster -o yaml
The NFS mount needs to be configured the following way:
Owned by the nfsnobody user and group.
Having rwx------ permissions (expressed as 0700 using octal).
Exported using the all_squash option.
Example /etc/exports
/var/export/vol *(rw,async,all_squash)
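A sketch of setting that up on the NFS server (using the dbvol path from the PV example below):
mkdir -p /var/export/dbvol
chown nfsnobody:nfsnobody /var/export/dbvol
chmod 700 /var/export/dbvol
echo '/var/export/dbvol *(rw,async,all_squash)' >> /etc/exports
exportfs -r
The PV pointing at the export then looks like this: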
apiVersion: v1
kind: PersistentVolume
metadata:
name: mysql-pv
spec:
capacity:
storage: 2Gi
accessModes:
- ReadWriteMany
nfs:
path: /var/export/dbvol
server: services.lab.example.com
oc adm manage-node --schedulable=false node2.lab.example.com
oc adm drain node2.lab.example.com --delete-local-data
oc adm manage-node --schedulable=true node2.lab.example.com
oc label node node2.something.internal perftest=true
If you want to add a template to your oc environment, send it to the openshift namespace. Otherwise it won't be visible from the web GUI. oc get template -n openshift
oc apply -n openshift -f nodejs-mysql-template.yaml
Instead of importing a template you can create it directly in your project.
oc new-app -f examples/sample-app/application-template-stibuild.json
oc new-app ruby-helloworld-sample -p ADMIN_USERNAME=admin -p ADMIN_PASSWORD=mypassword
Restrict the number of resources a project is allowed to use.
oc create quota dev-quota --hard=services=10,cpu=1300m,memory=1.5Gi
oc get resourcequota
If a quota that restricts usage of compute resources for a project is set, OpenShift refuses to create pods that do not specify resource requests or resource limits for that compute resource.
To understand the difference between a limit range and a resource quota resource, consider that a limit range defines valid ranges and default values for a single pod, while a resource quota defines only top values for the sum of all pods in a project. A limit range is like a resource quota but more granular, and can set max and min values.
apiVersion: "v1"
kind: "LimitRange"
metadata:
name: "dev-limits"
spec:
limits:
- type: "Pod"
max:
cpu: "2"
memory: "1Gi"
min:
cpu: "200m"
memory: "6Mi"
- type: "Container"
default:
cpu: "1"
memory: "512Mi"
spec:
  limits:
  - default:
      cpu: 500m
      memory: 500Mi
    defaultRequest:
      cpu: 100m
      memory: 100Mi
    max:
      cpu: 1
      memory: 1Gi
    type: Container
oc get limitranges
This is a quota for the entire cluster, not a single project.
oc create clusterquota env-qa --project-label-selector environment=qa --hard pods=10 --hard services=5
oc create clusterquota user-qa --project-annotation-selector openshift.io/requester=qa --hard pods=12 --hard secrets=20
If you run openshift 3.7 and want to go to 3.9 you have to land on 3.8 first.
To perform the upgrade add both 3.8 and 3.9 rhel repos.
You can define Ansible hooks to run before and after your OpenShift upgrades.
How to call an Ansible playbook: openshift_master_upgrade_pre_hook=/usr/share/custom/pre_master.yml
openshift_master_upgrade_hook=/usr/share/custom/master.yml
openshift_master_upgrade_post_hook=/usr/share/custom/post_master.yml
oc get nodes
oc get -n default dc/docker-registry -o json | grep image
oc get -n default dc/router -o json | grep image
oc adm diagnostics
And verify that you have no errors.
We will see both clusterroles and roles for the user flux in this case.
kubectl get rolebinding,clusterrolebinding --all-namespaces -o jsonpath='{range .items[?(@.subjects[0].name=="flux")]}[{.roleRef.kind},{.roleRef.name}]{end}'
oc adm top node
oc adm top pod -A
ansible-playbook -i <inventory_file> \
  /usr/share/ansible/openshift-ansible/playbooks/openshift-checks/health.yml
Verify nodes port_range
In OCP 4 this should be defined in the tuned crd:
oc get tuned -n openshift-cluster-node-tuning-operator
for i in $(oc get nodes --no-headers -o=custom-columns=NAME:.metadata.name); do echo $i; oc debug node/$i -- chroot /host sysctl net.ipv4.ip_local_port_range; done
This is not my strong suit, but during a Red Hat case I picked up a few things. The certificates are X509 PEM format and are not encrypted at all.
Use the CA file to verify that the tls.crt is okay:
openssl verify -CAfile ca-bundle.crt tls.crt
openssl x509 -in broken-external.crt -text
Check what your CA looks like:
openssl x509 -in apps/prod/client-cert.crt -text -noout
Of course it won't be able to create a proper HTTPS connection to a Kafka endpoint, but it's a simple way to send traffic.
curl -vvv https://broker1-kafka1.domain:9093/ --cacert ca-bundle.crt --key tls.key --cert tls.crt
# How I created the secrets
# oc create secret tls doesn't support a ca-bundle.crt file, only tls.crt & tls.key
oc create secret generic secret-generic3 --from-file=tls.key --from-file=tls.crt --from-file=ca-bundle.crt --from-file=ca-intermediate.crt
oc rsh <fluentd pod>
cd /var/run/ocp-collector/secrets/<path to secret>
# Use openssl to connect to the endpoint, in this case kafka.
sh-4.4# echo Q | openssl s_client -showcerts -connect broker1-kafka1.domain:9093 -servername broker1-kafka1.domain -key tls.key -cert tls.crt -CAfile ca-bundle.crt
Or a simple openssl without certs
openssl s_client -connect ns1.domain:443
dig srv _etcd-server-ssl._tcp.
Debug
journalctl -u release-image.service
journalctl -b -f -u bootkube.service
https://docs.openshift.com/container-platform/4.3/installing/installing-gather-logs.html
sudo crictl ps
Is your storage fast enough for etcd? Here you can find an IBM blog about how to test it.
Or look at this Red Hat doc. From a master:
sudo podman run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/openshift-scale/etcd-perf
oc new-app jenkins-persistent -p ENABLE_OAUTH=false -e JENKINS_PASSWORD=openshiftpipelines -n pipeline-${GUID}-dev
The jenkins service account created in dev needs access to your test and prod namespaces to be able to run CD tasks.
oc policy add-role-to-user edit system:serviceaccount:pipeline-${GUID}-dev:jenkins -n pipeline-${GUID}-prod
oc policy add-role-to-group system:image-puller system:serviceaccounts:pipeline-${GUID}-prod -n pipeline-${GUID}-dev
oc policy add-role-to-user system:image-puller
oc policy add-role-to-user edit system:serviceaccount:tekton -n python-example-dev
oc policy add-role-to-group system:image-puller system:serviceaccounts:python-example-build:tekton -n python-example-dev
oc policy add-role-to-user system:image-puller
View machines currently in your cluster
oc get machines -n openshift-machine-api
A machineset is a generic definition of how a worker node should look, like defining whether a GPU should be available or in which region of a cloud provider it should reside.
oc get machinesets -n openshift-machine-api
oc logs $(oc -n openshift-machine-api get pods -l k8s-app=controller --output=jsonpath={.items..metadata.name}) -c machine-controller -n openshift-machine-api -f
oc scale machineset cluster-ba50-z759k-worker-us-east-2c --replicas=1 -n openshift-machine-api
The Cluster Network Operator (CNO) deploys and manages the cluster network components on an OpenShift Container Platform cluster, including the Container Network Interface (CNI) Software Defined Networking (SDN) plug-in selected for the cluster during installation.
oc get -n openshift-network-operator deployment/network-operator
oc get clusteroperator/network
The SDN is built up of daemonsets that run on each node and an sdn-controller that is deployed to each master.
oc project openshift-sdn
https://suraj.pro/post/user-on-os4/
htpasswd -cb users.htpasswd user1 user1pass
As admin go to: Administration > Cluster Settings > Global Configuration > OAuth
On the Overview, under the Identity Providers section, click Add, select HTPasswd and upload your file.
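If you prefer the CLI over the web console, a sketch (htpass-secret and htpasswd_provider are just names I picked):
oc create secret generic htpass-secret --from-file=htpasswd=users.htpasswd -n openshift-config
Then apply an OAuth config that points at the secret:
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: htpasswd_provider
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpass-secret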
oc login -u user1 -p user1pass
oc get clusteroperators
master = coreos
worker/infra can be either RHEL 7 or CoreOS. CoreOS is recommended. The biggest reason people want to run RHEL 7 is that they want to install a bunch of crap on the server like security scanning and other tools.
A 1000-node cluster is fine with 3 masters in most cases; you can increase to 5 masters but it's normally not needed. 7 is too much due to etcd quorum overhead.
Not worth putting quay as the internal OCP registry, use something like nexus instead.
OCP looks for ingress objects; when one is created the "route" operator grabs it and translates it into a route object. This way you can use helm charts that are written for k8s.
AWS EBS can only be in a single availability zone. In an IPI installation we by default set up nodes in zones A, B and C; this will create issues if a zone goes down, since EBS is ReadWriteOnce. This means you need Recreate as your rollout strategy. In short, you can't use PVCs in AWS in a good way unless you want to do some black magic for your PVCs.
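A sketch of switching a deploymentconfig to the Recreate strategy (the dc name is a placeholder):
oc patch dc/<name> -p '{"spec":{"strategy":{"type":"Recreate"}}}'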
With the help of an operator you can get labels automatically on nodes.
This is the Node Feature Discovery (nfd) operator.
When defining a nodeSelector for a CRD in OpenShift you can't just use nodeSelector. Instead you need to define nodePlacement, which contains the nodeSelector. This is good to know when you are configuring stuff to only run on infra nodes, for example.
oc patch ingresscontroller/default -n openshift-ingress-operator --type=merge --patch '{"spec": {"nodePlacement": {"nodeSelector": {"matchLabels": {"node-role.kubernetes.io/infra":""}}}}}'
Verify pods on the infra nodes:
oc get pod -n openshift-ingress -o wide
spec:
nodePlacement:
nodeSelector:
matchLabels:
node-role.kubernetes.io/infra: ""
Example patch command:
oc patch configs.imageregistry.operator.openshift.io/cluster -n openshift-image-registry --type=merge --patch '{"spec":{"nodeSelector":{"node-role.kubernetes.io/infra":""}}}'
oc get pods -n openshift-image-registry -o wide
The "mother operator" this is the operator that contains all versions of the rest of the operators. So more or less call this one to create the rest.
The cluster CIDR is the internal network; the defined range is the IP range the pods will get. This can't be changed after the installation.
ServiceNetwork defines the range where your services will be available.
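To check what your cluster was installed with, look at the cluster network config:
oc get network.config/cluster -o yaml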
You can do a bunch of things through the GUI but why would you want to do that?
oc get packagemanifests -n openshift-marketplace
This will give you information on how to install the operator:
oc describe packagemanifests nfd -n openshift-marketplace
Example of how to create a subscription:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
name: nfd
namespace: openshift-operators
spec:
channel: "4.2"
installPlanApproval: Automatic
name: nfd
source: redhat-operators
sourceNamespace: openshift-marketplace
Don't forget to create the custom resource (CR):
apiVersion: nfd.openshift.io/v1alpha1
kind: NodeFeatureDiscovery
metadata:
name: nfd-master-server
namespace: openshift-operators
spec:
namespace: openshift-nfd
oc get csv | grep mesh
Find out what resources an operator provides/listens to.
oc get csv servicemeshoperator.v1.0.2 -o json | jq '.spec.customresourcedefinitions.owned[].kind'
oc get csv servicemeshoperator.v1.0.2 -o json | jq '.spec.customresourcedefinitions.required[].kind'
https://openshift.tips/machine-config/
For every machineconfig you update, all the nodes in the pool will be restarted. To prevent this, perform:
oc patch --type=merge --patch='{"spec":{"paused":true}}' machineconfigpool/master
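And when you are done, don't forget to flip it back so the pool can roll out again:
oc patch --type=merge --patch='{"spec":{"paused":false}}' machineconfigpool/master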
Who needs it? No one ever runs stateful stuff or non-GitOps workloads in k8s, right? :)
An easy way to set up backups of etcd:
https://github.com/sushilsuresh/ocp4-ansible-roles/tree/master/roles/etcd-backup
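On OCP 4.4+ the masters also ship a backup script; a sketch of running it via oc debug (the backup path is the one the docs use, adjust as needed):
oc debug node/<master-node> -- chroot /host /usr/local/bin/cluster-backup.sh /home/core/assets/backup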
Sadly this isn't enough if you want to back up the data in your PVCs. Enter Velero.
I have written velero.md on how to use it.
To make sure that your image registry doesn't fill up, you should delete old images saved in the internal registry.
oc get imagepruner cluster -o yaml
The automatic feature was enabled in OCP 4.4
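If you are on an older version, or just want to prune by hand, the manual command looks something like this (flags are the usual suspects; it needs access to the registry route and a token with enough rights):
oc adm prune images --keep-tag-revisions=3 --keep-younger-than=60m --confirm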
Due to some bugs in OCP 4.2 you couldn't always go from OCP 4.2 to the next minor release. The patch/upgrade path is described here.
There is a simple graphical tool that you can use, but you need to download a few dependencies first.
sudo dnf install graphviz
wget https://raw.githubusercontent.com/openshift/cincinnati/master/hack/graph.sh
chmod 755 graph.sh
curl -sH 'Accept:application/json' 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-4.4&arch=amd64' | ./graph.sh | dot -Tsvg > graph.svg
When upgrading OCP 4.3 (up until 4.3.13) using the UI or CLI you can get funny errors about SCC issues. In general, check the clusterversion and you will see the error message.
oc get clusterversion -o yaml
In there I found the following reason:
message: 'Precondition "ClusterVersionUpgradeable" failed because of "DefaultSecurityContextConstraints_Mutated": Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid]'
reason: UpgradePreconditionCheckFailed
status: "True"
type: Failing
So how do I know what has changed in anyuid compared to the default?
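One way I would approach it (a sketch; the pristine copy has to come from a fresh cluster of the same version, nothing magical here):
oc get scc anyuid -o yaml > anyuid-current.yaml
# Grab anyuid from a freshly installed cluster of the same version and diff
diff anyuid-default.yaml anyuid-current.yaml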
This doesn't have anything to do with OCP, but this is currently my favorite file for writing down nice-to-have stuff.
ERRO[0000] Error refreshing container f7db993e6fa423475035277f88cc09f0154dee13b257914719c18c8e62639002: error acquiring lock 0 for container f7db993e6fa423475035277f88cc09f0154dee13b257914719c18c8e62639002: file exists
rm -rf ~/.local/share/containers/storage/overlay-containers/<container-id>/userdata/
Good to have commands
Couldn't find a way to get jq to escape the . in tls.key, so use jsonpath with the key quoted:
k get secret secret-name -o jsonpath={.data."tls.crt"} |base64 -d > tls.crt
k get secret secret-name -o jsonpath={.data."tls.key"} |base64 -d > tls.key