[Cherry Pick for 1.13.3] (#25)
* Fix: k8s module to create chaosengine event (#22)

* Fix: k8s module to create chaosengine event

Signed-off-by: udit <[email protected]>

* chore (experiment) Adding vmware vm-poweroff-experiment (#24)

* vm-delete-experiments-added

Signed-off-by: Ubuntu <[email protected]>

* Removing chaos Interval

Signed-off-by: Delphine Joyneer <[email protected]>

* Update README.md

* Update README.md

Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Delphine Joyneer <[email protected]>
Co-authored-by: Udit Gaurav <[email protected]>

Co-authored-by: iassurewipro <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Delphine Joyneer <[email protected]>
4 people authored Apr 15, 2021
1 parent 0cddede commit 621671a
Showing 12 changed files with 618 additions and 39 deletions.
10 changes: 3 additions & 7 deletions .travis.yml
@@ -2,19 +2,15 @@ dist: trusty
install: true
language: go
go:
- 1.11.2
- 1.14
env:
global:
- GOARCH=amd64
- TRIVYARCH="64bit"

before_install:
- sleep 15
- sudo apt-get install curl
- export VERSION=$(curl --silent "https://api.github.com/repos/aquasecurity/trivy/releases/latest" | grep '"tag_name":' | sed -E 's/.*"v([^"]+)".*/\1/')
- echo ${VERSION}
- wget https://github.com/aquasecurity/trivy/releases/download/v${VERSION}/trivy_${VERSION}_Linux-${TRIVYARCH}.tar.gz
- tar zxvf trivy_${VERSION}_Linux-${TRIVYARCH}.tar.gz
- wget https://github.com/aquasecurity/trivy/releases/download/v0.11.0/trivy_0.11.0_Linux-64bit.tar.gz
- tar zxvf trivy_0.11.0_Linux-64bit.tar.gz

script:
- make deps
11 changes: 8 additions & 3 deletions build/ansible-runner/Dockerfile
@@ -5,13 +5,18 @@ LABEL maintainer="LitmusChaos"
#Installing necessary ubuntu packages
RUN apt-get clean && \
apt-get update --fix-missing || true && \
apt-get install -y --no-install-recommends python3-minimal python3-pip netcat iproute2 jq sshpass \
apt-get install -y --no-install-recommends python3-minimal python3-pip netcat iproute2 jq sshpass vim \
curl openssh-client python3-setuptools && rm -rf /var/lib/apt/lists/*

RUN touch /mnt/parameters.yml


#Installing ansible and dependencies for k8s module
RUN pip3 install ansible==2.7.3 openshift jmespath boto boto3 ansible-lint
RUN pip3 install cryptography==3.3.1 ansible==2.10 openshift jmespath boto boto3
RUN pip3 install ansible-lint==4.3.7
RUN ansible-galaxy collection install community.kubernetes

RUN file="$(ansible-lint --version)" && echo $file
RUN file="$(pip3 --version)" && echo $file

#Installing gcloud
RUN curl -sSL https://sdk.cloud.google.com | bash
86 changes: 86 additions & 0 deletions chaoslib/vmware/inject_vm_failure_vm_delete.yml
@@ -0,0 +1,86 @@
---
- block:

    - block:

        - debug:
            msg: "***** Waiting for the ramp interval of {{ ramp_time }}s *****"

        - name: "[Ramp]: Waiting for the specified ramp time before injecting chaos"
          wait_for: timeout="{{ ramp_time }}"

      when: "ramp_time is defined and ramp_time != ''"

    - block:
        - name: "[Prepare]: Generate a run id if not passed from the engine/experiment"
          shell: echo $(mktemp) | cut -d '.' -f 2 | cut -c -6
          register: rand_string

        - set_fact:
            run_id: "{{ rand_string.stdout | lower }}"
      when: "run_id is not defined or run_id == ''"

    - name: Getting the serviceAccountName
      shell: >
        kubectl get pod {{ chaos_pod_name }} -n {{ c_ns }} -o
        custom-columns=:.spec.serviceAccountName --no-headers
      args:
        executable: /bin/bash
      register: chaos_service_account

    - name: "[Prepare]: Set the state of vm to stop"
      uri:
        url: https://{{ vcenter_server }}/rest/vcenter/vm/{{ vm_moid1 }}/power/stop
        force_basic_auth: yes
        validate_certs: no
        method: POST
        headers:
          Cookie: "{{ login.set_cookie }}"
      register: result1

    - debug:
        msg: "VM is powered-off"

    - name: "[TimeOut]: Timeout for 1 minute"
      pause:
        minutes: 1

    - name: "[Prepare]: Set the state of vm to start"
      uri:
        url: https://{{ vcenter_server }}/rest/vcenter/vm/{{ vm_moid1 }}/power/start
        force_basic_auth: yes
        validate_certs: no
        method: POST
        headers:
          Cookie: "{{ login.set_cookie }}"
      register: result1

    - name: "[Status]: Get the current status of vm"
      uri:
        url: https://{{ vcenter_server }}/rest/vcenter/vm/{{ vm_moid1 }}/power/
        force_basic_auth: yes
        validate_certs: no
        headers:
          Cookie: "{{ login.set_cookie }}"
      register: result
      until: "'{{ result.json.value.state }}' == 'POWERED_ON'"
      delay: 2
      retries: 5

    - debug:
        msg: "The state of vm is {{ result.json.value.state }}"

    - fail:
        msg: "vm-delete-chaos pod failed"
      when: "result.json.value.state == 'POWERED_OFF'"


    - block:
        - debug:
            msg: "***** Waiting for the ramp interval of {{ ramp_time }}s *****"

        - name: "[Ramp]: Waiting for the specified ramp time after injecting chaos"
          wait_for: timeout="{{ ramp_time }}"

      when: "ramp_time is defined and ramp_time != ''"

229 changes: 229 additions & 0 deletions experiments/vmware/vm-delete/README.md
@@ -0,0 +1,229 @@
## Experiment Metadata

<table>
<tr>
<th> Type </th>
<th> Description </th>
<th> Tested K8s Platform </th>
</tr>
<tr>
<td> VMware </td>
<td> Stopping a VM for a certain chaos duration</td>
<td> EKS </td>
</tr>
</table>


## Prerequisites

- Ensure that the Kubernetes version is > 1.13.
- Ensure that the Litmus Chaos Operator is running by executing `kubectl get pods` in the operator namespace (typically, `litmus`). If not, install from [here](https://docs.litmuschaos.io/docs/getstarted/#install-litmus).
- Ensure that the `vm-delete` experiment resource is available in the cluster by executing `kubectl get chaosexperiments` in the desired namespace. If not, install from [here](https://hub.litmuschaos.io/api/chaos/master?file=charts/vmware/vm-delete/experiment.yaml).
- Ensure that you have sufficient vCenter access to stop and start the VM.
- Create a Kubernetes secret containing the vCenter credentials in the `CHAOS_NAMESPACE`. A sample secret file looks like:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: vcenter-secret
  namespace: litmus
type: Opaque
stringData:
  VCENTERSERVER: XXXXXXXXXXX
  VCENTERUSER: XXXXXXXXXXXXX
  VCENTERPASS: XXXXXXXXXXXXX
```
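
The sample secret uses `stringData`, which Kubernetes accepts as plain text and encodes itself. If the same credentials were written under `data` instead, each value would have to be base64-encoded first. A minimal sketch of that conversion follows; the helper name `to_secret_data` and the sample values are hypothetical and not part of this experiment.

```python
import base64
from typing import Dict


def to_secret_data(string_data: Dict[str, str]) -> Dict[str, str]:
    """Base64-encode each value, as required under a Secret's `data` field.

    Hypothetical helper for illustration; `stringData` needs no encoding.
    """
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in string_data.items()
    }


# Placeholder credentials, not real values.
sample = {
    "VCENTERSERVER": "vcenter.example.local",
    "VCENTERUSER": "chaos-user",
    "VCENTERPASS": "not-a-real-password",
}
encoded = to_secret_data(sample)
for key, value in encoded.items():
    print(f"{key}: {value}")
```
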
## Entry-Criteria
- The VM is healthy before chaos injection.
## Exit-Criteria
- The VM is healthy after chaos injection.
## Details
- The experiment uses the VMware APIs to stop and start the VM.
- It stops a VM and brings it back to the running state after the specified chaos duration.
- It helps to check the performance of the application/process running on the VMware server.
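
The stop, wait, start, and verify sequence described above can be sketched as follows. This is an illustration under stated assumptions, not the experiment's implementation: the function `run_power_cycle` and the in-memory `FakeVM` are hypothetical, no vCenter call is made, and the sketch treats any final state other than `POWERED_ON` as a failure. The retry defaults (5 attempts, 2 s apart) mirror the chaoslib's status check.

```python
import time
from typing import Callable, List


def run_power_cycle(
    stop_vm: Callable[[], None],
    start_vm: Callable[[], None],
    get_state: Callable[[], str],
    chaos_duration_s: float,
    retries: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Stop the VM, keep it down for the chaos duration, start it again,
    then poll its power state until it is POWERED_ON or retries run out."""
    stop_vm()
    sleep(chaos_duration_s)
    start_vm()

    state = get_state()
    attempts = 1
    while state != "POWERED_ON" and attempts < retries:
        sleep(delay_s)
        state = get_state()
        attempts += 1

    if state != "POWERED_ON":
        raise RuntimeError(f"VM did not come back, last state: {state}")
    return state


class FakeVM:
    """In-memory stand-in for a vCenter-managed VM (illustration only)."""

    def __init__(self) -> None:
        self.state = "POWERED_ON"
        self.history: List[str] = []

    def stop(self) -> None:
        self.state = "POWERED_OFF"
        self.history.append("stop")

    def start(self) -> None:
        self.state = "POWERED_ON"
        self.history.append("start")

    def get_state(self) -> str:
        return self.state


vm = FakeVM()
# A no-op sleep keeps the demonstration instant.
final_state = run_power_cycle(
    vm.stop, vm.start, vm.get_state, chaos_duration_s=60, sleep=lambda _s: None
)
print(final_state, vm.history)
```

Passing the stop, start, and state functions in as callables keeps the sequencing logic testable without a live vCenter.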
## Integrations
- vm-delete can be effected using the chaos library `litmus`, which uses the VMware APIs to start/stop a VM in a VMware environment.
- The desired chaoslib can be selected by setting the env variable `LIB` to the option above.

## Steps to Execute the Chaos Experiment

- This Chaos Experiment can be triggered by creating a ChaosEngine resource on the cluster. To understand the values to provide in a ChaosEngine specification, refer to [Getting Started](getstarted.md/#prepare-chaosengine).

- Follow the steps in the sections below to create the chaosServiceAccount, prepare the ChaosEngine & execute the experiment.

### Prepare chaosServiceAccount

- Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

#### Sample RBAC Manifest

[embedmd]:# (https://raw.githubusercontent.com/litmuschaos/chaos-charts/master/charts/vmware/vm-delete/rbac.yaml yaml)
```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vm-delete-sa
  namespace: default
  labels:
    name: vm-delete-sa
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: vm-delete-sa
  namespace: default
  labels:
    name: vm-delete-sa
rules: [{'apiGroups': ['', 'batch', 'litmuschaos.io'], 'resources': ['jobs', 'pods', 'deployments','pods/log', 'events', 'chaosengines', 'chaosexperiments', 'chaosresults', 'secrets'], 'verbs': ['create', 'list', 'get', 'update', 'patch', 'delete','deletecollection']}]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: vm-delete-sa
  namespace: default
  labels:
    name: vm-delete-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: vm-delete-sa
subjects:
- kind: ServiceAccount
  name: vm-delete-sa
  namespace: default
```

### Prepare ChaosEngine

- Provide the application info in `spec.appinfo`
- Provide the auxiliary applications info (ns & labels) in `spec.auxiliaryAppInfo`
- Override the experiment tunables if desired in `experiments.spec.components.env`
- To understand the values to provide in a ChaosEngine specification, refer to [ChaosEngine Concepts](chaosengine-concepts.md)

#### Supported Experiment Tunables

<table>
<tr>
<th> Variables </th>
<th> Description </th>
<th> Specify In ChaosEngine </th>
<th> Notes </th>
</tr>
<tr>
<td> APP_VM_MOID </td>
<td> MOID of the VMware VM instance </td>
<td> Required </td>
<td> </td>
</tr>
<tr>
<td> TOTAL_CHAOS_DURATION </td>
<td> The time duration for chaos insertion (sec) </td>
<td> Optional </td>
<td> Defaults to 30s </td>
</tr>
<tr>
<td> VCENTERSERVER </td>
<td> IP address of the vCenter server </td>
<td> Required </td>
<td> Should be specified in the created secret </td>
</tr>
<tr>
<td> VCENTERUSER </td>
<td> Username for vCenter </td>
<td> Required </td>
<td> Should be specified in the created secret; the user must have access to start/stop the VM </td>
</tr>
<tr>
<td> VCENTERPASS </td>
<td> Password for vCenter </td>
<td> Required </td>
<td> Should be specified in the created secret </td>
</tr>
</table>
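
As a rough sketch of how the tunables in the table behave, the snippet below reads them from an environment-style mapping, rejects missing required values, and applies the documented 30-second default for `TOTAL_CHAOS_DURATION`. The function `load_tunables` is hypothetical and only illustrates the required/optional split; the experiment itself may read these values differently.

```python
import os
from typing import Dict, Mapping, Union

# Required tunables as listed in the table above.
REQUIRED = ("APP_VM_MOID", "VCENTERSERVER", "VCENTERUSER", "VCENTERPASS")


def load_tunables(env: Mapping[str, str]) -> Dict[str, Union[str, int]]:
    """Validate required tunables and apply the documented default.

    Hypothetical helper for illustration only.
    """
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required tunables: " + ", ".join(missing))

    tunables: Dict[str, Union[str, int]] = {name: env[name] for name in REQUIRED}
    # Optional: falls back to 30 seconds when unset or empty.
    tunables["TOTAL_CHAOS_DURATION"] = int(env.get("TOTAL_CHAOS_DURATION") or "30")
    return tunables


if __name__ == "__main__":
    try:
        print(load_tunables(os.environ))
    except ValueError as err:
        print(f"configuration error: {err}")
```
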

#### Sample ChaosEngine Manifest

[embedmd]:# (https://raw.githubusercontent.com/litmuschaos/chaos-charts/master/charts/vmware/vm-delete/engine.yaml yaml)
```yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: nginx-chaos
  namespace: default
spec:
  # It can be true/false
  annotationCheck: 'false'
  # It can be active/stop
  engineState: 'active'
  #ex. values: ns1:name=percona,ns2:run=nginx
  auxiliaryAppInfo: ''
  chaosServiceAccount: vm-delete-sa
  monitoring: false
  # It can be delete/retain
  jobCleanUpPolicy: 'retain'
  experiments:
    - name: vm-delete
      spec:
        components:
          env:
            # set chaos duration (in sec) as desired
            - name: TOTAL_CHAOS_DURATION
              value: '60'
            # provide the MOID of the target VM
            - name: APP_VM_MOID
              value: ''
            - name: VCENTERSERVER
              valueFrom:
                secretKeyRef:
                  name: vcenter-secret
                  key: VCENTERSERVER
            - name: VCENTERUSER
              valueFrom:
                secretKeyRef:
                  name: vcenter-secret
                  key: VCENTERUSER
            - name: VCENTERPASS
              valueFrom:
                secretKeyRef:
                  name: vcenter-secret
                  key: VCENTERPASS
```

### Create the ChaosEngine Resource

- Apply the ChaosEngine manifest prepared in the previous step to trigger the chaos.

`kubectl apply -f chaosengine.yml`

- If the chaos experiment is not executed, refer to the [troubleshooting](https://docs.litmuschaos.io/docs/faq-troubleshooting/)
section to identify the root cause and fix the issue.

### Watch Chaos progress

- You can monitor the vCenter console to watch the VM state.

### Check Chaos Experiment Result

- Check whether the application is resilient to the vm-delete chaos once the experiment (job) is completed. The ChaosResult resource name is derived as `<ChaosEngine-Name>-<ChaosExperiment-Name>`.

`kubectl describe chaosresult nginx-chaos-vm-delete -n <application-namespace>`
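
The naming rule can be expressed as a small helper. The sketch below only builds the resource name and the matching `kubectl describe` argument list; the function names are hypothetical and nothing is executed against a cluster.

```python
from typing import List


def chaosresult_name(engine_name: str, experiment_name: str) -> str:
    """Join the engine and experiment names with a hyphen, per the rule above."""
    return f"{engine_name}-{experiment_name}"


def describe_command(engine_name: str, experiment_name: str, namespace: str) -> List[str]:
    """Build (but do not run) the kubectl command that describes the ChaosResult."""
    return [
        "kubectl", "describe", "chaosresult",
        chaosresult_name(engine_name, experiment_name),
        "-n", namespace,
    ]


# Engine and experiment names taken from the sample ChaosEngine manifest.
print(" ".join(describe_command("nginx-chaos", "vm-delete", "default")))
```
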

### vm-delete Experiment Demo

- A sample recording of this experiment execution will be added soon.
3 changes: 3 additions & 0 deletions experiments/vmware/vm-delete/chaosutil.j2
@@ -0,0 +1,3 @@
{% if c_lib is defined and c_lib == 'litmus' %}
c_util: "/chaoslib/vmware/inject_vm_failure_vm_delete.yml"
{% endif %}
23 changes: 23 additions & 0 deletions experiments/vmware/vm-delete/vm-delete.chartserviceversion.yaml
@@ -0,0 +1,23 @@
apiVersion: litmuchaos.io/v1alpha1
kind: ChartServiceVersion
metadata:
  name: vm-delete
  version: 0.1.0
  annotations:
    categories: vmware
    repository: https://github.com/litmuschaos/litmus-ansible/tree/master/vmware/vm-delete
    support: https://kubernetes.slack.com/messages/CNXNB0ZTN
spec:
  displayName: vm-delete
  categoryDescription: >
    stops a vm for a certain chaos duration
  keywords: ['pods', 'kubernetes', 'vmware','vm-delete', 'nginx']
  maturity: alpha
  minKubeVersion: 1.12.0
  provider: {'name': 'Wipro'}
  maintainers: ['[email protected]']
  contributors: ['[email protected]']
  links: ['https://docs.litmuschaos.io/docs/getstarted/']
  icon:
    - url:
      mediatype: ""