Chaos Engineering with Litmus

Litmus is a tool for practicing chaos engineering in a Kubernetes-native way. It provides chaos-specific CRDs that cloud-native developers and SREs can use to inject, orchestrate, and monitor chaos in order to find weaknesses in Kubernetes deployments.

This section describes an experiment that uses a Litmus chart to inject chaos into an OpenEBS cStor volume and the application that uses it. Injecting chaos in this way lets a user validate the resiliency of the application. Several OpenEBS experiments are available in the chart hub for checking application resiliency. More details can be found here.

Installation & Setup

Install Litmus

Install Litmus by running the following command:

  kubectl apply -f https://litmuschaos.github.io/pages/litmus-operator-v1.1.0.yaml

Verify if the chaos operator is running using the following command:

  kubectl get pods -n litmus

The following is a sample output:

chaos-operator-ce-554d6c8f9f-slc8k 1/1 Running 0 6m41s

Verify if chaos CRDs are installed using the following command:

  kubectl get crds | grep chaos

chaosengines.litmuschaos.io 2019-10-02T08:45:25Z
chaosexperiments.litmuschaos.io 2019-10-02T08:45:26Z
chaosresults.litmuschaos.io 2019-10-02T08:45:26Z
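
The same check can be scripted. The following is a minimal sketch, assuming kubectl is on the PATH and pointed at the intended cluster. The function name check_chaos_crds is hypothetical and not part of Litmus. It looks up each of the three CRD names listed above and reports any that are missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Litmus): report whether the three
# chaos CRDs listed above are present in the cluster.
check_chaos_crds() {
  local missing=0 crd
  for crd in chaosengines chaosexperiments chaosresults; do
    # "kubectl get crd <name>" exits non-zero when the CRD does not exist.
    if kubectl get crd "${crd}.litmuschaos.io" >/dev/null 2>&1; then
      echo "found: ${crd}.litmuschaos.io"
    else
      echo "missing: ${crd}.litmuschaos.io"
      missing=1
    fi
  done
  # Non-zero return status if at least one CRD is missing.
  return "$missing"
}
```

Calling check_chaos_crds after installing the operator returns a non-zero status if any CRD is absent, which makes it usable as a gate in a setup script.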

Install OpenEBS chaos experiments Custom Resources

To create the OpenEBS chaos experiment CRs, execute the following command. The URL is quoted so that the shell does not interpret the ? character:

  kubectl create -f "https://hub.litmuschaos.io/api/chaos?file=charts/openebs/experiments.yaml" -n <application_namespace>

Verify if the chaos experiments are installed using the following command:

  kubectl get chaosexperiments -n <application_namespace>

The output will be similar to the following:

NAME                               AGE
openebs-pool-container-failure     1h
openebs-pool-pod-failure           1h
openebs-target-container-failure   1h
openebs-target-network-delay       1h
openebs-target-network-loss        1h
openebs-target-pod-failure         1h

cStor Volume Chaos Experiments

cStor Target Container Failure

Setup Service Account

Create a Service Account that allows the chaos engine to run experiments in the application namespace. Copy the following YAML spec into rbac-chaos.yaml. You can change the service account name and namespace as needed.

  ---
  apiVersion: v1
  kind: ServiceAccount
  metadata:
    name: mysql-chaos
    # app namespace
    namespace: default
    labels:
      app: mysql
  ---
  kind: ClusterRole
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: mysql-chaos
  rules:
  - apiGroups: ["", "extensions", "apps", "batch", "litmuschaos.io", "openebs.io", "storage.k8s.io"]
    resources: ["daemonsets", "deployments", "replicasets", "jobs", "pods", "pods/exec", "nodes", "events", "chaosengines", "chaosexperiments", "chaosresults", "storageclasses", "persistentvolumes", "persistentvolumeclaims", "services", "cstorvolumereplicas", "configmaps"]
    verbs: ["*"]
  ---
  kind: ClusterRoleBinding
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: mysql-chaos
  subjects:
  - kind: ServiceAccount
    name: mysql-chaos
    namespace: default
  roleRef:
    kind: ClusterRole
    name: mysql-chaos
    apiGroup: rbac.authorization.k8s.io

Run the following command to create the service account in the chosen namespace. In this example, the namespace is default.

  kubectl apply -f rbac-chaos.yaml

Annotate your application

Your application has to be annotated with litmuschaos.io/chaos="true". As a security measure, Chaos Operator checks for this annotation on the application before invoking chaos experiment(s) on the application.

  kubectl annotate deploy/<deployment_name> -n <application_namespace> litmuschaos.io/chaos="true"

Example command:

  kubectl annotate deploy/mysql -n default litmuschaos.io/chaos="true"

NOTE: To get the deployment name, run kubectl get deploy -n <application_namespace>
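
To confirm that the annotation is in place before running an experiment, the value can be read back with a JSONPath query. This is a minimal sketch, assuming kubectl is configured for the cluster. The function name is_chaos_enabled is hypothetical. The dot in the annotation key is escaped with a backslash so that JSONPath treats litmuschaos.io/chaos as a single key.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the deployment carries the
# annotation litmuschaos.io/chaos="true".
is_chaos_enabled() {
  local deploy="$1" ns="$2" value
  # The backslash escapes the dot inside the annotation key.
  value="$(kubectl get deploy "$deploy" -n "$ns" \
    -o jsonpath='{.metadata.annotations.litmuschaos\.io/chaos}')"
  [ "$value" = "true" ]
}
```

For the example above, is_chaos_enabled mysql default returns success once the annotation has been applied.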

Prepare and Run ChaosEngine

The ChaosEngine connects the application to the chaos experiment. Prepare the chaos engine template to inject a container failure into the OpenEBS cStor target pod.

Copy the following YAML spec into chaosengine.yaml.

  apiVersion: litmuschaos.io/v1alpha1
  kind: ChaosEngine
  metadata:
    name: target-chaos
    namespace: default
  spec:
    appinfo:
      # App namespace
      appns: default
      applabel: 'app=mysql'
      appkind: deployment
    chaosServiceAccount: mysql-chaos
    monitoring: false
    jobCleanUpPolicy: delete
    experiments:
      - name: openebs-target-container-failure
        spec:
          components:
            - name: TARGET_CONTAINER
              value: 'cstor-istgt'
            - name: APP_PVC
              value: 'mysql-claim'
            - name: DEPLOY_TYPE
              value: deployment

Update the following parameters in the above chaos engine template with the details of the PVC whose target container is to be killed.

  • spec.appinfo.appns :- Namespace where the application is deployed.
  • spec.appinfo.applabel :- Any one of the labels of the application pod. Run kubectl get pod <application_pod_name> --show-labels to get the labels.
  • spec.appinfo.appkind :- Type of application, such as Deployment or StatefulSet.
  • spec.chaosServiceAccount :- Name of the Service Account created in the Setup Service Account section.
  • spec.experiments.spec.components :- Set the value of APP_PVC to the application PVC name and the value of DEPLOY_TYPE to the type of application, such as Deployment or StatefulSet.
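
When the template is reused for several applications, the parameters listed above can be filled in by a small script instead of by hand. The following is a sketch only: the function name render_chaosengine is hypothetical, the engine name target-chaos and the container name cstor-istgt are copied from the example template, and the YAML nesting is an assumption based on the field paths in the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a ChaosEngine manifest with the parameters
# described above filled in. The nesting is assumed from the field paths
# (spec.appinfo.*, spec.chaosServiceAccount, spec.experiments.spec.components).
render_chaosengine() {
  local ns="$1" label="$2" kind="$3" sa="$4" pvc="$5"
  cat <<EOF
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: target-chaos
  namespace: ${ns}
spec:
  appinfo:
    appns: ${ns}
    applabel: '${label}'
    appkind: ${kind}
  chaosServiceAccount: ${sa}
  monitoring: false
  jobCleanUpPolicy: delete
  experiments:
    - name: openebs-target-container-failure
      spec:
        components:
          - name: TARGET_CONTAINER
            value: 'cstor-istgt'
          - name: APP_PVC
            value: '${pvc}'
          - name: DEPLOY_TYPE
            value: ${kind}
EOF
}
```

For the MySQL example, render_chaosengine default app=mysql deployment mysql-chaos mysql-claim > chaosengine.yaml writes a manifest with the same values as the template shown earlier.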

After updating the above details in the chaos engine template, run the following command to start the openebs-target-container-failure chaos experiment.

  kubectl create -f chaosengine.yaml

NOTE: It is recommended to create Application, ChaosEngine, ChaosExperiment and Service Account in the same namespace for smooth execution of experiments.

A chaos experiment job is launched to carry out the intended chaos. The job may take some time to start. Check whether the job has completed by executing the following command:

  kubectl get jobs -n <application_namespace> | grep <experiment_name>

Run the following command to check the status of the pod created by the above job:

  kubectl get pods -n <application_namespace> | grep <experiment_name>

Observe Chaos results

Run the following command to get the name of chaos experiment result:

  kubectl get chaosresult -n <application_namespace>

Example output:

  NAME                                            AGE
  target-chaos-openebs-target-container-failure   5m

The chaosresult name has the format <chaosengine_name>-<chaos_experiment_name>.
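
Because the name is derived from two known values, a script can compute it rather than look it up. A minimal sketch follows. The function name chaosresult_name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join the chaos engine name and the experiment name
# with a hyphen, following the naming format described above.
chaosresult_name() {
  printf '%s-%s\n' "$1" "$2"
}

chaosresult_name target-chaos openebs-target-container-failure
# prints: target-chaos-openebs-target-container-failure
```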

After the chaos experiment job completes, verify whether the application deployment is resilient to the momentary loss of the storage target by describing the chaosresult with the following command.

  kubectl describe chaosresult <chaos_result_name> -n <application_namespace>

Example command:

  kubectl describe chaosresult target-chaos-openebs-target-container-failure -n default

The spec.verdict is set to Running when the experiment is in progress, eventually changing to either Pass or Fail. A Pass means the application is resilient against the injected failures. A Fail means the application could not sustain injected failures.
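
In automation it can be convenient to wait for the verdict instead of describing the result repeatedly. The sketch below polls the chaosresult until the verdict is no longer Running. It rests on assumptions: the function name wait_for_verdict is hypothetical, and the field path .spec.verdict is taken from the description above and may differ between Litmus versions, so it is kept in an overridable variable.

```shell
#!/usr/bin/env bash
# Assumption: the verdict is readable at .spec.verdict, as described above.
# Override VERDICT_FIELD if the Litmus version in use stores it elsewhere.
VERDICT_FIELD="${VERDICT_FIELD:-.spec.verdict}"

# Hypothetical helper: poll a chaosresult until it reports Pass or Fail.
# Arguments: <chaosresult_name> <namespace> [interval_seconds] [max_attempts]
# Returns 0 on Pass, 1 on Fail, 2 if no final verdict appeared in time.
wait_for_verdict() {
  local name="$1" ns="$2" interval="${3:-10}" attempts="${4:-60}" verdict i=0
  while [ "$i" -lt "$attempts" ]; do
    verdict="$(kubectl get chaosresult "$name" -n "$ns" \
      -o "jsonpath={${VERDICT_FIELD}}" 2>/dev/null)"
    case "$verdict" in
      Pass|pass) echo "Pass"; return 0 ;;
      Fail|fail) echo "Fail"; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "Timed out waiting for a verdict" >&2
  return 2
}
```

For the example above, wait_for_verdict target-chaos-openebs-target-container-failure default checks every 10 seconds for up to 60 attempts.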

You can confirm the resiliency of the cStor volume by checking that the target pod is healthy and running. To do so, run the following command:

  kubectl get pod -n openebs | grep <PV_name>
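
The same check can be turned into a pass/fail test for scripts. This is a minimal sketch, assuming the default kubectl get pod column layout (NAME, READY, STATUS, RESTARTS, AGE) and that the target pod name contains the PV name, as the grep above implies. The function name target_pod_running is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if at least one pod in the openebs
# namespace has the PV name in its pod name and every such pod is Running.
target_pod_running() {
  local pv="$1"
  kubectl get pod -n openebs --no-headers | awk -v pv="$pv" '
    # $1 is the pod name and $3 is the STATUS column.
    index($1, pv) { seen = 1; if ($3 != "Running") bad = 1 }
    END { exit (seen && !bad) ? 0 : 1 }'
}
```

A call such as target_pod_running <PV_name> returns a non-zero status when no matching pod is found or when a matching pod is in any state other than Running.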
