Create Chaos Mesh Workflow

Introduction to Chaos Mesh Workflow

When you use Chaos Mesh to simulate real system faults, you often need continuous validation. In that case, you might want to build a series of faults on the Chaos Mesh platform instead of performing individual chaos injections.

To meet this need, Chaos Mesh provides Chaos Mesh Workflow, a built-in workflow engine. Using this engine, you can run different Chaos experiments in serial or in parallel to simulate production-level errors.

Currently, Chaos Mesh Workflow supports the following features:

  • Serial Orchestration
  • Parallel Orchestration
  • Customized tasks
  • Conditional branch

Typical user scenarios:

  • Use parallel orchestration to inject multiple NetworkChaos faults to simulate complex web environments.
  • Use serial orchestration to perform health checks and use the conditional branch to determine whether to perform the remaining steps.

The design of Chaos Mesh Workflow is, to some extent, inspired by Argo Workflows. If you are familiar with Argo Workflows, you can also quickly get started with Chaos Mesh Workflow.

More workflow examples are available in the Chaos Mesh GitHub repository.

Create a workflow using Chaos Dashboard

Step 1. Open Chaos Dashboard

Click NEW WORKFLOW.

New Workflow

Step 2. Set up the basic information of the workflow

Workflow Info

Step 3. Configure the nodes of the workflow

  1. Select an option under Choose task type according to your needs.

    In this example, the “Single” type is selected as the task type.

    Note:

    Chaos Dashboard automatically creates a serial node named “entry” as the entry point for this workflow.

    Choose Task Type

  2. Fill out the experiment information.

    The configuration method is the same as for creating a normal chaos experiment. For example, you can set up a “PodChaos” experiment of the “POD KILL” type named kill-nginx.

    Create podkill in Workflow

Step 4. Submit the workflow

You can check the workflow definition through Preview, and then click SUBMIT WORKFLOW to create the workflow.

Submit Workflow

Create a workflow using a YAML file and kubectl

Similar to the various types of Chaos objects, workflows also exist in a Kubernetes cluster as custom resources defined by a CRD. You can create a Chaos Mesh workflow using kubectl create -f <workflow.yaml>. The following commands are examples of creating a workflow.

Create a workflow using a local YAML file:

    kubectl create -f <workflow.yaml>

Create a workflow using a YAML file from the network:

    kubectl create -f https://raw.githubusercontent.com/chaos-mesh/chaos-mesh/master/examples/workflow/serial.yaml

A simple workflow YAML file is defined as follows. In this workflow, StressChaos, NetworkChaos, and PodChaos are injected:

    apiVersion: chaos-mesh.org/v1alpha1
    kind: Workflow
    metadata:
      name: try-workflow-parallel
    spec:
      entry: the-entry
      templates:
        - name: the-entry
          templateType: Parallel
          deadline: 240s
          children:
            - workflow-stress-chaos
            - workflow-network-chaos
            - workflow-pod-chaos-schedule
        - name: workflow-network-chaos
          templateType: NetworkChaos
          deadline: 20s
          networkChaos:
            direction: to
            action: delay
            mode: all
            selector:
              labelSelectors:
                'app': 'hello-kubernetes'
            delay:
              latency: '90ms'
              correlation: '25'
              jitter: '90ms'
        - name: workflow-pod-chaos-schedule
          templateType: Schedule
          deadline: 40s
          schedule:
            schedule: '@every 2s'
            type: 'PodChaos'
            podChaos:
              action: pod-kill
              mode: one
              selector:
                labelSelectors:
                  'app': 'hello-kubernetes'
        - name: workflow-stress-chaos
          templateType: StressChaos
          deadline: 20s
          stressChaos:
            mode: one
            selector:
              labelSelectors:
                'app': 'hello-kubernetes'
            stressors:
              cpu:
                workers: 1
                load: 20
                options: ['--cpu 1', '--timeout 600']

In the above YAML template, the templates field defines the steps of the experiment. The entry field names the template from which execution of the workflow starts.

Each element in templates represents a workflow step. For example:

    name: the-entry
    templateType: Parallel
    deadline: 240s
    children:
      - workflow-stress-chaos
      - workflow-network-chaos
      - workflow-pod-chaos

templateType: Parallel means that the node type is parallel. deadline: 240s means that all parallel experiments on this node are expected to finish within 240 seconds; otherwise, the experiments time out. children lists the names of the other templates to be executed in parallel.

As another example, a template that runs a single PodChaos experiment is defined as follows:

    name: workflow-pod-chaos
    templateType: PodChaos
    deadline: 40s
    podChaos:
      action: pod-kill
      mode: one
      selector:
        labelSelectors:
          'app': 'hello-kubernetes'

templateType: PodChaos means that the node is a PodChaos experiment. deadline: 40s means that the current Chaos experiment lasts for 40 seconds. podChaos is the definition of the PodChaos experiment.

Creating a workflow using a YAML file and kubectl is flexible. You can nest parallel or serial orchestrations to declare complex orchestrations, and even combine the orchestration with conditional branches to achieve a loop effect.
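As a rough sketch of nesting, the workflow below uses a Serial entry whose second child is a Parallel node. The workflow name and all template names are hypothetical and chosen only for illustration; the chaos definitions reuse fields from the example earlier in this section. The Suspend node is assumed to simply wait until its deadline passes.

```yaml
apiVersion: chaos-mesh.org/v1alpha1
kind: Workflow
metadata:
  name: example-nested-workflow # hypothetical name
spec:
  entry: serial-entry
  templates:
    # Serial node: children run one after another, in the listed order.
    - name: serial-entry
      templateType: Serial
      deadline: 240s
      children:
        - pause-before-faults
        - parallel-faults
    # Suspend node: assumed to wait for its deadline before the next child starts.
    - name: pause-before-faults
      templateType: Suspend
      deadline: 10s
    # Parallel node nested inside the serial node.
    - name: parallel-faults
      templateType: Parallel
      deadline: 60s
      children:
        - delay-network
        - stress-cpu
    - name: delay-network
      templateType: NetworkChaos
      deadline: 20s
      networkChaos:
        direction: to
        action: delay
        mode: all
        selector:
          labelSelectors:
            'app': 'hello-kubernetes'
        delay:
          latency: '90ms'
    - name: stress-cpu
      templateType: StressChaos
      deadline: 20s
      stressChaos:
        mode: one
        selector:
          labelSelectors:
            'app': 'hello-kubernetes'
        stressors:
          cpu:
            workers: 1
            load: 20
```

Every name listed under children must match the name of another entry in templates, which is what allows one orchestration node to contain another.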

Field description

Workflow field description

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| entry | string | Declares the entry of the workflow. Its value is the name of a template. | None | Yes | |
| templates | []Template | Declares the behavior of each step executable in the workflow. See Template field description for details. | None | Yes | |

Template field description

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| name | string | The name of the template, which needs to meet the DNS-1123 requirements. | None | Yes | any-name |
| templateType | string | Type of template. Value options are Task, Serial, Parallel, Suspend, Schedule, AWSChaos, DNSChaos, GCPChaos, HTTPChaos, IOChaos, JVMChaos, KernelChaos, NetworkChaos, PodChaos, StressChaos, TimeChaos, and StatusCheck. | None | Yes | PodChaos |
| deadline | string | The duration of the template. | None | No | '5m30s' |
| children | []string | Declares the subtasks under this template. You need to configure this field when the type is Serial or Parallel. | None | No | ["any-chaos-1", "another-serial-2", "any-schedule"] |
| task | Task | Configures the customized task. You need to configure this field when the type is Task. See the Task field description for details. | None | No | |
| conditionalBranches | []ConditionalBranch | Configures the conditional branches that execute after the customized task. You need to configure this field when the type is Task. See the Conditional branch field description for details. | None | No | |
| awsChaos | object | Configures AWSChaos. You need to configure this field when the type is AWSChaos. See the Simulate AWS Faults document for details. | None | No | |
| dnsChaos | object | Configures DNSChaos. You need to configure this field when the type is DNSChaos. See the Simulate DNS Faults document for details. | None | No | |
| gcpChaos | object | Configures GCPChaos. You need to configure this field when the type is GCPChaos. See the Simulate GCP Faults document for details. | None | No | |
| httpChaos | object | Configures HTTPChaos. You need to configure this field when the type is HTTPChaos. See the Simulate HTTP Faults document for details. | None | No | |
| ioChaos | object | Configures IOChaos. You need to configure this field when the type is IOChaos. See the Simulate File I/O Faults document for details. | None | No | |
| jvmChaos | object | Configures JVMChaos. You need to configure this field when the type is JVMChaos. See the Simulate JVM Application Faults document for details. | None | No | |
| kernelChaos | object | Configures KernelChaos. You need to configure this field when the type is KernelChaos. See the Simulate Kernel Faults document for details. | None | No | |
| networkChaos | object | Configures NetworkChaos. You need to configure this field when the type is NetworkChaos. See the Simulate Network Faults document for details. | None | No | |
| podChaos | object | Configures PodChaos. You need to configure this field when the type is PodChaos. See the Simulate Pod Faults document for details. | None | No | |
| stressChaos | object | Configures StressChaos. You need to configure this field when the type is StressChaos. See the Simulate Heavy Stress on Kubernetes document for details. | None | No | |
| timeChaos | object | Configures TimeChaos. You need to configure this field when the type is TimeChaos. See the Simulate Time Faults document for details. | None | No | |
| schedule | object | Configures Schedule. You need to configure this field when the type is Schedule. See the Define Scheduling Rules document for details. | None | No | |
| statusCheck | object | Configures StatusCheck. You need to configure this field when the type is StatusCheck. See the StatusCheck in Workflow document for details. | None | No | |
| abortWithStatusCheck | bool | Configures whether to abort the workflow when the StatusCheck fails. You can configure this field when the type is StatusCheck. | false | No | true |

Note:

When you create a Chaos experiment with a duration in a workflow, set the duration in the outer deadline field of the template instead of using the duration field inside the Chaos definition.
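To illustrate the note, here is a minimal sketch of a single entry in templates. The template name delay-for-30s is hypothetical; the NetworkChaos fields are taken from the example earlier in this document.

```yaml
# Fragment of spec.templates; the template name is hypothetical.
- name: delay-for-30s
  templateType: NetworkChaos
  deadline: 30s # the experiment's duration is set here, on the template
  networkChaos:
    # Note: no `duration` field inside the chaos definition.
    direction: to
    action: delay
    mode: all
    selector:
      labelSelectors:
        'app': 'hello-kubernetes'
    delay:
      latency: '90ms'
```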

Task field description

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| container | object | Defines a customized task container. See Container field description for details. | None | No | |
| volumes | array | If you need to mount a volume in a customized task container, you need to declare the volume in this field. For the detailed definition of a volume, see the Kubernetes documentation - corev1.Volume. | None | No | |

Conditional branch field description

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| target | string | The name of the template to be executed by the current conditional branch. | None | Yes | another-chaos |
| expression | string | A boolean expression. When a customized task is completed and the expression evaluates to true, the current conditional branch is executed. When this value is not set, the conditional branch is executed directly after the customized task is completed. | None | No | exitCode == 0 |

Currently, two context variables are provided in expression:

  • exitCode means the exit code for a customized task.
  • stdout indicates the standard output for a customized task.

More context variables will be added in later releases.

Refer to this document to learn how to write expressions.
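The fields above might be combined as in the following sketch, which runs a customized task and then branches on its exit code. The template names health-check, continue-faults, and pause-workflow are hypothetical, and the two target templates are assumed to be defined elsewhere in the same templates list. The container values are the examples from the Container field description.

```yaml
# Fragment of spec.templates; all template names are hypothetical.
- name: health-check
  templateType: Task
  task:
    container:
      name: task
      image: busybox:latest
      command: ["wget", "-q", "http://httpbin.org/status/201"]
  conditionalBranches:
    # Runs when the task container exits with code 0.
    - target: continue-faults
      expression: 'exitCode == 0'
    # Runs when the task container exits with any other code.
    - target: pause-workflow
      expression: 'exitCode != 0'
```

Because each branch is evaluated on its own, keeping the two expressions mutually exclusive means only one target runs for a given result.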

Container field description

The following table lists only the commonly used fields. For the definitions of more fields, see Kubernetes documentation - corev1.Container.

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| name | string | Container name | None | Yes | task |
| image | string | Image name | None | Yes | busybox:latest |
| command | []string | Container commands | None | No | ["wget", "-q", "http://httpbin.org/status/201"] |
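As a sketch of how the container fields and the volumes field of a task might fit together, the fragment below mounts a ConfigMap into the task container. The template name, the mount path, the file name, and the ConfigMap task-config are all hypothetical; volumeMounts is a standard corev1.Container field that is not listed in the table above.

```yaml
# Fragment of spec.templates; names, paths, and the ConfigMap are hypothetical.
- name: read-config-task
  templateType: Task
  task:
    container:
      name: task
      image: busybox:latest
      command: ["cat", "/etc/task-config/targets.txt"]
      volumeMounts: # corev1.Container field, not listed in the table above
        - name: task-config
          mountPath: /etc/task-config
    volumes: # corev1.Volume entries, referenced by name from volumeMounts
      - name: task-config
        configMap:
          name: task-config # assumed to exist in the workflow's namespace
```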