Simulate AWS Faults

This document describes how to use Chaos Mesh to simulate AWS faults.

AWSChaos introduction

AWSChaos can help you simulate fault scenarios on the specified AWS instance. Currently, AWSChaos supports the following fault types:

  • EC2 Stop: stops the specified EC2 instance.
  • EC2 Restart: restarts the specified EC2 instance.
  • Detach Volume: detaches the storage volume from the specified EC2 instance.

Secret file

To easily connect to the AWS cluster, you can create a Kubernetes Secret file to store the authentication information in advance.

A Secret file sample is as follows:

  apiVersion: v1
  kind: Secret
  metadata:
    name: cloud-key-secret
    namespace: chaos-testing
  type: Opaque
  stringData:
    aws_access_key_id: your-aws-access-key-id
    aws_secret_access_key: your-aws-secret-access-key

  • name is the name of the Kubernetes Secret object.
  • namespace is the namespace of the Kubernetes Secret object.
  • aws_access_key_id stores the ID of the access key to the AWS cluster.
  • aws_secret_access_key stores the secret access key to the AWS cluster.
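
The Secret sample uses stringData, which accepts plain-text values; Kubernetes stores the same values base64-encoded under data. The following minimal Python sketch illustrates that relationship. It is not part of Chaos Mesh, and the helper names build_secret and to_data_form are hypothetical. It prints JSON, which kubectl apply -f also accepts, so only the standard library is needed.

```python
import base64
import json


def build_secret(name, namespace, access_key_id, secret_access_key):
    """Return a Secret manifest equivalent to the YAML sample, as a dict."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "stringData": {
            "aws_access_key_id": access_key_id,
            "aws_secret_access_key": secret_access_key,
        },
    }


def to_data_form(secret):
    """Convert plain-text stringData into the base64-encoded data form."""
    converted = {k: v for k, v in secret.items() if k != "stringData"}
    converted["data"] = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in secret["stringData"].items()
    }
    return converted


if __name__ == "__main__":
    secret = build_secret(
        "cloud-key-secret", "chaos-testing",
        "your-aws-access-key-id", "your-aws-secret-access-key",
    )
    print(json.dumps(secret, indent=2))
```

Either form should be accepted by the cluster; the stringData form is easier to read and edit by hand.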

Create experiments using Chaos Dashboard

Note:

Before you create an experiment using Chaos Dashboard, make sure the following requirements are met:

  1. Chaos Dashboard is installed.

  2. Chaos Dashboard can be accessed via kubectl port-forward:

    kubectl port-forward -n chaos-testing svc/chaos-dashboard 2333:2333

    Then you can access the dashboard via http://localhost:2333 in your browser.

After these requirements are met, create the experiment as follows:

  1. Open Chaos Dashboard, and click NEW EXPERIMENT on the page to create a new experiment.

  2. In the Choose a Target area, choose AWS FAULT and select a specific behavior, such as STOP EC2.

  3. Fill out the experiment information, and specify the experiment scope and the scheduled experiment duration.

  4. Submit the experiment information.

Create experiments using the YAML file

An ec2-stop configuration example

  1. Write the experiment configuration to the awschaos-ec2-stop.yaml file, as shown below:

    apiVersion: chaos-mesh.org/v1alpha1
    kind: AWSChaos
    metadata:
      name: ec2-stop-example
      namespace: chaos-testing
    spec:
      action: ec2-stop
      secretName: 'cloud-key-secret'
      awsRegion: 'us-east-2'
      ec2Instance: 'your-ec2-instance-id'
      duration: '5m'

    Based on this configuration example, Chaos Mesh injects the ec2-stop fault into the specified EC2 instance, so that the instance is unavailable for 5 minutes.

    For more information about stopping EC2 instances, refer to the AWS documentation - Stop and start your instance.

  2. After the configuration file is prepared, use kubectl to create an experiment:

    kubectl apply -f awschaos-ec2-stop.yaml

An ec2-restart configuration example

  1. Write the experiment configuration to the awschaos-ec2-restart.yaml file:

    apiVersion: chaos-mesh.org/v1alpha1
    kind: AWSChaos
    metadata:
      name: ec2-restart-example
      namespace: chaos-testing
    spec:
      action: ec2-restart
      secretName: 'cloud-key-secret'
      awsRegion: 'us-east-2'
      ec2Instance: 'your-ec2-instance-id'

    Based on this configuration example, Chaos Mesh injects the ec2-restart fault into the specified EC2 instance, so that the instance is restarted.

    For more information about restarting the EC2 instance, refer to the AWS documentation - Reboot your instance.

  2. After the configuration file is prepared, use kubectl to create an experiment:

    kubectl apply -f awschaos-ec2-restart.yaml

A detach-volume configuration example

  1. Write the experiment configuration to the awschaos-detach-volume.yaml file:

    apiVersion: chaos-mesh.org/v1alpha1
    kind: AWSChaos
    metadata:
      name: ec2-detach-volume-example
      namespace: chaos-testing
    spec:
      action: detach-volume
      secretName: 'cloud-key-secret'
      awsRegion: 'us-east-2'
      ec2Instance: 'your-ec2-instance-id'
      volumeID: 'your-volume-id'
      deviceName: '/dev/sdf'
      duration: '5m'

    Based on this configuration example, Chaos Mesh injects the detach-volume fault into the specified EC2 instance, so that the specified storage volume is detached from the instance for 5 minutes.

    For more information about detaching Amazon EBS volumes, refer to the AWS documentation - Detach an Amazon EBS volume from a Linux instance.

  2. After the configuration file is prepared, use kubectl to create an experiment:

    kubectl apply -f awschaos-detach-volume.yaml
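
The three manifests above share most of their fields, so they can also be generated from one helper. The following is a minimal Python sketch under stated assumptions, not part of Chaos Mesh: the function name build_awschaos and its parameter names are hypothetical, and the defaults simply repeat the placeholder values used in the examples. It emits JSON, which kubectl apply -f also accepts.

```python
import json


def build_awschaos(name, action, ec2_instance, aws_region="us-east-2",
                   secret_name="cloud-key-secret", namespace="chaos-testing",
                   duration=None, volume_id=None, device_name=None):
    """Build an AWSChaos manifest as a dict (hypothetical helper)."""
    spec = {
        "action": action,
        "secretName": secret_name,
        "awsRegion": aws_region,
        "ec2Instance": ec2_instance,
    }
    # Optional fields are added only when a value is supplied.
    if volume_id is not None:
        spec["volumeID"] = volume_id
    if device_name is not None:
        spec["deviceName"] = device_name
    if duration is not None:
        spec["duration"] = duration
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "AWSChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


if __name__ == "__main__":
    manifest = build_awschaos(
        "ec2-detach-volume-example", "detach-volume", "your-ec2-instance-id",
        duration="5m", volume_id="your-volume-id", device_name="/dev/sdf",
    )
    print(json.dumps(manifest, indent=2))
```

Redirecting the output to a file and passing that file to kubectl apply -f should create the same experiment as the hand-written YAML.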

Field description

The following table shows the fields in the YAML configuration file.

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| action | string | Indicates the specific type of fault. Only ec2-stop, ec2-restart, and detach-volume are supported. | ec2-stop | Yes | ec2-stop |
| mode | string | Specifies the mode of the experiment. The mode options include one (selecting a random Pod), all (selecting all eligible Pods), fixed (selecting a specified number of eligible Pods), fixed-percent (selecting a specified percentage of Pods from the eligible Pods), and random-max-percent (selecting the maximum percentage of Pods from the eligible Pods). | None | Yes | one |
| value | string | Provides parameters for the mode configuration, depending on mode. For example, when mode is set to fixed-percent, value specifies the percentage of Pods. | None | No | 1 |
| secretName | string | Specifies the name of the Kubernetes Secret that stores the AWS authentication information. | None | No | cloud-key-secret |
| awsRegion | string | Specifies the AWS region. | None | Yes | us-east-2 |
| ec2Instance | string | Specifies the ID of the EC2 instance. | None | Yes | your-ec2-instance-id |
| volumeID | string | Required when the action is detach-volume. Specifies the EBS volume ID. | None | No | your-volume-id |
| deviceName | string | Required when the action is detach-volume. Specifies the device name. | None | No | /dev/sdf |
| duration | string | Specifies the duration of the experiment. | None | Yes | 30s |
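
The conditional requirements in the table (volumeID and deviceName are needed only for detach-volume) are easy to miss, so a small pre-flight check can help before running kubectl apply. The following Python sketch is an illustration only: validate_spec and SUPPORTED_ACTIONS are hypothetical names, the action names are taken from the configuration examples, and the check covers only a subset of the fields. Chaos Mesh performs its own validation independently of this.

```python
SUPPORTED_ACTIONS = {"ec2-stop", "ec2-restart", "detach-volume"}


def validate_spec(spec):
    """Return a list of problems found in an AWSChaos spec dict."""
    problems = []
    action = spec.get("action")
    if action not in SUPPORTED_ACTIONS:
        problems.append(f"unsupported action: {action!r}")
    # Fields that every example sets (duration is not checked here).
    for field in ("awsRegion", "ec2Instance"):
        if not spec.get(field):
            problems.append(f"missing required field: {field}")
    # Fields required only for the detach-volume action.
    if action == "detach-volume":
        for field in ("volumeID", "deviceName"):
            if not spec.get(field):
                problems.append(f"{field} is required when action is detach-volume")
    return problems


if __name__ == "__main__":
    spec = {
        "action": "detach-volume",
        "awsRegion": "us-east-2",
        "ec2Instance": "your-ec2-instance-id",
        "volumeID": "your-volume-id",
    }
    for problem in validate_spec(spec):
        print(problem)
```

An empty list means that none of the checked rules were violated; it does not guarantee that the experiment will be accepted by the cluster.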