IOChaos Experiment
This document walks you through the IOChaos experiment.
IOChaos allows you to simulate file system faults such as IO delay and read/write errors. It can inject delays and faults when your program makes IO system calls such as open, read, and write.
Configuration file
Below is a sample YAML file of IOChaos:
apiVersion: chaos-mesh.org/v1alpha1
kind: IoChaos
metadata:
  name: io-delay-example
spec:
  action: latency
  mode: one
  selector:
    labelSelectors:
      app: etcd
  volumePath: /var/run/etcd
  path: '/var/run/etcd/**/*'
  delay: '100ms'
  percent: 50
  duration: '400s'
  scheduler:
    cron: '@every 10m'
For more sample files, see examples. You can edit them as needed.
Field | Description | Sample Value |
---|---|---|
mode | Defines the mode of the selector. | one / all / fixed / fixed-percent / random-max-percent |
selector | Specifies the pods to be injected with IO chaos. | |
action | Represents the IOChaos action. Refer to IOChaos available actions for more details. | latency / fault / attrOverride |
volumePath | The mount path of the target volume. | “/var/run/etcd” |
delay | Specifies the latency of the fault injection. The duration might be a string with a signed sequence of decimal numbers, each with an optional fraction and a unit suffix. Valid time units are “ns”, “us” (or “µs”), “ms”, “s”, “m”, and “h”. | “300ms” / “2h45m” |
errno | Defines the error code returned by an IO action. See common Linux system errors for more Linux system error codes. | 2 |
attr | Defines the attribute to be overridden and the corresponding value | examples |
percent | Defines the probability of injecting errors in percentage. | 100 (by default) |
path | Defines the files into which IOChaos actions are injected. It is a glob pattern that matches the files you want to inject faults or delays into, and it should point inside the volumePath directory. | “/var/run/etcd/**/*” |
methods | Defines the IO methods into which IOChaos actions are injected. It is represented as an array of strings. See Available methods for more details. | open / read |
duration | Represents the duration of a chaos action. The duration might be a string with the signed sequence of decimal numbers, each with an optional fraction and a unit suffix. | “300ms” / “2h45m” |
scheduler | Defines the scheduler rules for the running time of the chaos experiment. | see robfig/cron |
Usage
Assuming that you are using examples/io-mixed-example.yaml, you can run the following command to create a chaos experiment:
kubectl apply -f examples/io-mixed-example.yaml
IOChaos available actions
IOChaos currently supports the following actions:
- latency: IO latency action. You can specify the latency before the IO operation returns a result.
- fault: IO fault action. In this mode, IO operations return an error.
- attrOverride: Override attributes of a file.
latency
If you are using the latency action, you can edit the specification as below:
spec:
  action: latency
  delay: '1ms'
It will inject a latency of 1ms into the selected methods.
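As a sketch of how this fragment might fit into a complete manifest, the example below combines the latency action with the selector, volumePath, path, methods, and percent fields from the table above. The experiment name io-latency-read-write is made up for illustration, and the selector, duration, and scheduler values are simply copied from the sample file rather than being recommendations.

```yaml
# Hypothetical manifest: delay about half of the read and write calls
# on files under the etcd volume by 1ms.
apiVersion: chaos-mesh.org/v1alpha1
kind: IoChaos
metadata:
  name: io-latency-read-write   # illustrative name, not a file from examples
spec:
  action: latency
  mode: one                     # inject into one pod matched by the selector
  selector:
    labelSelectors:
      app: etcd
  volumePath: /var/run/etcd
  path: '/var/run/etcd/**/*'    # glob pattern inside volumePath
  delay: '1ms'
  methods:                      # IO methods that receive the delay
    - read
    - write
  percent: 50                   # probability of injection, in percent
  duration: '400s'
  scheduler:
    cron: '@every 10m'
```

Listing methods explicitly keeps the delay away from metadata operations such as getattr or lookup, which can make the effect of the experiment easier to reason about.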
fault
If you are using the fault action, you can edit the specification as below:
spec:
  action: fault
  errno: 32
The selected methods return error 32, which means broken pipe.
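For orientation, a complete fault experiment might look like the sketch below. It reuses the etcd selector and volume from the sample file and restricts the fault to write calls. The name io-fault-example and the choice of write and a 50 percent probability are assumptions made for illustration.

```yaml
# Hypothetical manifest: make about half of the write calls on files
# under the etcd volume fail with errno 32 (broken pipe).
apiVersion: chaos-mesh.org/v1alpha1
kind: IoChaos
metadata:
  name: io-fault-example        # illustrative name
spec:
  action: fault
  mode: one
  selector:
    labelSelectors:
      app: etcd
  volumePath: /var/run/etcd
  path: '/var/run/etcd/**/*'
  errno: 32                     # error code returned to the caller
  methods:
    - write
  percent: 50                   # probability of injection, in percent
  duration: '400s'
  scheduler:
    cron: '@every 10m'
```

Replacing 32 with another value from the list of common Linux system errors, for example 28 for "No space left on device", changes which failure the application observes.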
attrOverride
If you are using the attrOverride action, you can edit the specification as below:
spec:
  action: attrOverride
  attr:
    perm: 72
The permission of the selected files is then overridden with 72 in decimal, which is 110 in octal. This means that the files cannot be read or modified (without CAP_DAC_OVERRIDE). See Available attributes for a list of all attributes that can be overridden.
Note:
The Linux kernel may cache file attributes, so the override might have no effect on a file that your program has already accessed.
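The sketch below places the attrOverride fragment in a complete manifest and spells out the decimal-to-octal arithmetic in comments. The name io-attr-example is hypothetical, and the selector, volume, duration, and scheduler values are again taken from the sample file.

```yaml
# Hypothetical manifest: report execute-only permissions for files
# under the etcd volume.
apiVersion: chaos-mesh.org/v1alpha1
kind: IoChaos
metadata:
  name: io-attr-example         # illustrative name
spec:
  action: attrOverride
  mode: one
  selector:
    labelSelectors:
      app: etcd
  volumePath: /var/run/etcd
  path: '/var/run/etcd/**/*'
  attr:
    # perm is written in decimal: 72 = 1*64 + 1*8 + 0 = 110 in octal,
    # that is, execute-only for owner and group and nothing for others.
    perm: 72
  duration: '400s'
  scheduler:
    cron: '@every 10m'
```

To choose a different permission, convert the octal mode to decimal first. For instance, octal 444 (read-only for everyone) is 4*64 + 4*8 + 4 = 292.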
Common Linux system errors
Common Linux system errors are as below:
- 1: Operation not permitted
- 2: No such file or directory
- 5: I/O error
- 6: No such device or address
- 12: Out of memory
- 16: Device or resource busy
- 17: File exists
- 20: Not a directory
- 22: Invalid argument
- 24: Too many open files
- 28: No space left on device
Refer to related header files for more information.
Available methods
Available methods are as below:
- lookup
- forget
- getattr
- setattr
- readlink
- mknod
- mkdir
- unlink
- rmdir
- symlink
- rename
- link
- open
- read
- write
- flush
- release
- fsync
- opendir
- readdir
- releasedir
- fsyncdir
- statfs
- setxattr
- getxattr
- listxattr
- removexattr
- access
- create
- getlk
- setlk
- bmap
Available attributes
The available attributes and their meanings are listed here:
- ino, inode of a file
- size, total size, in bytes
- blocks, number of 512B blocks allocated
- atime, time of last access
- mtime, time of last modification
- ctime, time of last status change
- kind, file type. It can be namedPipe, charDevice, blockDevice, directory, regularFile, symlink or socket
- perm, permission of a file
- nlink, number of hard links
- uid, user id of owner
- gid, group id of owner
- rdev, device ID (if special file)