End-To-End Testing Framework

Introduction

Cilium uses Ginkgo as a testing framework for writing end-to-end tests that exercise Cilium all the way from the API level (e.g., importing policies, CLI) to the datapath (i.e., whether an imported policy is enforced accordingly in the datapath). The tests in the test directory are built on top of Ginkgo. Ginkgo provides a rich framework for developing tests alongside the benefits of Golang (compile-time checks, types, etc.). To get accustomed to the basics of Ginkgo, we recommend reading the Ginkgo Getting-Started Guide, as well as running example tests to get a feel for the Ginkgo workflow.

These test scripts will invoke vagrant to create virtual machine(s) to run the tests. The tests make heavy use of the Ginkgo focus concept to determine which VMs are necessary to run particular tests. All test names must begin with one of the following prefixes:

  • Runtime: Tests Cilium in a runtime environment running on a single node.
  • K8s: Creates a small multi-node Kubernetes environment for testing features beyond a single host, and for testing Kubernetes-specific features.
  • Nightly: Sets up a multi-node Kubernetes cluster to run scale, performance, and chaos testing for Cilium.

Running End-To-End Tests

Running All Ginkgo Tests

Running all of the Ginkgo tests may take an hour or longer. To run all the ginkgo tests, invoke the make command as follows from the root of the cilium repository:

    $ sudo make -C test/ test

The first time this is invoked, the test suite will pull the testing VMs and provision Cilium into them. This may take several minutes, depending on your internet connection speed. Subsequent runs of the tests will reuse the image.

Running Runtime Tests

To run all of the runtime tests, execute the following command from the test directory:

    ginkgo --focus="Runtime" --tags=integration_tests

Ginkgo searches all subdirectories for tests whose names begin with the string “Runtime”, followed by any characters. For instance, here is an example showing which tests will be run, using Ginkgo’s dryRun option:

    $ ginkgo --focus="Runtime" -dryRun --tags=integration_tests
    Running Suite: runtime
    ======================
    Random Seed: 1516125117
    Will run 42 of 164 specs
    ................
    RuntimePolicyEnforcement Policy Enforcement Always
      Always to Never with policy
      /Users/ianvernon/go/src/github.com/cilium/cilium/test/runtime/Policies.go:258
    ------------------------------
    RuntimePolicyEnforcement Policy Enforcement Always
      Always to Never without policy
      /Users/ianvernon/go/src/github.com/cilium/cilium/test/runtime/Policies.go:293
    ------------------------------
    RuntimePolicyEnforcement Policy Enforcement Never
      Container creation
      /Users/ianvernon/go/src/github.com/cilium/cilium/test/runtime/Policies.go:332
    ------------------------------
    RuntimePolicyEnforcement Policy Enforcement Never
      Never to default with policy
      /Users/ianvernon/go/src/github.com/cilium/cilium/test/runtime/Policies.go:349
    .................
    Ran 42 of 164 Specs in 0.002 seconds
    SUCCESS! -- 0 Passed | 0 Failed | 0 Pending | 122 Skipped PASS
    Ginkgo ran 1 suite in 1.830262168s
    Test Suite Passed

The output has been truncated. For more information about this functionality, consult the aforementioned Ginkgo documentation.

Running Kubernetes Tests

To run all of the Kubernetes tests, run the following command from the test directory:

    ginkgo --focus="K8s" --tags=integration_tests

To run a specific test from the Kubernetes tests suite, run the following command from the test directory:

    ginkgo --focus="K8s.*Check iptables masquerading with random-fully" --tags=integration_tests

Similar to the Runtime test suite, Ginkgo searches all subdirectories for tests whose names begin with the string “K8s”, followed by any characters.

The Kubernetes tests support the following Kubernetes versions:

  • 1.16
  • 1.17
  • 1.18
  • 1.19
  • 1.20
  • 1.21
  • 1.22
  • 1.23

By default, the Vagrant VMs are provisioned with Kubernetes 1.23. To run with any other supported version of Kubernetes, run the test suite with the following format:

    K8S_VERSION=<version> ginkgo --focus="K8s" --tags=integration_tests

Note

When provisioning VMs with the net-next kernel (NETNEXT=1) on a VirtualBox version that does not match the version of the VirtualBox Guest Additions in the VM image, Vagrant will install a new version of the Additions with mount.vboxsf. The latter is not compatible with the vboxsf.ko shipped within the VM image, and thus syncing of shared folders will not work.

To avoid this, one can prevent Vagrant from installing the Additions by putting the following into $HOME/.vagrant.d/Vagrantfile:

    Vagrant.configure('2') do |config|
      if Vagrant.has_plugin?("vagrant-vbguest") then
        config.vbguest.auto_update = false
      end
      config.vm.provider :virtualbox do |vbox|
        vbox.check_guest_additions = false
      end
    end

Running Nightly Tests

To run all of the Nightly tests, run the following command from the test directory:

    ginkgo --focus="Nightly" --tags=integration_tests

Similar to the other test suites, Ginkgo searches all subdirectories for tests whose names begin with the string “Nightly”, followed by any characters. The Nightly tests run against Kubernetes 1.8 by default, but this can be changed using the K8S_VERSION environment variable.

Available CLI Options

For more advanced workflows, check the list of available custom options for the Cilium framework in the test/ directory and interact with ginkgo directly:

    $ cd test/
    $ ginkgo . -- -cilium.help
      -cilium.SSHConfig string
            Specify a custom command to fetch SSH configuration (eg: 'vagrant ssh-config')
      -cilium.benchmarks
            Specifies benchmark tests should be run which may increase test time
      -cilium.help
            Display this help message.
      -cilium.holdEnvironment
            On failure, hold the environment in its current state
      -cilium.hubble-relay-image string
            Specifies which image of hubble-relay to use during tests
      -cilium.hubble-relay-tag string
            Specifies which tag of hubble-relay to use during tests
      -cilium.image string
            Specifies which image of cilium to use during tests
      -cilium.kubeconfig string
            Kubeconfig to be used for k8s tests
      -cilium.multinode
            Enable tests across multiple nodes. If disabled, such tests may silently pass (default true)
      -cilium.operator-image string
            Specifies which image of cilium-operator to use during tests
      -cilium.operator-tag string
            Specifies which tag of cilium-operator to use during tests
      -cilium.passCLIEnvironment
            Pass the environment invoking ginkgo, including PATH, to subcommands
      -cilium.provision
            Provision Vagrant boxes and Cilium before running test (default true)
      -cilium.provision-k8s
            Specifies whether Kubernetes should be deployed and installed via kubeadm or not (default true)
      -cilium.runQuarantined
            Run tests that are under quarantine.
      -cilium.showCommands
            Output which commands are ran to stdout
      -cilium.skipLogs
            skip gathering logs if a test fails
      -cilium.tag string
            Specifies which tag of cilium to use during tests
      -cilium.testScope string
            Specifies scope of test to be ran (k8s, Nightly, runtime)
      -cilium.timeout duration
            Specifies timeout for test run (default 24h0m0s)
    Ginkgo ran 1 suite in 4.312100241s
    Test Suite Failed

For more information about other built-in options to Ginkgo, consult the Ginkgo documentation.

Running Specific Tests Within a Test Suite

If you want to run one specific test, there are a few options:

  • By modifying code: add the prefix “FIt” on the test you want to run; this marks the test as focused. Ginkgo will skip other tests and will only run the “focused” test. For more information, consult the Focused Specs documentation from Ginkgo.

      It("Example test", func(){
          Expect(true).Should(BeTrue())
      })

      FIt("Example focused test", func(){
          Expect(true).Should(BeTrue())
      })

  • From the command line: specify a more granular focus if you want to focus on, say, Runtime L7 tests:

      ginkgo --focus "Runtime.*L7" --tags=integration_tests

This will focus on tests that contain “Runtime”, followed by any number of any characters, followed by “L7”. The --focus value is a regular expression; quotes are required if it contains spaces, and to prevent the shell from expanding *.
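To make the regular-expression behaviour concrete, here is a minimal Go sketch that filters a list of test names with a focus pattern. It only illustrates the matching idea and is not how Ginkgo is implemented; the selectSpecs helper and the sample test names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// selectSpecs returns the names matched by the focus regular expression.
// This is a simplified stand-in for the selection a --focus pattern performs.
func selectSpecs(focus string, names []string) ([]string, error) {
	re, err := regexp.Compile(focus)
	if err != nil {
		return nil, err
	}
	var selected []string
	for _, n := range names {
		if re.MatchString(n) {
			selected = append(selected, n)
		}
	}
	return selected, nil
}

func main() {
	// Hypothetical test names, for illustration only.
	names := []string{
		"RuntimePolicyEnforcement Policy Enforcement Always",
		"RuntimeExample L7 policy",
		"K8sExample L7 policy",
	}
	selected, err := selectSpecs("Runtime.*L7", names)
	if err != nil {
		fmt.Println("invalid focus expression:", err)
		return
	}
	for _, n := range selected {
		fmt.Println(n)
	}
}
```

With these sample names, only the second entry is selected: the first contains “Runtime” but no “L7”, and the third contains “L7” but no “Runtime”.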

Compiling the tests without running them

To validate that the Go code you’ve written for testing is correct without needing to run the full test, you can build the test directory:

    make -C test/ build

Test Reports

The Cilium Ginkgo framework generates JUnit reports for each test. The following files are currently generated, depending on the test suite that is run:

  • runtime.xml
  • K8s.xml

Best Practices for Writing Tests

  • Provide informative output to console during a test using the By construct. This helps with debugging and gives those who did not write the test a good idea of what is going on. The lower the barrier of entry is for understanding tests, the better our tests will be!
  • Leave the testing environment in the same state that it was in when the test started by deleting resources, resetting configuration, etc.
  • Gather logs in the case that a test fails. If a test fails while running on Jenkins, a postmortem needs to be done to analyze why. So, dumping logs to a location where Jenkins can pick them up is of the highest importance. Use the following code in an AfterFailed method:

      AfterFailed(func() {
          vm.ReportFailed()
      })

Ginkgo Extensions

In Cilium, some Ginkgo features are extended to cover use cases that are useful for testing Cilium.

BeforeAll

This function will run before all BeforeEach functions within a Describe or Context. It is equivalent to the SetUp or initialize functions in common unit test frameworks.

AfterAll

This method will run after all AfterEach functions defined in a Describe or Context. It is used for tearing down objects that were created for use by all Its within the given Context or Describe. It runs after all Its have run, and is equivalent to the tearDown or finalize methods in common unit test frameworks.

A good use case for the AfterAll method is removing containers or pods that are needed by multiple Its in the given Context or Describe.

JustAfterEach

This method will run just after each test and before AfterFailed and AfterEach. The main purpose of this method is to perform some assertions for a group of tests. A good example of a global JustAfterEach function is deadlock detection, which checks the Cilium logs for deadlocks that may have occurred during the tests.

AfterFailed

This method will run before all AfterEach functions and after JustAfterEach. It is only called when the test has failed. This construct is used to gather logs, the status of Cilium, etc., which provide data for analysis when tests fail.

Example Test Layout

Here is an example layout of how a test may be written with the aforementioned constructs:

Test description diagram:

    Describe
        BeforeAll(A)
        AfterAll(A)
        AfterFailed(A)
        AfterEach(A)
        JustAfterEach(A)
        TESTA1
        TESTA2
        TESTA3
        Context
            BeforeAll(B)
            AfterAll(B)
            AfterFailed(B)
            AfterEach(B)
            JustAfterEach(B)
            TESTB1
            TESTB2
            TESTB3

Test execution flow:

    Describe
        BeforeAll
        TESTA1; JustAfterEach(A), AfterFailed(A), AfterEach(A)
        TESTA2; JustAfterEach(A), AfterFailed(A), AfterEach(A)
        TESTA3; JustAfterEach(A), AfterFailed(A), AfterEach(A)
        Context
            BeforeAll(B)
            TESTB1:
                JustAfterEach(B); JustAfterEach(A)
                AfterFailed(B); AfterFailed(A);
                AfterEach(B) ; AfterEach(A);
            TESTB2:
                JustAfterEach(B); JustAfterEach(A)
                AfterFailed(B); AfterFailed(A);
                AfterEach(B) ; AfterEach(A);
            TESTB3:
                JustAfterEach(B); JustAfterEach(A)
                AfterFailed(B); AfterFailed(A);
                AfterEach(B) ; AfterEach(A);
            AfterAll(B)
        AfterAll(A)
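
The per-test ordering in the flow above can be modelled with a small Go sketch. This is a simulation that uses only the standard library, not the Cilium or Ginkgo implementation; the scope type and the afterHooks function are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// scope models a Describe or Context block; parent is nil for the outermost one.
type scope struct {
	name   string
	parent *scope
}

// afterHooks returns the order in which the per-test hooks run for a test
// defined in s, following the flow shown above: each kind of hook runs for
// the innermost scope first, and AfterFailed is included only on failure.
func afterHooks(s *scope, failed bool) []string {
	kinds := []string{"JustAfterEach"}
	if failed {
		kinds = append(kinds, "AfterFailed")
	}
	kinds = append(kinds, "AfterEach")

	var order []string
	for _, kind := range kinds {
		for cur := s; cur != nil; cur = cur.parent {
			order = append(order, fmt.Sprintf("%s(%s)", kind, cur.name))
		}
	}
	return order
}

func main() {
	a := &scope{name: "A"}            // the Describe block
	b := &scope{name: "B", parent: a} // the nested Context block

	fmt.Println("failing test in B:", afterHooks(b, true))
	fmt.Println("passing test in A:", afterHooks(a, false))
}
```

For a failing test in the nested Context, the sketch yields JustAfterEach(B), JustAfterEach(A), AfterFailed(B), AfterFailed(A), AfterEach(B), AfterEach(A), which matches the TESTB entries in the flow.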

Debugging:

You can retrieve all run commands and their output in the report directory (./test/test_results). Each test creates a new folder containing a log file where all information is saved; when a test fails, more exhaustive data is added.

    $ head test/test_results/RuntimeKafkaKafkaPolicyIngress/logs
    level=info msg=Starting testName=RuntimeKafka
    level=info msg="Vagrant: running command \"vagrant ssh-config runtime\""
    cmd: "sudo cilium status" exitCode: 0
    KVStore: Ok Consul: 172.17.0.3:8300
    ContainerRuntime: Ok
    Kubernetes: Disabled
    Kubernetes APIs: [""]
    Cilium: Ok OK
    NodeMonitor: Disabled
    Allocated IPv4 addresses:

Running with delve

Delve is a debugging tool for Go applications. To run your test with Delve, add a breakpoint in the code using runtime.Breakpoint() and run Ginkgo using dlv.

Example of how to run Ginkgo using dlv:

    dlv test . --build-flags="-tags=integration_tests" -- --ginkgo.focus="Runtime" -ginkgo.v=true --cilium.provision=false

Running End-To-End Tests In Other Environments via kubeconfig

The end-to-end tests can be run with an arbitrary kubeconfig file. Normally the CI uses the Kubernetes cluster created via Vagrant, but this can be overridden with --cilium.kubeconfig. When this option is used, Ginkgo will neither start a VM nor compile Cilium. It will also skip some setup tasks, such as labeling nodes for testing.

This mode expects:

  • The current directory is cilium/test

  • A test focus with --focus. --focus="K8s" selects all kubernetes tests.

  • Cilium images as full URLs specified with the --cilium.image and --cilium.operator-image options.

  • A working kubeconfig with the --cilium.kubeconfig option

  • A populated K8S_VERSION environment variable set to the version of the cluster

  • If appropriate, the CNI_INTEGRATION environment variable set to one of gke, eks, eks-chaining, microk8s or minikube. This selects matching configuration overrides for Cilium. Leaving this unset for non-matching integrations is also correct.

    For k8s environments that invoke an authentication agent, such as EKS and aws-iam-authenticator, set --cilium.passCLIEnvironment=true

An example invocation is:

    CNI_INTEGRATION=eks K8S_VERSION=1.16 ginkgo --focus="K8s" --tags=integration_tests -- -cilium.provision=false -cilium.kubeconfig=`echo ~/.kube/config` -cilium.image="quay.io/cilium/cilium-ci" -cilium.operator-image="quay.io/cilium/operator" -cilium.operator-suffix="-ci" -cilium.passCLIEnvironment=true
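
Before launching a long run in this mode, it may be worth checking the environment variables listed above. The Go sketch below is one possible pre-flight check, not part of the Cilium test framework: the checkEnv helper is hypothetical, and treating an unrecognized CNI_INTEGRATION value as a problem is an assumption made here for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// Integration names taken from the list of expectations above.
var knownIntegrations = []string{"gke", "eks", "eks-chaining", "microk8s", "minikube"}

// checkEnv reports problems with the environment variables that the
// kubeconfig mode expects. getenv is injected so the check is easy to test.
func checkEnv(getenv func(string) string) []string {
	var problems []string
	if getenv("K8S_VERSION") == "" {
		problems = append(problems, "K8S_VERSION is not set")
	}
	// An unset CNI_INTEGRATION is fine; only a non-empty unknown value is flagged.
	if v := getenv("CNI_INTEGRATION"); v != "" {
		known := false
		for _, k := range knownIntegrations {
			if v == k {
				known = true
				break
			}
		}
		if !known {
			problems = append(problems,
				fmt.Sprintf("CNI_INTEGRATION=%q is not one of %v", v, knownIntegrations))
		}
	}
	return problems
}

func main() {
	problems := checkEnv(os.Getenv)
	if len(problems) == 0 {
		fmt.Println("environment looks ready")
		return
	}
	for _, p := range problems {
		fmt.Println("warning:", p)
	}
}
```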

Running in GKE

1. Set up a cluster as in Quick Installation or utilize an existing cluster.

Note

You do not need to deploy Cilium in this step, as the End-To-End Testing Framework handles the deployment of Cilium.

Note

The tests require machines at least as large as n1-standard-4. This can be set with --machine-type n1-standard-4 on cluster creation.

2. Invoke the tests from cilium/test with options set as explained in Running End-To-End Tests In Other Environments via kubeconfig.

Note

The tests require the NATIVE_CIDR environment variable to be set to the value of the cluster IPv4 CIDR returned by the gcloud container clusters describe command.

    export CLUSTER_NAME=cluster1
    export CLUSTER_ZONE=us-west2-a
    export NATIVE_CIDR="$(gcloud container clusters describe $CLUSTER_NAME --zone $CLUSTER_ZONE --format 'value(clusterIpv4Cidr)')"
    CNI_INTEGRATION=gke K8S_VERSION=1.17 ginkgo --focus="K8sDemo" --tags=integration_tests -- -cilium.provision=false -cilium.kubeconfig=`echo ~/.kube/config` -cilium.image="quay.io/cilium/cilium-ci" -cilium.operator-image="quay.io/cilium/operator" -cilium.operator-suffix="-ci" -cilium.hubble-relay-image="quay.io/cilium/hubble-relay-ci" -cilium.passCLIEnvironment=true

Note

The Kubernetes version defaults to 1.23 but can be set to any version between 1.16 and 1.23. The version should match the server version reported by kubectl version.

AWS EKS (experimental)

Not all tests can succeed on EKS. Many do, however, and may be useful. GitHub issue 9678#issuecomment-749350425 contains a list of tests that are still failing.

  1. Set up a cluster as in Quick Installation or utilize an existing cluster.
  2. Source the testing integration script from cilium/contrib/testing/integrations.sh.
  3. Invoke the gks function by passing which cilium docker image to run and the test focus. The command also accepts additional ginkgo arguments.

       gks quay.io/cilium/cilium:latest K8sDemo

Adding new Managed Kubernetes providers

All Managed Kubernetes test support relies on using a pre-configured kubeconfig file. This isn’t always adequate, however, and adding defaults specific to each provider is possible. The commit adding GKE support is a good reference.

  1. Add a map of helm settings to act as an override for this provider in test/helpers/kubectl.go. These should be the helm settings used when generating cilium specs for this provider.
  2. Add a unique CI Integration constant. This value is passed in when invoking ginkgo via the CNI_INTEGRATION environment variable.
  3. Update the helm overrides mapping with the constant and the helm settings.
  4. For cases where a test should be skipped, use SkipIfIntegration. To skip whole contexts, use SkipContextIf. More complex logic can be expressed with functions like IsIntegration. These functions are all part of the test/helpers package.

Running End-To-End Tests In Other Environments via SSH

If you want to run tests in an arbitrary environment with SSH access, you can use --cilium.SSHConfig to provide the SSH configuration of the endpoint on which tests will be run. The tests presume the following on the remote instance:

  • Cilium source code is located in the directory /home/vagrant/go/src/github.com/cilium/cilium/.
  • Cilium is installed and running.

The SSH connection needs to be defined in an ssh-config file and needs to have the following targets:

  • runtime: To run runtime tests
  • k8s{1..2}-${K8S_VERSION}: to run Kubernetes tests. These instances must have Kubernetes installed and running as a prerequisite for running tests.
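
The k8s{1..2}-${K8S_VERSION} notation above expands to one host name per Kubernetes node. As a rough illustration, this Go sketch prints the host names an ssh-config would need for a given version; sshTargets is a hypothetical helper and the version value is only an example.

```go
package main

import "fmt"

// sshTargets lists the host names the ssh-config is expected to define:
// "runtime" plus one k8s<N>-<version> entry per Kubernetes node.
func sshTargets(k8sVersion string, nodes int) []string {
	targets := []string{"runtime"}
	for i := 1; i <= nodes; i++ {
		targets = append(targets, fmt.Sprintf("k8s%d-%s", i, k8sVersion))
	}
	return targets
}

func main() {
	// Two nodes, matching k8s{1..2}; the version is just an example value.
	for _, t := range sshTargets("1.23", 2) {
		fmt.Println(t)
	}
}
```

For version 1.23 this prints runtime, k8s1-1.23 and k8s2-1.23.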

An example ssh-config can be the following:

    Host runtime
      HostName 127.0.0.1
      User vagrant
      Port 2222
      UserKnownHostsFile /dev/null
      StrictHostKeyChecking no
      PasswordAuthentication no
      IdentityFile /home/eloy/.go/src/github.com/cilium/cilium/test/.vagrant/machines/runtime/virtualbox/private_key
      IdentitiesOnly yes
      LogLevel FATAL

To run this you can use the following command:

    ginkgo --tags=integration_tests -- --cilium.provision=false --cilium.SSHConfig="cat ssh-config"

VMs for Testing

The VMs used for testing are defined in test/Vagrantfile. There are a variety of configuration options that can be passed as environment variables:

ENV variable       Default Value      Options     Description
K8S_NODES          2                  0..100      Number of Kubernetes nodes in the cluster
NFS                0                  1           If the Cilium folder needs to be shared using NFS
IPv6               0                  0-1         If 1, the Kubernetes cluster will use IPv6
CONTAINER_RUNTIME  docker             containerd  Sets the default container runtime in the Kubernetes cluster
K8S_VERSION        1.18               1.**        Kubernetes version to install
KUBEPROXY          1                  0-1         If 0, the Kubernetes kube-proxy won’t be installed
SERVER_BOX         cilium/ubuntu-dev              Vagrantcloud base image
VM_CPUS            2                  0..100      Number of CPUs the VM needs to have
VM_MEMORY          4096               \d+         RAM size in megabytes

VM images

The test suite relies on Vagrant to automatically download the required VM image if it is not already available on the system. VM images weigh several gigabytes, so this may take some time, but faster tools such as aria2 can speed up the process by opening multiple connections. The script test/packet/scripts/add_vagrant_box.sh can be used to manually download selected images with aria2 prior to launching the test suite, or to periodically update images in a cron job:

    $ bash test/packet/scripts/add_vagrant_box.sh -h
    usage: add_vagrant_box.sh [options] [vagrant_box_defaults.rb path]
            path to vagrant_box_defaults.rb defaults to ./vagrant_box_defaults.rb

    options:
            -a              use aria2c instead of curl
            -b <box>        download selected box (defaults: ubuntu ubuntu-next)
            -l              download latest versions instead of using vagrant_box_defaults
            -t              download to /tmp/ instead of current directory
            -h              display this help

    examples:
            download boxes ubuntu and ubuntu-next from vagrant_box_defaults.rb:
            $ add-vagrant-boxes.sh $HOME/go/src/github.com/cilium/cilium/vagrant_box_defaults.rb
            download latest version for ubuntu-dev and ubuntu-next:
            $ add-vagrant-boxes.sh -l -b ubuntu-dev -b ubuntu-next
            same as above, downloading into /tmp/ and using aria2c:
            $ add-vagrant-boxes.sh -alt -b ubuntu-dev -b ubuntu-next

Known Issues and Workarounds

If you see the following error, take a look at this GitHub issue for workarounds.

    A host only network interface you're attempting to configure via DHCP
    already has a conflicting host only adapter with DHCP enabled. The
    DHCP on this adapter is incompatible with the DHCP settings. Two
    host only network interfaces are not allowed to overlap, and each
    host only network interface can have only one DHCP server. Please
    reconfigure your host only network or remove the virtual machine
    using the other host only network.

Also, consider upgrading VirtualBox and Vagrant to the latest versions.

Further Assistance

Have a question about how the tests work or want to chat more about improving the testing infrastructure for Cilium? Hop on over to the testing channel on Slack.