Troubleshooting

This section provides resolution steps for common problems reported with the linkerd check command.

The “pre-kubernetes-cluster-setup” checks

These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify your cluster is prepared for installation.

√ control plane namespace does not already exist

Example failure:

  1. × control plane namespace does not already exist
  2. The "linkerd" namespace already exists

By default linkerd install will create a linkerd namespace. Prior to installation, that namespace should not exist. To check with a different namespace, run:

  1. linkerd check --pre --linkerd-namespace linkerd-test

√ can create Kubernetes resources

The subsequent checks in this section validate whether you have permission to create the Kubernetes resources required for Linkerd installation, specifically:

  1. can create Namespaces
  2. can create ClusterRoles
  3. can create ClusterRoleBindings
  4. can create CustomResourceDefinitions
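
If any of these checks fail, the same permissions can be verified directly with kubectl auth can-i; a minimal sketch mirroring the list above:

  1. kubectl auth can-i create namespaces
  2. kubectl auth can-i create clusterroles
  3. kubectl auth can-i create clusterrolebindings
  4. kubectl auth can-i create customresourcedefinitions

Each command prints yes or no; a no identifies the RBAC permission that must be granted before running linkerd install.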

The “pre-kubernetes-setup” checks

These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify you have the correct RBAC permissions to install Linkerd.

  1. can create Namespaces
  2. can create ClusterRoles
  3. can create ClusterRoleBindings
  4. can create CustomResourceDefinitions
  5. can create PodSecurityPolicies
  6. can create ServiceAccounts
  7. can create Services
  8. can create Deployments
  9. can create ConfigMaps

√ no clock skew detected

This check detects any differences between the system running the linkerd install command and the Kubernetes nodes (known as clock skew). Having a substantial clock skew can cause TLS validation problems because a node may determine that a TLS certificate is expired when it should not be, or vice versa.

Linkerd version edge-20.3.4 and later check for a difference of at most 5 minutes and older versions of Linkerd (including stable-2.7) check for a difference of at most 1 minute. If your Kubernetes node heartbeat interval is longer than this difference, you may experience false positives of this check. The default node heartbeat interval was increased to 5 minutes in Kubernetes 1.17 meaning that users running Linkerd versions prior to edge-20.3.4 on Kubernetes 1.17 or later are likely to experience these false positives. If this is the case, you can upgrade to Linkerd edge-20.3.4 or later. If you choose to ignore this error, we strongly recommend that you verify that your system clocks are consistent.
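
To eyeball the skew yourself, you can compare your local clock against a node's most recent heartbeat; a rough sketch, where the node name is a placeholder and the jsonpath assumes the standard Ready condition:

  1. date -u
  2. kubectl get node <node-name> -o jsonpath='{.status.conditions[?(@.type=="Ready")].lastHeartbeatTime}'

If the two timestamps differ by more than the node heartbeat interval plus the allowed skew, check the NTP configuration on the affected machines.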

The “pre-kubernetes-capability” checks

These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify you have the correct Kubernetes capability permissions to install Linkerd.

The “pre-linkerd-global-resources” checks

These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify you have not already installed the Linkerd control plane.

  1. no ClusterRoles exist
  2. no ClusterRoleBindings exist
  3. no CustomResourceDefinitions exist
  4. no MutatingWebhookConfigurations exist
  5. no ValidatingWebhookConfigurations exist
  6. no PodSecurityPolicies exist
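
A quick way to spot leftovers from a previous installation is to grep the cluster-scoped resources for linkerd; a rough sketch (exact resource names vary between Linkerd versions):

  1. kubectl get clusterroles,clusterrolebindings | grep linkerd
  2. kubectl get customresourcedefinitions | grep linkerd.io
  3. kubectl get mutatingwebhookconfigurations,validatingwebhookconfigurations | grep linkerd

Anything returned here should be removed, for example with linkerd uninstall from the previous installation, before re-running the pre-checks.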

The “pre-kubernetes-single-namespace-setup” checks

If you do not expect to have the permission for a full cluster install, try the --single-namespace flag, which validates if Linkerd can be installed in a single namespace, with limited cluster access:

  1. linkerd check --pre --single-namespace

The “kubernetes-api” checks

Example failures:

  1. × can initialize the client
  2. error configuring Kubernetes API client: stat badconfig: no such file or directory
  3. × can query the Kubernetes API
  4. Get https://8.8.8.8/version: dial tcp 8.8.8.8:443: i/o timeout

Ensure that your system is configured to connect to a Kubernetes cluster. Validate that the KUBECONFIG environment variable is set properly, and/or ~/.kube/config points to a valid cluster.

For more information, see the Kubernetes documentation on configuring access to clusters.

Also verify that these commands work:

  1. kubectl config view
  2. kubectl cluster-info
  3. kubectl version

Another example failure:

  1. can query the Kubernetes API
  2. Get REDACTED/version: x509: certificate signed by unknown authority

As an (unsafe) workaround to this, you may try:

  1. kubectl config set-cluster ${KUBE_CONTEXT} --insecure-skip-tls-verify=true \
  2. --server=${KUBE_CONTEXT}

The “kubernetes-version” checks

√ is running the minimum Kubernetes API version

Example failure:

  1. × is running the minimum Kubernetes API version
  2. Kubernetes is on version [1.7.16], but version [1.13.0] or more recent is required

Linkerd requires at least version 1.13.0. Verify your cluster version with:

  1. kubectl version

√ is running the minimum kubectl version

Example failure:

  1. × is running the minimum kubectl version
  2. kubectl is on version [1.9.1], but version [1.13.0] or more recent is required
  3. see https://linkerd.io/2/checks/#kubectl-version for hints

Linkerd requires at least version 1.13.0. Verify your kubectl version with:

  1. kubectl version --client --short

To fix this, update your kubectl version.

For more information on upgrading Kubernetes, see the page in the Kubernetes Documentation.

The “linkerd-config” checks

This category of checks validates that Linkerd’s cluster-wide RBAC and related resources have been installed.

√ control plane Namespace exists

Example failure:

  1. × control plane Namespace exists
  2. The "foo" namespace does not exist
  3. see https://linkerd.io/2/checks/#l5d-existence-ns for hints

Ensure the Linkerd control plane namespace exists:

  1. kubectl get ns

The default control plane namespace is linkerd. If you installed Linkerd into a different namespace, specify that in your check command:

  1. linkerd check --linkerd-namespace linkerdtest

√ control plane ClusterRoles exist

Example failure:

  1. × control plane ClusterRoles exist
  2. missing ClusterRoles: linkerd-linkerd-identity
  3. see https://linkerd.io/2/checks/#l5d-existence-cr for hints

Ensure the Linkerd ClusterRoles exist:

  1. $ kubectl get clusterroles | grep linkerd
  2. linkerd-linkerd-destination 9d
  3. linkerd-linkerd-identity 9d
  4. linkerd-linkerd-proxy-injector 9d
  5. linkerd-policy 9d

Also ensure you have permission to create ClusterRoles:

  1. $ kubectl auth can-i create clusterroles
  2. yes

√ control plane ClusterRoleBindings exist

Example failure:

  1. × control plane ClusterRoleBindings exist
  2. missing ClusterRoleBindings: linkerd-linkerd-identity
  3. see https://linkerd.io/2/checks/#l5d-existence-crb for hints

Ensure the Linkerd ClusterRoleBindings exist:

  1. $ kubectl get clusterrolebindings | grep linkerd
  2. linkerd-linkerd-destination 9d
  3. linkerd-linkerd-identity 9d
  4. linkerd-linkerd-proxy-injector 9d
  5. linkerd-destination-policy 9d

Also ensure you have permission to create ClusterRoleBindings:

  1. $ kubectl auth can-i create clusterrolebindings
  2. yes

√ control plane ServiceAccounts exist

Example failure:

  1. × control plane ServiceAccounts exist
  2. missing ServiceAccounts: linkerd-identity
  3. see https://linkerd.io/2/checks/#l5d-existence-sa for hints

Ensure the Linkerd ServiceAccounts exist:

  1. $ kubectl -n linkerd get serviceaccounts
  2. NAME SECRETS AGE
  3. default 1 14m
  4. linkerd-destination 1 14m
  5. linkerd-heartbeat 1 14m
  6. linkerd-identity 1 14m
  7. linkerd-proxy-injector 1 14m

Also ensure you have permission to create ServiceAccounts in the Linkerd namespace:

  1. $ kubectl -n linkerd auth can-i create serviceaccounts
  2. yes

√ control plane CustomResourceDefinitions exist

Example failure:

  1. × control plane CustomResourceDefinitions exist
  2. missing CustomResourceDefinitions: serviceprofiles.linkerd.io
  3. see https://linkerd.io/2/checks/#l5d-existence-crd for hints

Ensure the Linkerd CRD exists:

  1. $ kubectl get customresourcedefinitions
  2. NAME CREATED AT
  3. serviceprofiles.linkerd.io 2019-04-25T21:47:31Z

Also ensure you have permission to create CRDs:

  1. $ kubectl auth can-i create customresourcedefinitions
  2. yes

√ control plane MutatingWebhookConfigurations exist

Example failure:

  1. × control plane MutatingWebhookConfigurations exist
  2. missing MutatingWebhookConfigurations: linkerd-proxy-injector-webhook-config
  3. see https://linkerd.io/2/checks/#l5d-existence-mwc for hints

Ensure the Linkerd MutatingWebhookConfigurations exists:

  1. $ kubectl get mutatingwebhookconfigurations | grep linkerd
  2. linkerd-proxy-injector-webhook-config 2019-07-01T13:13:26Z

Also ensure you have permission to create MutatingWebhookConfigurations:

  1. $ kubectl auth can-i create mutatingwebhookconfigurations
  2. yes

√ control plane ValidatingWebhookConfigurations exist

Example failure:

  1. × control plane ValidatingWebhookConfigurations exist
  2. missing ValidatingWebhookConfigurations: linkerd-sp-validator-webhook-config
  3. see https://linkerd.io/2/checks/#l5d-existence-vwc for hints

Ensure the Linkerd ValidatingWebhookConfiguration exists:

  1. $ kubectl get validatingwebhookconfigurations | grep linkerd
  2. linkerd-sp-validator-webhook-config 2019-07-01T13:13:26Z

Also ensure you have permission to create ValidatingWebhookConfigurations:

  1. $ kubectl auth can-i create validatingwebhookconfigurations
  2. yes

√ proxy-init container runs as root if docker container runtime is used

Example failure:

  1. × proxy-init container runs as root user if docker container runtime is used
  2. there are nodes using the docker container runtime and proxy-init container must run as root user.
  3. try installing linkerd via --set proxyInit.runAsRoot=true
  4. see https://linkerd.io/2/checks/#l5d-proxy-init-run-as-root for hints

Kubernetes nodes running with docker as the container runtime (CRI) require the init container to run as root for iptables.

Newer distributions of managed Kubernetes use containerd, where this is not an issue.

Without root in the init container you might get errors such as:

  1. time="2021-11-15T04:41:31Z" level=info msg="iptables-save -t nat"
  2. Error: exit status 1
  3. time="2021-11-15T04:41:31Z" level=info msg="iptables-save v1.8.7 (legacy): Cannot initialize: Permission denied (you must be root)\n\n"

See linkerd/linkerd2#7283 and linkerd/linkerd2#7308 for further details.
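
To confirm which container runtime your nodes use, and to apply the suggested override, something like the following should work (the CONTAINER-RUNTIME column is part of kubectl's standard wide output):

  1. kubectl get nodes -o wide
  2. linkerd install --set proxyInit.runAsRoot=true | kubectl apply -f -

Nodes reporting docker:// need the runAsRoot override; nodes reporting containerd:// or cri-o:// do not.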

The “linkerd-existence” checks

√ ’linkerd-config’ config map exists

Example failure:

  1. × 'linkerd-config' config map exists
  2. missing ConfigMaps: linkerd-config
  3. see https://linkerd.io/2/checks/#l5d-existence-linkerd-config for hints

Ensure the Linkerd ConfigMap exists:

  1. $ kubectl -n linkerd get configmap/linkerd-config
  2. NAME DATA AGE
  3. linkerd-config 3 61m

Also ensure you have permission to create ConfigMaps:

  1. $ kubectl -n linkerd auth can-i create configmap
  2. yes

√ control plane replica sets are ready

This failure occurs when one of Linkerd’s ReplicaSets fails to schedule a pod.

For more information, see the Kubernetes documentation on Failed Deployments.

√ no unschedulable pods

Example failure:

  1. × no unschedulable pods
  2. linkerd-prometheus-6b668f774d-j8ncr: 0/1 nodes are available: 1 Insufficient cpu.
  3. see https://linkerd.io/2/checks/#l5d-existence-unschedulable-pods for hints

For more information, see the Kubernetes documentation on the Unschedulable Pod Condition.
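
To find out why a pod cannot be scheduled, describe it and read the Events section; a minimal sketch using the pod name from the check output:

  1. kubectl -n linkerd describe pod linkerd-prometheus-6b668f774d-j8ncr
  2. kubectl describe nodes | grep -A5 "Allocated resources"

The events usually spell out the missing resource (CPU, memory, or an unsatisfied taint or affinity); freeing node capacity or adjusting resource requests resolves the failure.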

The “linkerd-identity” checks

√ certificate config is valid

Example failures:

  1. × certificate config is valid
  2. key ca.crt containing the trust anchors needs to exist in secret linkerd-identity-issuer if --identity-external-issuer=true
  3. see https://linkerd.io/2/checks/#l5d-identity-cert-config-valid
  1. × certificate config is valid
  2. key crt.pem containing the issuer certificate needs to exist in secret linkerd-identity-issuer if --identity-external-issuer=false
  3. see https://linkerd.io/2/checks/#l5d-identity-cert-config-valid

Ensure that your linkerd-identity-issuer secret contains the correct keys for the scheme that Linkerd is configured with. If the scheme is kubernetes.io/tls your secret should contain the tls.crt, tls.key and ca.crt keys. Alternatively if your scheme is linkerd.io/tls, the required keys are crt.pem and key.pem.
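
To see which keys the issuer secret actually contains, describing it is usually enough; a quick sketch:

  1. kubectl -n linkerd describe secret linkerd-identity-issuer

The Data section lists the key names; compare them against the scheme described above (tls.crt, tls.key and ca.crt for kubernetes.io/tls, or crt.pem and key.pem for linkerd.io/tls).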

√ trust roots are using supported crypto algorithm

Example failure:

  1. × trust roots are using supported crypto algorithm
  2. Invalid roots:
  3. * 165223702412626077778653586125774349756 identity.linkerd.cluster.local must use P-256 curve for public key, instead P-521 was used
  4. see https://linkerd.io/2/checks/#l5d-identity-trustAnchors-use-supported-crypto

You need to ensure that all of your roots use ECDSA P-256 for their public key algorithm.
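
If you have the root certificate on disk, openssl can confirm the curve; a sketch assuming the trust anchor is saved as ca.crt:

  1. openssl x509 -in ca.crt -noout -text | grep -A3 "Public Key Algorithm"

A compliant root shows id-ecPublicKey with a 256 bit key (ASN1 OID prime256v1, i.e. P-256); anything else needs to be re-generated.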

√ trust roots are within their validity period

Example failure:

  1. × trust roots are within their validity period
  2. Invalid roots:
  3. * 199607941798581518463476688845828639279 identity.linkerd.cluster.local not valid anymore. Expired on 2019-12-19T13:08:18Z
  4. see https://linkerd.io/2/checks/#l5d-identity-trustAnchors-are-time-valid for hints

Failures of such nature indicate that your roots have expired. If that is the case you will have to update both the root and issuer certificates at once. You can follow the process outlined in Replacing Expired Certificates to get your cluster back to a stable state.

√ trust roots are valid for at least 60 days

Example warnings:

  1. trust roots are valid for at least 60 days
  2. Roots expiring soon:
  3. * 66509928892441932260491975092256847205 identity.linkerd.cluster.local will expire on 2019-12-19T13:30:57Z
  4. see https://linkerd.io/2/checks/#l5d-identity-trustAnchors-not-expiring-soon for hints

This warning indicates that the expiry of some of your roots is approaching. In order to address this problem without incurring downtime, you can follow the process outlined in Rotating your identity certificates.

√ issuer cert is using supported crypto algorithm

Example failure:

  1. × issuer cert is using supported crypto algorithm
  2. issuer certificate must use P-256 curve for public key, instead P-521 was used
  3. see https://linkerd.io/2/checks/#5d-identity-issuer-cert-uses-supported-crypto for hints

You need to ensure that your issuer certificate uses ECDSA P-256 for its public key algorithm. You can refer to Generating your own mTLS root certificates to see how you can generate certificates that will work with Linkerd.

√ issuer cert is within its validity period

Example failure:

  1. × issuer cert is within its validity period
  2. issuer certificate is not valid anymore. Expired on 2019-12-19T13:35:49Z
  3. see https://linkerd.io/2/checks/#l5d-identity-issuer-cert-is-time-valid

This failure indicates that your issuer certificate has expired. In order to bring your cluster back to a valid state, follow the process outlined in Replacing Expired Certificates.
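
You can read the expiry directly from the secret; a sketch assuming the linkerd.io/tls scheme (for kubernetes.io/tls, read the tls.crt key instead):

  1. kubectl -n linkerd get secret linkerd-identity-issuer -o jsonpath='{.data.crt\.pem}' | base64 -d | openssl x509 -noout -enddate

The notAfter date printed here should match the expiry reported by linkerd check.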

√ issuer cert is valid for at least 60 days

Example warning:

  1. issuer cert is valid for at least 60 days
  2. issuer certificate will expire on 2019-12-19T13:35:49Z
  3. see https://linkerd.io/2/checks/#l5d-identity-issuer-cert-not-expiring-soon for hints

This warning means that your issuer certificate is expiring soon. If you do not rely on external certificate management solution such as cert-manager, you can follow the process outlined in Rotating your identity certificates

√ issuer cert is issued by the trust root

Example error:

  1. × issuer cert is issued by the trust root
  2. x509: certificate signed by unknown authority (possibly because of "x509: ECDSA verification failure" while trying to verify candidate authority certificate "identity.linkerd.cluster.local")
  3. see https://linkerd.io/2/checks/#l5d-identity-issuer-cert-issued-by-trust-anchor for hints

This error indicates that the issuer certificate in the linkerd-identity-issuer secret cannot be verified with any of the roots that Linkerd has been configured with. With the CLI install process, this should never happen. If Helm was used for installation, or the issuer certificates are managed by a malfunctioning certificate management solution, the cluster can end up in such an invalid state. At that point the best thing to do is to use the upgrade command to update your certificates:

  1. linkerd upgrade \
  2. --identity-issuer-certificate-file=./your-new-issuer.crt \
  3. --identity-issuer-key-file=./your-new-issuer.key \
  4. --identity-trust-anchors-file=./your-new-roots.crt \
  5. --force | kubectl apply -f -

Once the upgrade process is over, the output of linkerd check --proxy should be:

  1. linkerd-identity
  2. ----------------
  3. certificate config is valid
  4. trust roots are using supported crypto algorithm
  5. trust roots are within their validity period
  6. trust roots are valid for at least 60 days
  7. issuer cert is using supported crypto algorithm
  8. issuer cert is within its validity period
  9. issuer cert is valid for at least 60 days
  10. issuer cert is issued by the trust root
  11. linkerd-identity-data-plane
  12. ---------------------------
  13. data plane proxies certificate match CA

The “linkerd-webhooks-and-apisvc-tls” checks

√ proxy-injector webhook has valid cert

Example failure:

  1. × proxy-injector webhook has valid cert
  2. secrets "linkerd-proxy-injector-tls" not found
  3. see https://linkerd.io/2/checks/#l5d-proxy-injector-webhook-cert-valid for hints

Ensure that the linkerd-proxy-injector-k8s-tls secret exists and contains the appropriate tls.crt and tls.key data entries. For versions before 2.9, the secret is named linkerd-proxy-injector-tls and it should contain the crt.pem and key.pem data entries.

  1. × proxy-injector webhook has valid cert
  2. cert is not issued by the trust anchor: x509: certificate is valid for xxxxxx, not linkerd-proxy-injector.linkerd.svc
  3. see https://linkerd.io/2/checks/#l5d-proxy-injector-webhook-cert-valid for hints

Here you need to make sure the certificate was issued specifically for linkerd-proxy-injector.linkerd.svc.
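
To inspect the names the webhook certificate was actually issued for, you can decode it from the secret; a sketch assuming the post-2.9 secret and key names:

  1. kubectl -n linkerd get secret linkerd-proxy-injector-k8s-tls -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -text | grep -A1 "Subject Alternative Name"

The output must include linkerd-proxy-injector.linkerd.svc; if it does not, re-issue the certificate for that DNS name.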

√ proxy-injector cert is valid for at least 60 days

Example failure:

  1. proxy-injector cert is valid for at least 60 days
  2. certificate will expire on 2020-11-07T17:00:07Z
  3. see https://linkerd.io/2/checks/#l5d-proxy-injector-webhook-cert-not-expiring-soon for hints

This warning indicates that the expiry of the proxy-injector webhook certificate is approaching. In order to address this problem without incurring downtime, you can follow the process outlined in Automatically Rotating your webhook TLS Credentials.

√ sp-validator webhook has valid cert

Example failure:

  1. × sp-validator webhook has valid cert
  2. secrets "linkerd-sp-validator-tls" not found
  3. see https://linkerd.io/2/checks/#l5d-sp-validator-webhook-cert-valid for hints

Ensure that the linkerd-sp-validator-k8s-tls secret exists and contains the appropriate tls.crt and tls.key data entries. For versions before 2.9, the secret is named linkerd-sp-validator-tls and it should contain the crt.pem and key.pem data entries.

  1. × sp-validator webhook has valid cert
  2. cert is not issued by the trust anchor: x509: certificate is valid for xxxxxx, not linkerd-sp-validator.linkerd.svc
  3. see https://linkerd.io/2/checks/#l5d-sp-validator-webhook-cert-valid for hints

Here you need to make sure the certificate was issued specifically for linkerd-sp-validator.linkerd.svc.

√ sp-validator cert is valid for at least 60 days

Example failure:

  1. sp-validator cert is valid for at least 60 days
  2. certificate will expire on 2020-11-07T17:00:07Z
  3. see https://linkerd.io/2/checks/#l5d-sp-validator-webhook-cert-not-expiring-soon for hints

This warning indicates that the expiry of the sp-validator webhook certificate is approaching. In order to address this problem without incurring downtime, you can follow the process outlined in Automatically Rotating your webhook TLS Credentials.

√ policy-validator webhook has valid cert

Example failure:

  1. × policy-validator webhook has valid cert
  2. secrets "linkerd-policy-validator-tls" not found
  3. see https://linkerd.io/2/checks/#l5d-policy-validator-webhook-cert-valid for hints

Ensure that the linkerd-policy-validator-k8s-tls secret exists and contains the appropriate tls.crt and tls.key data entries.

  1. × policy-validator webhook has valid cert
  2. cert is not issued by the trust anchor: x509: certificate is valid for xxxxxx, not linkerd-policy-validator.linkerd.svc
  3. see https://linkerd.io/2/checks/#l5d-policy-validator-webhook-cert-valid for hints

Here you need to make sure the certificate was issued specifically for linkerd-policy-validator.linkerd.svc.

√ policy-validator cert is valid for at least 60 days

Example failure:

  1. policy-validator cert is valid for at least 60 days
  2. certificate will expire on 2020-11-07T17:00:07Z
  3. see https://linkerd.io/2/checks/#l5d-policy-validator-webhook-cert-not-expiring-soon for hints

This warning indicates that the expiry of the policy-validator webhook certificate is approaching. In order to address this problem without incurring downtime, you can follow the process outlined in Automatically Rotating your webhook TLS Credentials.

The “linkerd-identity-data-plane” checks

√ data plane proxies certificate match CA

Example warning:

  1. data plane proxies certificate match CA
  2. Some pods do not have the current trust bundle and must be restarted:
  3. * emojivoto/emoji-d8d7d9c6b-8qwfx
  4. * emojivoto/vote-bot-588499c9f6-zpwz6
  5. * emojivoto/voting-8599548fdc-6v64k
  6. see https://linkerd.io/2/checks/#l5d-identity-data-plane-proxies-certs-match-ca for hints

Observing this warning indicates that some of your meshed pods have proxies with stale certificates. This is most likely to happen during upgrade operations that deal with cert rotation. In order to solve the problem you can use rollout restart to restart the pods in question (see the example after this check); that should cause them to pick up the correct certs from the linkerd-config configmap. When an upgrade is performed using the --identity-trust-anchors-file flag to modify the roots, the Linkerd components are restarted. While this operation is in progress the check --proxy command may output a warning, pertaining to the Linkerd components:

  1. data plane proxies certificate match CA
  2. Some pods do not have the current trust bundle and must be restarted:
  3. * linkerd/linkerd-sp-validator-75f9d96dc-rch4x
  4. * linkerd-viz/tap-68d8bbf64-mpzgb
  5. * linkerd-viz/web-849f74b7c6-qlhwc
  6. see https://linkerd.io/2/checks/#l5d-identity-data-plane-proxies-certs-match-ca for hints

If that is the case, simply wait for the upgrade operation to complete. The stale pods should terminate and be replaced by new ones, configured with the correct certificates.
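
As mentioned above, restarting the affected workloads lets their proxies pick up the current trust bundle; a sketch for the emojivoto pods listed in the first warning:

  1. kubectl -n emojivoto rollout restart deploy

kubectl rollout restart also works on statefulsets and daemonsets; restart whichever workload kinds own the listed pods.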

The “linkerd-api” checks

√ control plane pods are ready

Example failure:

  1. × control plane pods are ready
  2. No running pods for "linkerd-sp-validator"

Verify the state of the control plane pods with:

  1. $ kubectl -n linkerd get po
  2. NAME READY STATUS RESTARTS AGE
  3. linkerd-destination-5fd7b5d466-szgqm 2/2 Running 1 12m
  4. linkerd-identity-54df78c479-hbh5m 2/2 Running 0 12m
  5. linkerd-proxy-injector-67f8cf65f7-4tvt5 2/2 Running 1 12m

√ cluster networks can be verified

Example failure:

  1. cluster networks can be verified
  2. the following nodes do not expose a podCIDR:
  3. node-0
  4. see https://linkerd.io/2/checks/#l5d-cluster-networks-verified for hints

Linkerd has a clusterNetworks setting which allows it to differentiate between intra-cluster and egress traffic. Through each Node’s podCIDR field, Linkerd can verify that all possible Pod IPs are included in the clusterNetworks setting. When a Node is missing the podCIDR field, Linkerd cannot verify this, and it is possible for the Node to create a Pod with an IP outside of clusterNetworks; such a Pod may not be meshed properly.

Nodes are not required to expose a podCIDR field which is why this results in a warning. Getting a Node to expose this field depends on the specific distribution being used.
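
To see which nodes expose a podCIDR, you can print the field per node; a minimal sketch:

  1. kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'

Nodes that print only their name have no podCIDR set and are the ones flagged by this check.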

√ cluster networks contains all node podCIDRs

Example failure:

  1. × cluster networks contains all node podCIDRs
  2. node has podCIDR(s) [10.244.0.0/24] which are not contained in the Linkerd clusterNetworks.
  3. Try installing linkerd via --set clusterNetworks=10.244.0.0/24
  4. see https://linkerd.io/2/checks/#l5d-cluster-networks-cidr for hints

Linkerd has a clusterNetworks setting which allows it to differentiate between intra-cluster and egress traffic. This warning indicates that the cluster has a podCIDR which is not included in Linkerd’s clusterNetworks. Traffic to pods in this network may not be meshed properly. To remedy this, update the clusterNetworks setting to include all pod networks in the cluster.
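
The setting can be changed on an existing installation with linkerd upgrade; a sketch using the CIDR from the example above (when passing several networks to --set, the commas must be escaped with a backslash, following Helm's value syntax):

  1. linkerd upgrade --set clusterNetworks="10.244.0.0/24" | kubectl apply -f -

Helm users can instead set the same clusterNetworks value in their values file.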

√ cluster networks contains all pods

Example failures:

  1. × the Linkerd clusterNetworks [10.244.0.0/24] do not include pod default/foo (104.21.63.202)
  2. see https://linkerd.io/2/checks/#l5d-cluster-networks-pods for hints
  1. × the Linkerd clusterNetworks [10.244.0.0/24] do not include svc default/bar (10.96.217.194)
  2. see https://linkerd.io/2/checks/#l5d-cluster-networks-pods for hints

Linkerd has a clusterNetworks setting which allows it to differentiate between intra-cluster and egress traffic. This warning indicates that the cluster has a pod or ClusterIP service which is not included in Linkerd’s clusterNetworks. Traffic to pods or services in this network may not be meshed properly. To remedy this, update the clusterNetworks setting to include all pod and service networks in the cluster.

The “linkerd-version” checks

√ can determine the latest version

Example failure:

  1. × can determine the latest version
  2. Get https://versioncheck.linkerd.io/version.json?version=edge-19.1.2&uuid=test-uuid&source=cli: context deadline exceeded

Ensure you can connect to the Linkerd version check endpoint from the environment the linkerd cli is running:

  1. $ curl "https://versioncheck.linkerd.io/version.json?version=edge-19.1.2&uuid=test-uuid&source=cli"
  2. {"stable":"stable-2.1.0","edge":"edge-19.1.2"}

√ cli is up-to-date

Example failure:

  1. cli is up-to-date
  2. is running version 19.1.1 but the latest edge version is 19.1.2

See the page on Upgrading Linkerd.

The “control-plane-version” checks

Example failures:

  1. control plane is up-to-date
  2. is running version 19.1.1 but the latest edge version is 19.1.2
  3. control plane and cli versions match
  4. mismatched channels: running stable-2.1.0 but retrieved edge-19.1.2

See the page on Upgrading Linkerd.

The “linkerd-control-plane-proxy” checks

√ control plane proxies are healthy

This error indicates that the proxies running in the Linkerd control plane are not healthy. Ensure that Linkerd has been installed with all of the correct settings, or re-install Linkerd as necessary.

√ control plane proxies are up-to-date

This warning indicates the proxies running in the Linkerd control plane are running an old version. We recommend downloading the latest Linkerd release and Upgrading Linkerd.

√ control plane proxies and cli versions match

This warning indicates that the proxies running in the Linkerd control plane are running a different version from the Linkerd CLI. We recommend keeping these versions in sync by updating either the CLI or the control plane as necessary.

The “linkerd-data-plane” checks

These checks only run when the --proxy flag is set. This flag is intended for use after running linkerd inject, to verify the injected proxies are operating normally.

√ data plane namespace exists

Example failure:

  1. $ linkerd check --proxy --namespace foo
  2. ...
  3. × data plane namespace exists
  4. The "foo" namespace does not exist

Ensure the --namespace specified exists, or, omit the parameter to check all namespaces.

√ data plane proxies are ready

Example failure:

  1. × data plane proxies are ready
  2. No "linkerd-proxy" containers found

Ensure you have injected the Linkerd proxy into your application via the linkerd inject command.

For more information on linkerd inject, see Step 5: Install the demo app in our Getting Started guide.
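
A typical way to inject an existing workload, using the emojivoto namespace from other examples as a stand-in:

  1. kubectl -n emojivoto get deploy -o yaml | linkerd inject - | kubectl apply -f -

After the pods restart, each should show an additional linkerd-proxy container (READY 2/2).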

√ data plane is up-to-date

Example failure:

  1. data plane is up-to-date
  2. linkerd/linkerd-prometheus-74d66f86f6-6t6dh: is running version 19.1.2 but the latest edge version is 19.1.3

See the page on Upgrading Linkerd.

√ data plane and cli versions match

  1. data plane and cli versions match
  2. linkerd/linkerd-identity-5f6c45d6d9-9hd9j: is running version 19.1.2 but the latest edge version is 19.1.3

See the page on Upgrading Linkerd.

√ data plane pod labels are configured correctly

Example failure:

  1. data plane pod labels are configured correctly
  2. Some labels on data plane pods should be annotations:
  3. * emojivoto/voting-ff4c54b8d-tv9pp
  4. linkerd.io/inject

linkerd.io/inject, config.linkerd.io/* or config.alpha.linkerd.io/* should be annotations in order to take effect.

√ data plane service labels are configured correctly

Example failure:

  1. data plane service labels and annotations are configured correctly
  2. Some labels on data plane services should be annotations:
  3. * emojivoto/emoji-svc
  4. config.linkerd.io/control-port

config.linkerd.io/* or config.alpha.linkerd.io/* should be annotations in order to take effect.

√ data plane service annotations are configured correctly

Example failure:

  1. data plane service annotations are configured correctly
  2. Some annotations on data plane services should be labels:
  3. * emojivoto/emoji-svc
  4. mirror.linkerd.io/exported

mirror.linkerd.io/exported should be a label in order to take effect.

√ opaque ports are properly annotated

Example failure:

  1. × opaque ports are properly annotated
  2. * service emoji-svc targets the opaque port 8080 through 8080; add 8080 to its config.linkerd.io/opaque-ports annotation
  3. see https://linkerd.io/2/checks/#linkerd-opaque-ports-definition for hints

If a Pod marks a port as opaque by using the config.linkerd.io/opaque-ports annotation, then any Service which targets that port must also use the config.linkerd.io/opaque-ports annotation to mark that port as opaque. Having a port marked as opaque on the Pod but not the Service (or vice versa) can cause inconsistent behavior depending on if traffic is sent to the Pod directly (for example with a headless Service) or through a ClusterIP Service. This error can be remedied by adding the config.linkerd.io/opaque-ports annotation to both the Pod and Service. See Protocol Detection for more information.
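
For the example above, and assuming the Service lives in the emojivoto namespace as in earlier examples, adding the annotation to the Service is a one-liner; the Pod side normally belongs in the workload's pod template (i.e. on the Deployment spec) rather than on individual pods:

  1. kubectl -n emojivoto annotate svc emoji-svc config.linkerd.io/opaque-ports=8080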

The “linkerd-ha-checks” checks

These checks are run if Linkerd has been installed in HA mode.

√ pod injection disabled on kube-system

Example warning:

  1. pod injection disabled on kube-system
  2. kube-system namespace needs to have the label config.linkerd.io/admission-webhooks: disabled if HA mode is enabled
  3. see https://linkerd.io/2/checks/#l5d-injection-disabled for hints

Ensure the kube-system namespace has the config.linkerd.io/admission-webhooks:disabled label:

  1. $ kubectl get namespace kube-system -oyaml
  2. kind: Namespace
  3. apiVersion: v1
  4. metadata:
  5. name: kube-system
  6. annotations:
  7. linkerd.io/inject: disabled
  8. labels:
  9. config.linkerd.io/admission-webhooks: disabled
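
If the label is missing, it can be added directly:

  1. kubectl label namespace kube-system config.linkerd.io/admission-webhooks=disabled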

√ multiple replicas of control plane pods

Example warning:

  1. multiple replicas of control plane pods
  2. not enough replicas available for [linkerd-identity]
  3. see https://linkerd.io/2/checks/#l5d-control-plane-replicas for hints

This happens when one of the control plane pods doesn’t have at least two replicas running. This is likely caused by insufficient node resources.

The “extensions” checks

When any extensions are installed, the Linkerd binary tries to invoke check --output json on the extension binaries. It is important that the extension binaries implement this. For more information, see the Extension developer docs.

Example error:

  1. invalid extension check output from \"jaeger\" (JSON object expected)

Make sure that the extension binary implements check --output json and returns the health checks in the expected JSON format.

Example error:

  1. × Linkerd command jaeger exists

Make sure that the relevant binary exists in $PATH.

For more information about Linkerd extensions, see the Extension developer docs.

The “linkerd-cni-plugin” checks

These checks run if Linkerd has been installed with the --linkerd-cni-enabled flag. Alternatively they can be run as part of the pre-checks by providing the --linkerd-cni-enabled flag. Most of these checks verify that the required resources are in place. If any of them are missing, you can use linkerd install-cni | kubectl apply -f - to re-install them.

√ cni plugin ConfigMap exists

Example error:

  1. × cni plugin ConfigMap exists
  2. configmaps "linkerd-cni-config" not found
  3. see https://linkerd.io/2/checks/#cni-plugin-cm-exists for hints

Ensure that the linkerd-cni-config ConfigMap exists in the CNI namespace:

  1. $ kubectl get cm linkerd-cni-config -n linkerd-cni
  2. NAME DATA AGE
  3. linkerd-cni-config 1 54m

Also ensure you have permission to create ConfigMaps:

  1. $ kubectl auth can-i create ConfigMaps
  2. yes

√ cni plugin ClusterRole exist

Example error:

  1. × cni plugin ClusterRole exists
  2. missing ClusterRole: linkerd-cni
  3. see https://linkerd.io/2/checks/#cni-plugin-cr-exists for hints

Ensure that the cluster role exists:

  1. $ kubectl get clusterrole linkerd-cni
  2. NAME AGE
  3. linkerd-cni 54m

Also ensure you have permission to create ClusterRoles:

  1. $ kubectl auth can-i create ClusterRoles
  2. yes

√ cni plugin ClusterRoleBinding exist

Example error:

  1. × cni plugin ClusterRoleBinding exists
  2. missing ClusterRoleBinding: linkerd-cni
  3. see https://linkerd.io/2/checks/#cni-plugin-crb-exists for hints

Ensure that the cluster role binding exists:

  1. $ kubectl get clusterrolebinding linkerd-cni
  2. NAME AGE
  3. linkerd-cni 54m

Also ensure you have permission to create ClusterRoleBindings:

  1. $ kubectl auth can-i create ClusterRoleBindings
  2. yes

√ cni plugin ServiceAccount exists

Example error:

  1. × cni plugin ServiceAccount exists
  2. missing ServiceAccount: linkerd-cni
  3. see https://linkerd.io/2/checks/#cni-plugin-sa-exists for hints

Ensure that the CNI service account exists in the CNI namespace:

  1. $ kubectl get ServiceAccount linkerd-cni -n linkerd-cni
  2. NAME SECRETS AGE
  3. linkerd-cni 1 45m

Also ensure you have permission to create ServiceAccount:

  1. $ kubectl auth can-i create ServiceAccounts -n linkerd-cni
  2. yes

√ cni plugin DaemonSet exists

Example error:

  1. × cni plugin DaemonSet exists
  2. missing DaemonSet: linkerd-cni
  3. see https://linkerd.io/2/checks/#cni-plugin-ds-exists for hints

Ensure that the CNI daemonset exists in the CNI namespace:

  1. $ kubectl get ds -n linkerd-cni
  2. NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
  3. linkerd-cni 1 1 1 1 1 beta.kubernetes.io/os=linux 14m

Also ensure you have permission to create DaemonSets:

  1. $ kubectl auth can-i create DaemonSets -n linkerd-cni
  2. yes

√ cni plugin pod is running on all nodes

Example failure:

  1. cni plugin pod is running on all nodes
  2. number ready: 2, number scheduled: 3
  3. see https://linkerd.io/2/checks/#cni-plugin-ready

Ensure that all the CNI pods are running:

  1. $ kubectl get po -n linkerd-cni
  2. NAME READY STATUS RESTARTS AGE
  3. linkerd-cni-rzp2q 1/1 Running 0 9m20s
  4. linkerd-cni-mf564 1/1 Running 0 9m22s
  5. linkerd-cni-p5670 1/1 Running 0 9m25s

Ensure that all pods have finished the deployment of the CNI config and binary:

  1. $ kubectl logs linkerd-cni-rzp2q -n linkerd-cni
  2. Wrote linkerd CNI binaries to /host/opt/cni/bin
  3. Created CNI config /host/etc/cni/net.d/10-kindnet.conflist
  4. Done configuring CNI. Sleep=true

The “linkerd-multicluster” checks

These checks run if the service mirroring controller has been installed. Additionally, they can be run with linkerd multicluster check. Most of these checks verify that the service mirroring controllers are working correctly along with remote gateways. Furthermore, the checks ensure that end-to-end TLS is possible between paired clusters.

Example error:

  1. × Link CRD exists
  2. multicluster.linkerd.io/Link CRD is missing
  3. see https://linkerd.io/2/checks/#l5d-multicluster-link-crd-exists for hints

Make sure the multicluster extension is correctly installed and that the links.multicluster.linkerd.io CRD is present.

  1. $ kubectl get crds | grep multicluster
  2. NAME CREATED AT
  3. links.multicluster.linkerd.io 2021-03-10T09:58:10Z

Example error:

  1. × Link resources are valid
  2. failed to parse Link east
  3. see https://linkerd.io/2/checks/#l5d-multicluster-links-are-valid for hints

Make sure all the link objects are specified in the expected format.
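
You can inspect the Link resources directly to spot the malformed one; a sketch, where east is the link name from the error above and linkerd-multicluster is the extension's default namespace:

  1. kubectl get links.multicluster.linkerd.io -A
  2. kubectl -n linkerd-multicluster get links.multicluster.linkerd.io east -o yaml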

√ remote cluster access credentials are valid

Example error:

  1. × remote cluster access credentials are valid
  2. * secret [east/east-config]: could not find east-config secret
  3. see https://linkerd.io/2/checks/#l5d-smc-target-clusters-access for hints

Make sure that a kubeconfig with the relevant permissions for the specific target cluster is present as a secret, and that it is correctly formed.

√ clusters share trust anchors

Example errors:

  1. × clusters share trust anchors
  2. Problematic clusters:
  3. * remote
  4. see https://linkerd.io/2/checks/#l5d-multicluster-clusters-share-anchors for hints

The error above indicates that your trust anchors are not compatible. In order to fix that you need to ensure that both your anchors contain identical sets of certificates.

  1. × clusters share trust anchors
  2. Problematic clusters:
  3. * remote: cannot parse trust anchors
  4. see https://linkerd.io/2/checks/#l5d-multicluster-clusters-share-anchors for hints

Such an error indicates that there is a problem with your anchors on the cluster named remote. You need to make sure the identity configuration of your Linkerd installation on the remote cluster is correct. You can run check against the remote cluster to verify this:

  1. linkerd --context=remote check

√ service mirror controller has required permissions

Example error:

  1. × service mirror controller has required permissions
  2. missing Service mirror ClusterRole linkerd-service-mirror-access-local-resources: unexpected verbs expected create,delete,get,list,update,watch, got create,delete,get,update,watch
  3. see https://linkerd.io/2/checks/#l5d-multicluster-source-rbac-correct for hints

This error indicates that the local RBAC permissions of the service mirror service account are not correct. In order to ensure that you have the correct verbs and resources you can inspect your ClusterRole and Role object and look at the rules section.

Expected rules for linkerd-service-mirror-access-local-resources cluster role:

  1. $ kubectl --context=local get clusterrole linkerd-service-mirror-access-local-resources -o yaml
  2. kind: ClusterRole
  3. metadata:
  4. labels:
  5. linkerd.io/control-plane-component: linkerd-service-mirror
  6. name: linkerd-service-mirror-access-local-resources
  7. rules:
  8. - apiGroups:
  9. - ""
  10. resources:
  11. - endpoints
  12. - services
  13. verbs:
  14. - list
  15. - get
  16. - watch
  17. - create
  18. - delete
  19. - update
  20. - apiGroups:
  21. - ""
  22. resources:
  23. - namespaces
  24. verbs:
  25. - create
  26. - list
  27. - get
  28. - watch

Expected rules for linkerd-service-mirror-read-remote-creds role:

  1. $ kubectl --context=local get role linkerd-service-mirror-read-remote-creds -n linkerd-multicluster -o yaml
  2. kind: Role
  3. metadata:
  4. labels:
  5. linkerd.io/control-plane-component: linkerd-service-mirror
  6. name: linkerd-service-mirror-read-remote-creds
  7. namespace: linkerd-multicluster
  8. rules:
  9. - apiGroups:
  10. - ""
  11. resources:
  12. - secrets
  13. verbs:
  14. - list
  15. - get
  16. - watch

√ service mirror controllers are running

Example error:

  1. × service mirror controllers are running
  2. Service mirror controller is not present
  3. see https://linkerd.io/2/checks/#l5d-multicluster-service-mirror-running for hints

Note that it takes a little while for pods to be scheduled, images to be pulled, and everything to start up. If this is a permanent error, you’ll want to validate the state of the controller pod with:

  1. $ kubectl --all-namespaces get po --selector linkerd.io/control-plane-component=linkerd-service-mirror
  2. NAME READY STATUS RESTARTS AGE
  3. linkerd-service-mirror-7bb8ff5967-zg265 2/2 Running 0 50m

√ all gateway mirrors are healthy

Example errors:

  1. all gateway mirrors are healthy
  2. Some gateway mirrors do not have endpoints:
  3. linkerd-gateway-gke.linkerd-multicluster mirrored from cluster [gke]
  4. see https://linkerd.io/2/checks/#l5d-multicluster-gateways-endpoints for hints

The error above indicates that some gateway mirror services in the source cluster do not have associated endpoints resources. These endpoints are created by the Linkerd service mirror controller on the source cluster whenever a link is established with a target cluster.

Such an error indicates that there could be a problem with the creation of the resources by the service mirror controller, or with the external IP of the gateway service in the target cluster.
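
The gateway mirror can be cross-checked from the source cluster; a sketch, where the mirror name comes from the warning above and linkerd multicluster gateways summarizes gateway reachability:

  1. linkerd multicluster gateways
  2. kubectl -n linkerd-multicluster get endpoints linkerd-gateway-gke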

√ all mirror services have endpoints

Example errors:

  1. all mirror services have endpoints
  2. Some mirror services do not have endpoints:
  3. voting-svc-gke.emojivoto mirrored from cluster [gke] (gateway: [linkerd-multicluster/linkerd-gateway])
  4. see https://linkerd.io/2/checks/#l5d-multicluster-services-endpoints for hints

The error above indicates that some mirror services in the source cluster do not have associated endpoints resources. These endpoints are created by the Linkerd service mirror controller when creating a mirror service, with the remote gateway’s external IP as the endpoint value.

Such an error indicates that there could be a problem with the creation of the mirror resources by the service mirror controller, with the mirror gateway service in the source cluster, or with the external IP of the gateway service in the target cluster.

Example errors:

  1. all mirror services are part of a Link
  2. mirror service voting-east.emojivoto is not part of any Link
  3. see https://linkerd.io/2/checks/#l5d-multicluster-orphaned-services for hints

The error above indicates that some mirror services in the source cluster are not associated with any Link. These mirror services are created by the Linkerd service mirror controller when a remote service is marked to be mirrored.

Make sure services are correctly marked for mirroring on the remote cluster, and delete any mirror services that are no longer needed.

√ multicluster extension proxies are healthy

This error indicates that the proxies running in the multicluster extension are not healthy. Ensure that linkerd-multicluster has been installed with all of the correct settings, or re-install as necessary.

√ multicluster extension proxies are up-to-date

This warning indicates the proxies running in the multicluster extension are running an old version. We recommend downloading the latest linkerd-multicluster and upgrading.

√ multicluster extension proxies and cli versions match

This warning indicates that the proxies running in the multicluster extension are running a different version from the Linkerd CLI. We recommend keeping these versions in sync by updating either the CLI or linkerd-multicluster as necessary.

The “linkerd-viz” checks

These checks only run when the linkerd-viz extension is installed. They are intended to verify the installation of the linkerd-viz extension, which comprises tap, web, metrics-api, and optional grafana and prometheus instances, along with the tap-injector, which injects the specific tap configuration into the proxies.

√ linkerd-viz Namespace exists

This is the basic check used to verify whether the linkerd-viz extension namespace exists. The extension can be installed by running the following command:

  1. linkerd viz install | kubectl apply -f -

The installation can be configured by using the --set, --values, --set-string and --set-file flags. See Linkerd Viz Readme for a full list of configurable fields.

√ linkerd-viz ClusterRoles exist

Example failure:

  1. × linkerd-viz ClusterRoles exist
  2. missing ClusterRoles: linkerd-linkerd-viz-metrics-api
  3. see https://linkerd.io/2/checks/#l5d-viz-cr-exists for hints

Ensure the linkerd-viz extension ClusterRoles exist:

  1. $ kubectl get clusterroles | grep linkerd-viz
  2. linkerd-linkerd-viz-metrics-api 2021-01-26T18:02:17Z
  3. linkerd-linkerd-viz-prometheus 2021-01-26T18:02:17Z
  4. linkerd-linkerd-viz-tap 2021-01-26T18:02:17Z
  5. linkerd-linkerd-viz-tap-admin 2021-01-26T18:02:17Z
  6. linkerd-linkerd-viz-web-check 2021-01-26T18:02:18Z

Also ensure you have permission to create ClusterRoles:

  1. $ kubectl auth can-i create clusterroles
  2. yes

√ linkerd-viz ClusterRoleBindings exist

Example failure:

  1. × linkerd-viz ClusterRoleBindings exist
  2. missing ClusterRoleBindings: linkerd-linkerd-viz-metrics-api
  3. see https://linkerd.io/2/checks/#l5d-viz-crb-exists for hints

Ensure the linkerd-viz extension ClusterRoleBindings exist:

  1. $ kubectl get clusterrolebindings | grep linkerd-viz
  2. linkerd-linkerd-viz-metrics-api ClusterRole/linkerd-linkerd-viz-metrics-api 18h
  3. linkerd-linkerd-viz-prometheus ClusterRole/linkerd-linkerd-viz-prometheus 18h
  4. linkerd-linkerd-viz-tap ClusterRole/linkerd-linkerd-viz-tap 18h
  5. linkerd-linkerd-viz-tap-auth-delegator ClusterRole/system:auth-delegator 18h
  6. linkerd-linkerd-viz-web-admin ClusterRole/linkerd-linkerd-viz-tap-admin 18h
  7. linkerd-linkerd-viz-web-check ClusterRole/linkerd-linkerd-viz-web-check 18h

Also ensure you have permission to create ClusterRoleBindings:

  1. $ kubectl auth can-i create clusterrolebindings
  2. yes

√ viz extension proxies are healthy

This error indicates that the proxies running in the viz extension are not healthy. Ensure that linkerd-viz has been installed with all of the correct settings, or re-install as necessary.

√ viz extension proxies are up-to-date

This warning indicates the proxies running in the viz extension are running an old version. We recommend downloading the latest linkerd-viz and upgrading.

√ viz extension proxies and cli versions match

This warning indicates that the proxies running in the viz extension are running a different version from the Linkerd CLI. We recommend keeping these versions in sync by updating either the CLI or linkerd-viz as necessary.

√ tap API server has valid cert

Example failure:

  1. × tap API server has valid cert
  2. secrets "tap-k8s-tls" not found
  3. see https://linkerd.io/2/checks/#l5d-tap-cert-valid for hints

Ensure that the tap-k8s-tls secret exists and contains the appropriate tls.crt and tls.key data entries. For versions before 2.9, the secret is named linkerd-tap-tls and it should contain the crt.pem and key.pem data entries.

  1. × tap API server has valid cert
  2. cert is not issued by the trust anchor: x509: certificate is valid for xxxxxx, not tap.linkerd-viz.svc
  3. see https://linkerd.io/2/checks/#l5d-tap-cert-valid for hints

Here you need to make sure the certificate was issued specifically for tap.linkerd-viz.svc.

√ tap API server cert is valid for at least 60 days

Example failure:

  1. tap API server cert is valid for at least 60 days
  2. certificate will expire on 2020-11-07T17:00:07Z
  3. see https://linkerd.io/2/checks/#l5d-webhook-cert-not-expiring-soon for hints

This warning indicates that the expiry of the tap API Server webhook cert is approaching. In order to address this problem without incurring downtime, you can follow the process outlined in Automatically Rotating your webhook TLS Credentials.

√ tap api service is running

Example failure:

  1. × FailedDiscoveryCheck: no response from https://10.233.31.133:443: Get https://10.233.31.133:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

tap uses the Kubernetes aggregated API server model to allow users to apply Kubernetes RBAC on top. This model has specific requirements of the cluster.
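
To inspect the registration status of the tap API service directly, the APIService object is the place to look; a sketch assuming the v1alpha1.tap.linkerd.io name used by recent linkerd-viz versions:

  1. kubectl get apiservice v1alpha1.tap.linkerd.io
  2. kubectl describe apiservice v1alpha1.tap.linkerd.io

The Available condition and its message usually point at the underlying problem, for example the kube-apiserver being unable to reach the tap pod.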

√ linkerd-viz pods are injected

  1. × linkerd-viz extension pods are injected
  2. could not find proxy container for tap-59f5595fc7-ttndp pod
  3. see https://linkerd.io/2/checks/#l5d-viz-pods-injection for hints

Ensure all the linkerd-viz pods are injected

  1. $ kubectl -n linkerd-viz get pods
  2. NAME READY STATUS RESTARTS AGE
  3. grafana-68cddd7cc8-nrv4h 2/2 Running 3 18h
  4. metrics-api-77f684f7c7-hnw8r 2/2 Running 2 18h
  5. prometheus-5f6898ff8b-s6rjc 2/2 Running 2 18h
  6. tap-59f5595fc7-ttndp 2/2 Running 2 18h
  7. web-78d6588d4-pn299 2/2 Running 2 18h
  8. tap-injector-566f7ff8df-vpcwc 2/2 Running 2 18h

Make sure that the proxy-injector is working correctly by running linkerd check

√ viz extension pods are running

  1. × viz extension pods are running
  2. container linkerd-proxy in pod tap-59f5595fc7-ttndp is not ready
  3. see https://linkerd.io/2/checks/#l5d-viz-pods-running for hints

Ensure all the linkerd-viz pods are running with 2/2

  1. $ kubectl -n linkerd-viz get pods
  2. NAME READY STATUS RESTARTS AGE
  3. grafana-68cddd7cc8-nrv4h 2/2 Running 3 18h
  4. metrics-api-77f684f7c7-hnw8r 2/2 Running 2 18h
  5. prometheus-5f6898ff8b-s6rjc 2/2 Running 2 18h
  6. tap-59f5595fc7-ttndp 2/2 Running 2 18h
  7. web-78d6588d4-pn299 2/2 Running 2 18h
  8. tap-injector-566f7ff8df-vpcwc 2/2 Running 2 18h

Make sure that the proxy-injector is working correctly by running linkerd check

√ prometheus is installed and configured correctly

  1. × prometheus is installed and configured correctly
  2. missing ClusterRoles: linkerd-linkerd-viz-prometheus
  3. see https://linkerd.io/2/checks/#l5d-viz-cr-exists for hints

Ensure all the Prometheus-related resources are present and running correctly.

  1. kubectl -n linkerd-viz get deploy,cm | grep prometheus
  2. deployment.apps/prometheus 1/1 1 1 3m18s
  3. configmap/prometheus-config 1 3m18s
  4. kubectl get clusterRoleBindings | grep prometheus
  5. linkerd-linkerd-viz-prometheus ClusterRole/linkerd-linkerd-viz-prometheus 3m37s
  6. kubectl get clusterRoles | grep prometheus
  7. linkerd-linkerd-viz-prometheus 2021-02-26T06:03:11Z

√ can initialize the client

Example failure:

  1. × can initialize the client
  2. Failed to get deploy for pod metrics-api-77f684f7c7-hnw8r: not running

Verify that the metrics API pod is running correctly

  1. kubectl -n linkerd-viz get pods
  2. NAME READY STATUS RESTARTS AGE
  3. metrics-api-7bb8cb8489-cbq4m 2/2 Running 0 4m58s
  4. tap-injector-6b9bc6fc4-cgbr4 2/2 Running 0 4m56s
  5. tap-5f6ddcc684-k2fd6 2/2 Running 0 4m57s
  6. web-cbb846484-d987n 2/2 Running 0 4m56s
  7. grafana-76fd8765f4-9rg8q 2/2 Running 0 4m58s
  8. prometheus-7c5c48c466-jc27g 2/2 Running 0 4m58s

√ viz extension self-check

Example failure:

  1. × viz extension self-check
  2. No results returned

Check the logs on the viz extensions’s metrics API:

  1. kubectl -n linkerd-viz logs deploy/metrics-api metrics-api

√ prometheus is authorized to scrape data plane pods

Example failure:

  1. prometheus is authorized to scrape data plane pods
  2. prometheus may not be authorized to scrape the following pods:
  3. * emojivoto/voting-5f46cbcdc6-p5dhn
  4. * emojivoto/emoji-54f8786975-6qc8s
  5. * emojivoto/vote-bot-85dfbf8996-86c44
  6. * emojivoto/web-79db6f4548-4mzkg
  7. consider running `linkerd viz allow-scrapes` to authorize prometheus scrapes
  8. see https://linkerd.io/2/checks/#l5d-viz-data-plane-prom-authz for hints

This warning indicates that the listed pods have the deny default inbound policy, which may prevent the linkerd-viz Prometheus instance from scraping the data plane proxies in those pods. If Prometheus cannot scrape a data plane pod, linkerd viz commands targeting that pod will return no data.

This may be resolved by running the linkerd viz allow-scrapes command, which generates policy resources authorizing Prometheus to scrape the data plane proxies in a namespace:

  1. linkerd viz allow-scrapes --namespace emojivoto | kubectl apply -f -

Note that this warning only checks for the existence of the policy resources generated by linkerd viz allow-scrapes in namespaces that contain pods with the deny default inbound policy. In some cases, Prometheus scrapes may also be authorized by other, user-generated authorization policies. If metrics from the listed pods are present in Prometheus, this warning is a false positive and can be safely disregarded.

√ data plane proxy metrics are present in Prometheus

Example failure:

  1. × data plane proxy metrics are present in Prometheus
  2. Data plane metrics not found for linkerd/linkerd-identity-b8c4c48c8-pflc9.

Ensure Prometheus can connect to each linkerd-proxy via the Prometheus dashboard:

  1. kubectl -n linkerd-viz port-forward svc/prometheus 9090

…and then browse to http://localhost:9090/targets, validate the linkerd-proxy section.

You should see all your pods here. If they are not:

  • Prometheus might be experiencing connectivity issues with the k8s api server. Check out the logs and delete the pod to flush any possible transient errors.

The “linkerd-jaeger” checks

These checks only run when the linkerd-jaeger extension is installed. They are intended to verify the installation of the linkerd-jaeger extension, which comprises the OpenCensus collector and Jaeger components, along with the jaeger-injector, which injects the specific trace configuration into the proxies.

√ linkerd-jaeger extension Namespace exists

This is the basic check used to verify whether the linkerd-jaeger extension namespace exists. The extension can be installed by running the following command:

  1. linkerd jaeger install | kubectl apply -f -

The installation can be configured by using the --set, --values, --set-string and --set-file flags. See Linkerd Jaeger Readme for a full list of configurable fields.

√ jaeger extension proxies are healthy

This error indicates that the proxies running in the jaeger extension are not healthy. Ensure that linkerd-jaeger has been installed with all of the correct settings, or re-install as necessary.

√ jaeger extension proxies are up-to-date

This warning indicates the proxies running in the jaeger extension are running an old version. We recommend downloading the latest linkerd-jaeger and upgrading.

√ jaeger extension proxies and cli versions match

This warning indicates that the proxies running in the jaeger extension are running a different version from the Linkerd CLI. We recommend keeping these versions in sync by updating either the CLI or linkerd-jaeger as necessary.

√ jaeger extension pods are injected

  1. × jaeger extension pods are injected
  2. could not find proxy container for jaeger-6f98d5c979-scqlq pod
  3. see https://linkerd.io/2/checks/#l5d-jaeger-pods-injections for hints

Ensure all the jaeger pods are injected

  1. $ kubectl -n linkerd-jaeger get pods
  2. NAME READY STATUS RESTARTS AGE
  3. collector-69cc44dfbc-rhpfg 2/2 Running 0 11s
  4. jaeger-6f98d5c979-scqlq 2/2 Running 0 11s
  5. jaeger-injector-6c594f5577-cz75h 2/2 Running 0 10s

Make sure that the proxy-injector is working correctly by running linkerd check

√ jaeger extension pods are running

  1. × jaeger extension pods are running
  2. container linkerd-proxy in pod jaeger-59f5595fc7-ttndp is not ready
  3. see https://linkerd.io/2/checks/#l5d-jaeger-pods-running for hints

Ensure all the linkerd-jaeger pods are running with 2/2

  1. $ kubectl -n linkerd-jaeger get pods
  2. NAME READY STATUS RESTARTS AGE
  3. jaeger-injector-548684d74b-bcq5h 2/2 Running 0 5s
  4. collector-69cc44dfbc-wqf6s 2/2 Running 0 5s
  5. jaeger-6f98d5c979-vs622 2/2 Running 0 5s

Make sure that the proxy-injector is working correctly by running linkerd check

The “linkerd-buoyant” checks

These checks only run when the linkerd-buoyant extension is installed. They are intended to verify the installation of the linkerd-buoyant extension, which comprises the linkerd-buoyant CLI, the buoyant-cloud-agent Deployment, and the buoyant-cloud-metrics DaemonSet.

√ Linkerd extension command linkerd-buoyant exists

  1. Linkerd extension command linkerd-buoyant exists
  2. exec: "linkerd-buoyant": executable file not found in $PATH
  3. see https://linkerd.io/2/checks/#extensions for hints

Ensure you have the linkerd-buoyant CLI installed:

  1. linkerd-buoyant check

To install the CLI:

  1. curl https://buoyant.cloud/install | sh
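
After installation, confirm the binary is on your PATH and working (a sketch):

  1. which linkerd-buoyant
  2. linkerd-buoyant version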

√ linkerd-buoyant can determine the latest version

  1. linkerd-buoyant can determine the latest version
  2. Get "https://buoyant.cloud/version.json": dial tcp: lookup buoyant.cloud: no such host
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure you can connect to the Linkerd Buoyant version check endpoint from the environment where the linkerd CLI is running:

  1. $ curl https://buoyant.cloud/version.json
  2. {"linkerd-buoyant":"v0.4.4"}

√ linkerd-buoyant cli is up-to-date

  1. linkerd-buoyant cli is up-to-date
  2. CLI version is v0.4.3 but the latest is v0.4.4
  3. see https://linkerd.io/checks#l5d-buoyant for hints

To update to the latest version of the linkerd-buoyant CLI:

  1. curl https://buoyant.cloud/install | sh

√ buoyant-cloud Namespace exists

  1. × buoyant-cloud Namespace exists
  2. namespaces "buoyant-cloud" not found
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure the buoyant-cloud namespace exists:

  1. kubectl get ns/buoyant-cloud

If the namespace does not exist, the linkerd-buoyant installation may be missing or incomplete. To install the extension:

  1. linkerd-buoyant install | kubectl apply -f -

√ buoyant-cloud Namespace has correct labels

  1. × buoyant-cloud Namespace has correct labels
  2. missing app.kubernetes.io/part-of label
  3. see https://linkerd.io/checks#l5d-buoyant for hints

The linkerd-buoyant installation may be missing or incomplete. To install the extension:

  1. linkerd-buoyant install | kubectl apply -f -

√ buoyant-cloud-agent ClusterRole exists

  1. × buoyant-cloud-agent ClusterRole exists
  2. missing ClusterRole: buoyant-cloud-agent
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure that the cluster role exists:

  1. $ kubectl get clusterrole buoyant-cloud-agent
  2. NAME                  CREATED AT
  3. buoyant-cloud-agent   2020-11-13T00:59:50Z

Also ensure you have permission to create ClusterRoles:

  1. $ kubectl auth can-i create ClusterRoles
  2. yes

√ buoyant-cloud-agent ClusterRoleBinding exists

  1. × buoyant-cloud-agent ClusterRoleBinding exists
  2. missing ClusterRoleBinding: buoyant-cloud-agent
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure that the cluster role binding exists:

  1. $ kubectl get clusterrolebinding buoyant-cloud-agent
  2. NAME                  ROLE                              AGE
  3. buoyant-cloud-agent   ClusterRole/buoyant-cloud-agent   301d

Also ensure you have permission to create ClusterRoleBindings:

  1. $ kubectl auth can-i create ClusterRoleBindings
  2. yes

√ buoyant-cloud-agent ServiceAccount exists

  1. × buoyant-cloud-agent ServiceAccount exists
  2. missing ServiceAccount: buoyant-cloud-agent
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure that the service account exists:

  1. $ kubectl -n buoyant-cloud get serviceaccount buoyant-cloud-agent
  2. NAME                  SECRETS   AGE
  3. buoyant-cloud-agent   1         301d

Also ensure you have permission to create ServiceAccounts:

  1. $ kubectl -n buoyant-cloud auth can-i create ServiceAccount
  2. yes

√ buoyant-cloud-id Secret exists

  1. × buoyant-cloud-id Secret exists
  2. missing Secret: buoyant-cloud-id
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure that the secret exists:

  1. $ kubectl -n buoyant-cloud get secret buoyant-cloud-id
  2. NAME               TYPE     DATA   AGE
  3. buoyant-cloud-id   Opaque   4      301d

Also ensure you have permission to create Secrets:

  1. $ kubectl -n buoyant-cloud auth can-i create Secret
  2. yes

√ buoyant-cloud-agent Deployment exists

  1. × buoyant-cloud-agent Deployment exists
  2. deployments.apps "buoyant-cloud-agent" not found
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure the buoyant-cloud-agent Deployment exists:

  1. kubectl -n buoyant-cloud get deploy/buoyant-cloud-agent

If the Deployment does not exist, the linkerd-buoyant installation may be missing or incomplete. To reinstall the extension:

  1. linkerd-buoyant install | kubectl apply -f -

√ buoyant-cloud-agent Deployment is running

  1. × buoyant-cloud-agent Deployment is running
  2. no running pods for buoyant-cloud-agent Deployment
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Note that it takes a little while for pods to be scheduled, images to be pulled, and everything to start up. If this is a permanent error, validate the state of the buoyant-cloud-agent Deployment with:

  1. $ kubectl -n buoyant-cloud get po --selector app=buoyant-cloud-agent
  2. NAME                                   READY   STATUS    RESTARTS   AGE
  3. buoyant-cloud-agent-6b8c6888d7-htr7d   2/2     Running   0          156m

Check the agent’s logs with:

  1. kubectl logs -n buoyant-cloud buoyant-cloud-agent-6b8c6888d7-htr7d buoyant-cloud-agent
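
If no pod is listed at all, describing the Deployment and reviewing recent events can reveal scheduling or image-pull problems (a sketch):

  1. kubectl -n buoyant-cloud describe deploy/buoyant-cloud-agent
  2. kubectl -n buoyant-cloud get events --sort-by=.lastTimestamp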

√ buoyant-cloud-agent Deployment is injected

  1. × buoyant-cloud-agent Deployment is injected
  2. could not find proxy container for buoyant-cloud-agent-6b8c6888d7-htr7d pod
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure the buoyant-cloud-agent pod is injected; the READY column should show 2/2:

  1. $ kubectl -n buoyant-cloud get pods --selector app=buoyant-cloud-agent
  2. NAME                                   READY   STATUS    RESTARTS   AGE
  3. buoyant-cloud-agent-6b8c6888d7-htr7d   2/2     Running   0          161m

Make sure that the proxy-injector is working correctly by running linkerd check.
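
As a sketch (assuming the buoyant-cloud namespace or workload is annotated with linkerd.io/inject: enabled), you can verify the annotation and restart the Deployment so it is re-injected:

  1. kubectl get ns buoyant-cloud -o jsonpath='{.metadata.annotations}'
  2. kubectl -n buoyant-cloud rollout restart deploy/buoyant-cloud-agent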

√ buoyant-cloud-agent Deployment is up-to-date

  1. buoyant-cloud-agent Deployment is up-to-date
  2. incorrect app.kubernetes.io/version label: v0.4.3, expected: v0.4.4
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Check the version with:

  1. $ linkerd-buoyant version
  2. CLI version: v0.4.4
  3. Agent version: v0.4.4

To update to the latest version:

  1. linkerd-buoyant install | kubectl apply -f -
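
After re-applying, you can watch the rollout to confirm the updated agent comes up (a sketch):

  1. kubectl -n buoyant-cloud rollout status deploy/buoyant-cloud-agent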

√ buoyant-cloud-agent Deployment is running a single pod

  1. × buoyant-cloud-agent Deployment is running a single pod
  2. expected 1 buoyant-cloud-agent pod, found 2
  3. see https://linkerd.io/checks#l5d-buoyant for hints

buoyant-cloud-agent should run as a singleton. Check for other pods:

  1. kubectl get po -A --selector app=buoyant-cloud-agent
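
If the extra pods belong to the same Deployment, scaling it back to a single replica restores the expected singleton (a sketch; if they belong to a second installation in another namespace, remove that installation instead):

  1. kubectl -n buoyant-cloud scale deploy/buoyant-cloud-agent --replicas=1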

√ buoyant-cloud-metrics DaemonSet exists

  1. × buoyant-cloud-metrics DaemonSet exists
  2. daemonsets.apps "buoyant-cloud-metrics" not found
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure the buoyant-cloud-metrics DaemonSet exists:

  1. kubectl -n buoyant-cloud get daemonset/buoyant-cloud-metrics

If the DaemonSet does not exist, the linkerd-buoyant installation may be missing or incomplete. To reinstall the extension:

  1. linkerd-buoyant install | kubectl apply -f -

√ buoyant-cloud-metrics DaemonSet is running

  1. × buoyant-cloud-metrics DaemonSet is running
  2. no running pods for buoyant-cloud-metrics DaemonSet
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Note that it takes a little while for pods to be scheduled, images to be pulled, and everything to start up. If this is a permanent error, validate the state of the buoyant-cloud-metrics DaemonSet with:

  1. $ kubectl -n buoyant-cloud get po --selector app=buoyant-cloud-metrics
  2. NAME                          READY   STATUS    RESTARTS   AGE
  3. buoyant-cloud-metrics-kt9mv   2/2     Running   0          163m
  4. buoyant-cloud-metrics-q8jhj   2/2     Running   0          163m
  5. buoyant-cloud-metrics-qtflh   2/2     Running   0          164m
  6. buoyant-cloud-metrics-wqs4k   2/2     Running   0          163m

Check a metrics pod’s logs with:

  1. kubectl logs -n buoyant-cloud buoyant-cloud-metrics-kt9mv buoyant-cloud-metrics

√ buoyant-cloud-metrics DaemonSet is injected

  1. × buoyant-cloud-metrics DaemonSet is injected
  2. could not find proxy container for buoyant-cloud-agent-6b8c6888d7-htr7d pod
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Ensure the buoyant-cloud-metrics pods are injected; the READY column should show 2/2:

  1. $ kubectl -n buoyant-cloud get pods --selector app=buoyant-cloud-metrics
  2. NAME                          READY   STATUS    RESTARTS   AGE
  3. buoyant-cloud-metrics-kt9mv   2/2     Running   0          166m
  4. buoyant-cloud-metrics-q8jhj   2/2     Running   0          166m
  5. buoyant-cloud-metrics-qtflh   2/2     Running   0          166m
  6. buoyant-cloud-metrics-wqs4k   2/2     Running   0          166m

Make sure that the proxy-injector is working correctly by running linkerd check.
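
As with the agent, one option (a sketch) is to restart the DaemonSet so its pods are re-created and re-injected by the proxy-injector:

  1. kubectl -n buoyant-cloud rollout restart daemonset/buoyant-cloud-metrics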

√ buoyant-cloud-metrics DaemonSet is up-to-date

  1. buoyant-cloud-metrics DaemonSet is up-to-date
  2. incorrect app.kubernetes.io/version label: v0.4.3, expected: v0.4.4
  3. see https://linkerd.io/checks#l5d-buoyant for hints

Check the version with:

  1. $ kubectl -n buoyant-cloud get daemonset/buoyant-cloud-metrics -o jsonpath='{.metadata.labels}'
  2. {"app.kubernetes.io/name":"metrics","app.kubernetes.io/part-of":"buoyant-cloud","app.kubernetes.io/version":"v0.4.4"}

To update to the latest version:

  1. linkerd-buoyant install | kubectl apply -f -
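
After re-applying, you can watch the DaemonSet roll out and then re-check the version label (a sketch):

  1. kubectl -n buoyant-cloud rollout status daemonset/buoyant-cloud-metrics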