3 - Troubleshooting


Where is everything

Most of the troubleshooting will be done on objects in these 3 namespaces.

  • cattle-system - Rancher deployment and pods.
  • ingress-nginx - Ingress controller pods and services.
  • kube-system - tiller and cert-manager pods.

“default backend - 404”

A number of things can cause the ingress controller not to forward traffic to your Rancher instance. Most of the time it's due to a bad SSL configuration.

Things to check

Is Rancher Running

Use kubectl to check the cattle-system namespace and see if the Rancher pods are in a Running state.

  kubectl --kubeconfig=kube_configxxx.yml -n cattle-system get pods
  NAME                           READY   STATUS    RESTARTS   AGE
  pod/rancher-784d94f59b-vgqzh   1/1     Running   0          10m

If the state is not Running, run a describe on the pod and check the Events.

  kubectl --kubeconfig=kube_configxxx.yml -n cattle-system describe pod
  ...
  Events:
  Type    Reason                 Age   From                Message
  ----    ------                 ----  ----                -------
  Normal  Scheduled              11m   default-scheduler   Successfully assigned rancher-784d94f59b-vgqzh to localhost
  Normal  SuccessfulMountVolume  11m   kubelet, localhost  MountVolume.SetUp succeeded for volume "rancher-token-dj4mt"
  Normal  Pulling                11m   kubelet, localhost  pulling image "rancher/rancher:v2.0.4"
  Normal  Pulled                 11m   kubelet, localhost  Successfully pulled image "rancher/rancher:v2.0.4"
  Normal  Created                11m   kubelet, localhost  Created container
  Normal  Started                11m   kubelet, localhost  Started container
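The two checks above can be combined into a small helper. This is only a sketch: the function names are hypothetical, and it assumes the default `kubectl get pods` column order (NAME, READY, STATUS, ...), with the kubeconfig path passed as the first argument.

```shell
# Print the names of pods whose STATUS column is not "Running".
# Reads `kubectl get pods --no-headers` output on stdin.
not_running_pods() {
  awk '$3 != "Running" { print $1 }'
}

# Describe every pod in cattle-system that is not Running.
# $1 is the path to your kubeconfig file.
describe_unhealthy() {
  kubectl --kubeconfig="$1" -n cattle-system get pods --no-headers |
    not_running_pods |
    while read -r pod; do
      kubectl --kubeconfig="$1" -n cattle-system describe pod "$pod"
    done
}

# Example:
#   describe_unhealthy kube_configxxx.yml
```

If nothing is printed, every pod in the namespace reported a Running status.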

Checking the Rancher logs

Use kubectl to list the pods.

  kubectl --kubeconfig=kube_configxxx.yml -n cattle-system get pods
  NAME                           READY   STATUS    RESTARTS   AGE
  pod/rancher-784d94f59b-vgqzh   1/1     Running   0          10m

Use kubectl and the pod name to list the logs from the pod.

  kubectl --kubeconfig=kube_configxxx.yml -n cattle-system logs -f rancher-784d94f59b-vgqzh
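Rancher logs can be long, so it may help to narrow them to lines that look like failures. The filter below is a rough sketch: the function name and the keyword list are assumptions, not something Rancher defines. `--tail` and `--previous` are standard `kubectl logs` flags.

```shell
# Keep only log lines that mention an error, a fatal condition, or a panic
# (case-insensitive). Reads log text on stdin.
error_lines() {
  grep -iE 'error|fatal|panic'
}

# Examples (pod name taken from the listing above):
#   kubectl --kubeconfig=kube_configxxx.yml -n cattle-system logs --tail=500 rancher-784d94f59b-vgqzh | error_lines
#
# If the pod keeps restarting, the logs of the previous container instance
# are often more useful:
#   kubectl --kubeconfig=kube_configxxx.yml -n cattle-system logs --previous rancher-784d94f59b-vgqzh | error_lines
```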

Cert CN is “Kubernetes Ingress Controller Fake Certificate”

Use your browser to check the certificate details. If it says the Common Name is “Kubernetes Ingress Controller Fake Certificate”, something may have gone wrong with reading or issuing your SSL cert.

Note: if you are using Let's Encrypt to issue certs, it can sometimes take a few minutes to issue the cert.
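The same check can be done from a terminal instead of a browser. This is a minimal sketch, assuming `openssl` is installed and that Rancher is served on port 443. The function names and the `rancher.example.com` hostname are placeholders.

```shell
# Succeed when the given certificate subject belongs to the ingress
# controller's placeholder certificate.
is_fake_cert() {
  case "$1" in
    *"Kubernetes Ingress Controller Fake Certificate"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the subject of the certificate served for hostname $1 on port 443.
served_subject() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null |
    openssl x509 -noout -subject
}

# Example (replace the placeholder hostname with your Rancher hostname):
#   if is_fake_cert "$(served_subject rancher.example.com)"; then
#     echo "the ingress controller is serving its fake certificate"
#   fi
```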

cert-manager issued certs (Rancher Generated or Let's Encrypt)

cert-manager has 3 parts.

  • cert-manager pod in the kube-system namespace.
  • Issuer object in the cattle-system namespace.
  • Certificate object in the cattle-system namespace.

Work backwards: run kubectl --kubeconfig=kube_configxxx.yml describe on each object and check the events to track down what might be missing.

For example, when there is a problem with the Issuer:

  kubectl --kubeconfig=kube_configxxx.yml -n cattle-system describe certificate
  ...
  Events:
  Type     Reason          Age                 From          Message
  ----     ------          ----                ----          -------
  Warning  IssuerNotReady  18s (x23 over 19m)  cert-manager  Issuer rancher not ready

  kubectl --kubeconfig=kube_configxxx.yml -n cattle-system describe issuer
  ...
  Events:
  Type     Reason         Age                 From          Message
  ----     ------         ----                ----          -------
  Warning  ErrInitIssuer  19m (x12 over 19m)  cert-manager  Error initializing issuer: secret "tls-rancher" not found
  Warning  ErrGetKeyPair  9m (x16 over 19m)   cert-manager  Error getting keypair for CA issuer: secret "tls-rancher" not found
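In the example above, both Issuer events point at the same missing secret. A small filter can pull those names out of longer describe output. This is a sketch only: the function name is hypothetical, and it assumes the message wording shown in the events above.

```shell
# Print the unique secret names that events report as "not found".
# Reads `kubectl describe` output on stdin.
missing_secrets() {
  sed -n 's/.*secret "\([^"]*\)" not found.*/\1/p' | sort -u
}

# Examples:
#   kubectl --kubeconfig=kube_configxxx.yml -n cattle-system describe issuer | missing_secrets
#
# Then confirm whether a reported secret really is absent:
#   kubectl --kubeconfig=kube_configxxx.yml -n cattle-system get secret tls-rancher
```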

Bring Your Own SSL Certs

Your certs get applied directly to the Ingress object in the cattle-system namespace.

Check the status of the Ingress object and see if it's ready.

  kubectl --kubeconfig=kube_configxxx.yml -n cattle-system describe ingress

If it's ready and SSL is still not working, you may have a malformed cert or secret.

Check the nginx-ingress-controller logs. Because the nginx-ingress-controller pod has multiple containers, you will need to specify the name of the container.

  kubectl --kubeconfig=kube_configxxx.yml -n ingress-nginx logs -f nginx-ingress-controller-rfjrq nginx-ingress-controller
  ...
  W0705 23:04:58.240571 7 backend_ssl.go:49] error obtaining PEM from secret cattle-system/tls-rancher-ingress: error retrieving secret cattle-system/tls-rancher-ingress: secret cattle-system/tls-rancher-ingress was not found
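If the secret does exist, a quick structural check on the stored certificate can catch an obviously malformed PEM. The sketch below only counts BEGIN/END markers and does not validate the certificate itself. The function name is hypothetical, and the example assumes the secret name from the log line above and the standard `tls.crt` key of a TLS secret.

```shell
# Succeed when stdin contains at least one certificate block and every
# BEGIN CERTIFICATE line has a matching END CERTIFICATE line.
pem_looks_wellformed() {
  awk '
    /^-----BEGIN CERTIFICATE-----$/ { nbegin++ }
    /^-----END CERTIFICATE-----$/   { nend++ }
    END { exit !(nbegin > 0 && nbegin == nend) }
  '
}

# Example:
#   kubectl --kubeconfig=kube_configxxx.yml -n cattle-system get secret tls-rancher-ingress \
#     -o jsonpath='{.data.tls\.crt}' | base64 -d | pem_looks_wellformed \
#     || echo "tls.crt does not look like a well-formed PEM certificate"
```

A file saved with Windows line endings fails this check, because the marker lines no longer match exactly.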