Monitoring

The following sections contain tips to troubleshoot Harvester Monitoring.

Monitoring is unusable

When the Harvester dashboard does not show any monitoring metrics, the cause may be one of the following.

Monitoring is unusable due to Pod being stuck in Terminating status

Harvester Monitoring pods are deployed randomly across the cluster nodes. When a node hosting these pods unexpectedly goes down, the related pods may become stuck in the Terminating status, rendering Monitoring unusable from the web UI.

  $ kubectl get pods -n cattle-monitoring-system
  NAMESPACE NAME READY STATUS RESTARTS AGE
  cattle-monitoring-system prometheus-rancher-monitoring-prometheus-0 3/3 Terminating 0 3d23h
  cattle-monitoring-system rancher-monitoring-admission-create-fwjn9 0/1 Terminating 0 137m
  cattle-monitoring-system rancher-monitoring-crd-create-9wtzf 0/1 Terminating 0 137m
  cattle-monitoring-system rancher-monitoring-grafana-d9c56d79b-ph4nz 3/3 Terminating 0 3d23h
  cattle-monitoring-system rancher-monitoring-grafana-d9c56d79b-t24sz 0/3 Init:0/2 0 132m
  cattle-monitoring-system rancher-monitoring-kube-state-metrics-5bc8bb48bd-nbd92 1/1 Running 4 4d1h
  ...

Monitoring can be recovered by force-deleting the affected pods from the CLI. The cluster then redeploys new pods to replace them.
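
The per-pod deletions below can also be scripted. A minimal sketch, assuming `kubectl` access to the cluster; the `force_delete_stuck_pods` helper name is illustrative, and you should review the pod list before running it:

```shell
# Force-delete every pod in a namespace whose STATUS is not Running or Completed.
# Review the output of `kubectl get pods -n <namespace>` before running this.
force_delete_stuck_pods() {
  ns=$1
  for pod in $(kubectl get pods -n "$ns" --no-headers 2>/dev/null \
      | awk '$3 != "Running" && $3 != "Completed" {print $1}'); do
    kubectl delete pod --force -n "$ns" "$pod"
  done
}
# Usage:
# force_delete_stuck_pods cattle-monitoring-system
```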

  # Delete each non-running pod in the namespace cattle-monitoring-system.
  $ kubectl delete pod --force -n cattle-monitoring-system prometheus-rancher-monitoring-prometheus-0
  pod "prometheus-rancher-monitoring-prometheus-0" force deleted
  $ kubectl delete pod --force -n cattle-monitoring-system rancher-monitoring-admission-create-fwjn9
  $ kubectl delete pod --force -n cattle-monitoring-system rancher-monitoring-crd-create-9wtzf
  $ kubectl delete pod --force -n cattle-monitoring-system rancher-monitoring-grafana-d9c56d79b-ph4nz
  $ kubectl delete pod --force -n cattle-monitoring-system rancher-monitoring-grafana-d9c56d79b-t24sz

Wait a few minutes for the new pods to be created and become ready; the Monitoring dashboard is then usable again.
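
Instead of polling `kubectl get pods` by hand, you can block until the pods report Ready. A sketch; the helper name is illustrative, and the timeout should be adjusted to your environment:

```shell
# Wait up to 10 minutes for every pod in cattle-monitoring-system to be Ready.
wait_monitoring_ready() {
  kubectl wait --for=condition=Ready pod --all \
    -n cattle-monitoring-system --timeout=10m
}
# Usage:
# wait_monitoring_ready
```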

  $ kubectl get pods -n cattle-monitoring-system
  NAME READY STATUS RESTARTS AGE
  prometheus-rancher-monitoring-prometheus-0 0/3 Init:0/1 0 98s
  rancher-monitoring-grafana-d9c56d79b-cp86w 0/3 Init:0/2 0 27s
  ...
  $ kubectl get pods -n cattle-monitoring-system
  NAME READY STATUS RESTARTS AGE
  prometheus-rancher-monitoring-prometheus-0 3/3 Running 0 7m57s
  rancher-monitoring-grafana-d9c56d79b-cp86w 3/3 Running 0 6m46s
  ...

Expand PV/Volume Size

Harvester integrates Longhorn as the default storage provider.

Harvester Monitoring uses Persistent Volumes (PVs) to store its data. After a cluster has been running for some time, a Persistent Volume may need to be expanded.

The following steps, based on the Longhorn volume expansion guide, illustrate how to expand a volume.

View Volume

From Embedded Longhorn WebUI

Access the embedded Longhorn web UI as described in the related Harvester document.

The Longhorn dashboard default view.

Monitoring - Figure 1

Click Volume to list all existing volumes.

Monitoring - Figure 2

From CLI

You can also use kubectl to get all Volumes.

  # kubectl get pvc -A
  NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
  cattle-monitoring-system alertmanager-rancher-monitoring-alertmanager-db-alertmanager-rancher-monitoring-alertmanager-0 Bound pvc-1b2fbbe9-14b1-4a65-941a-7d5645a89977 5Gi RWO harvester-longhorn 43h
  cattle-monitoring-system prometheus-rancher-monitoring-prometheus-db-prometheus-rancher-monitoring-prometheus-0 Bound pvc-7c6dcb61-51a9-4a38-b4c5-acaa11788978 50Gi RWO harvester-longhorn 43h
  cattle-monitoring-system rancher-monitoring-grafana Bound pvc-b2b2c07c-f7cd-4965-90e6-ac3319597bf7 2Gi RWO harvester-longhorn 43h
  # kubectl get volume -A
  NAMESPACE NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
  longhorn-system pvc-1b2fbbe9-14b1-4a65-941a-7d5645a89977 attached degraded 5368709120 harv31 43h
  longhorn-system pvc-7c6dcb61-51a9-4a38-b4c5-acaa11788978 attached degraded 53687091200 harv31 43h
  longhorn-system pvc-b2b2c07c-f7cd-4965-90e6-ac3319597bf7 attached degraded 2147483648 harv31 43h

Scale Down a Deployment

To detach the Volume, you need to scale down the deployment that uses the Volume.

The example below operates on the PVC claimed by rancher-monitoring-grafana.

Find the deployment in the namespace cattle-monitoring-system.

  # kubectl get deployment -n cattle-monitoring-system
  NAME READY UP-TO-DATE AVAILABLE AGE
  rancher-monitoring-grafana 1/1 1 1 43h // target deployment
  rancher-monitoring-kube-state-metrics 1/1 1 1 43h
  rancher-monitoring-operator 1/1 1 1 43h
  rancher-monitoring-prometheus-adapter 1/1 1 1 43h

Scale down the deployment rancher-monitoring-grafana to 0.

  # kubectl scale --replicas=0 deployment/rancher-monitoring-grafana -n cattle-monitoring-system

Check the deployment and the volume.

  # kubectl get deployment -n cattle-monitoring-system
  NAME READY UP-TO-DATE AVAILABLE AGE
  rancher-monitoring-grafana 0/0 0 0 43h // scaled down
  rancher-monitoring-kube-state-metrics 1/1 1 1 43h
  rancher-monitoring-operator 1/1 1 1 43h
  rancher-monitoring-prometheus-adapter 1/1 1 1 43h
  # kubectl get volume -A
  NAMESPACE NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
  longhorn-system pvc-1b2fbbe9-14b1-4a65-941a-7d5645a89977 attached degraded 5368709120 harv31 43h
  longhorn-system pvc-7c6dcb61-51a9-4a38-b4c5-acaa11788978 attached degraded 53687091200 harv31 43h
  longhorn-system pvc-b2b2c07c-f7cd-4965-90e6-ac3319597bf7 detached unknown 2147483648 43h // volume is detached
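
You can also confirm the detached state from the CLI by reading the Longhorn volume's `status.state` field. A sketch; the `volume_state` helper name is illustrative:

```shell
# Print the state (attached/detached) of a Longhorn volume by name.
volume_state() {
  kubectl get volume -n longhorn-system "$1" -o jsonpath='{.status.state}'
}
# Usage (the name below is the example Grafana volume from above):
# volume_state pvc-b2b2c07c-f7cd-4965-90e6-ac3319597bf7
```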

Expand Volume

In the Longhorn WebUI, the related volume becomes Detached. Click the icon in the Operation column, and select Expand Volume.

Monitoring - Figure 3

Input a new size, and Longhorn will expand the volume to this size.

Monitoring - Figure 4
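
If you prefer the CLI over the Longhorn UI, expansion can also be requested by growing the PVC itself, provided the StorageClass allows volume expansion (`allowVolumeExpansion: true`). This is a hedged sketch, not a Harvester-documented command; `expand_pvc` is an illustrative helper name, and you should verify your StorageClass settings first:

```shell
# Request a larger size on a PVC; the CSI driver performs the expansion
# only if the StorageClass has allowVolumeExpansion: true.
expand_pvc() {
  ns=$1; pvc=$2; size=$3
  kubectl patch pvc -n "$ns" "$pvc" --type merge \
    -p "{\"spec\":{\"resources\":{\"requests\":{\"storage\":\"$size\"}}}}"
}
# Usage:
# expand_pvc cattle-monitoring-system rancher-monitoring-grafana 4Gi
```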

Scale Up a Deployment

After the volume is expanded to the target size, scale the deployment back up to its original replica count. In the rancher-monitoring-grafana example above, that count is 1.

  # kubectl scale --replicas=1 deployment/rancher-monitoring-grafana -n cattle-monitoring-system

Check the deployment again.

  # kubectl get deployment -n cattle-monitoring-system
  NAME READY UP-TO-DATE AVAILABLE AGE
  rancher-monitoring-grafana 1/1 1 1 43h // scaled up
  rancher-monitoring-kube-state-metrics 1/1 1 1 43h
  rancher-monitoring-operator 1/1 1 1 43h
  rancher-monitoring-prometheus-adapter 1/1 1 1 43h

The volume is attached to the new pod.

Monitoring - Figure 5

At this point, the volume has been expanded to the new size and the pod is using it without issues.

Failure to Enable the rancher-monitoring Addon

You may encounter this issue when installing a Harvester v1.3.0 or later cluster on a node with the minimum 250 GB disk specified in the hardware requirements.

Steps to Reproduce

  1. Install the Harvester v1.3.0 cluster.

  2. Enable the rancher-monitoring addon. You will observe the following:

  • The pod prometheus-rancher-monitoring-prometheus-0 in the cattle-monitoring-system namespace fails to start because its PVC cannot be attached.

    $ kubectl get pods -n cattle-monitoring-system
    NAME READY STATUS RESTARTS AGE
    alertmanager-rancher-monitoring-alertmanager-0 2/2 Running 0 3m22s
    helm-install-rancher-monitoring-4b5mx 0/1 Completed 0 3m41s
    prometheus-rancher-monitoring-prometheus-0 0/3 Init:0/1 0 3m21s // stuck in this status
    rancher-monitoring-grafana-d6f466988-hgpkb 4/4 Running 0 3m26s
    rancher-monitoring-kube-state-metrics-7659b76cc4-66sr7 1/1 Running 0 3m26s
    rancher-monitoring-operator-595476bc84-7hdxj 1/1 Running 0 3m25s
    rancher-monitoring-prometheus-adapter-55dc9ccd5d-pcrpk 1/1 Running 0 3m26s
    rancher-monitoring-prometheus-node-exporter-pbzv4 1/1 Running 0 3m26s
    $ kubectl describe pod -n cattle-monitoring-system prometheus-rancher-monitoring-prometheus-0
    Name: prometheus-rancher-monitoring-prometheus-0
    Namespace: cattle-monitoring-system
    Priority: 0
    Service Account: rancher-monitoring-prometheus
    ...
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Warning FailedScheduling 3m48s (x3 over 4m15s) default-scheduler 0/1 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
    Normal Scheduled 3m44s default-scheduler Successfully assigned cattle-monitoring-system/prometheus-rancher-monitoring-prometheus-0 to harv41
    Warning FailedMount 101s kubelet Unable to attach or mount volumes: unmounted volumes=[prometheus-rancher-monitoring-prometheus-db], unattached volumes=[prometheus-rancher-monitoring-prometheus-db], failed to process volumes=[]: timed out waiting for the condition
    Warning FailedAttachVolume 90s (x9 over 3m42s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0" : rpc error: code = Aborted desc = volume pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0 is not ready for workloads
    $ kubectl get pvc -A
    NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    cattle-monitoring-system prometheus-rancher-monitoring-prometheus-db-prometheus-rancher-monitoring-prometheus-0 Bound pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0 50Gi RWO harvester-longhorn 7m12s
    $ kubectl get volume -A
    NAMESPACE NAME DATA ENGINE STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
    longhorn-system pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0 v1 detached unknown 53687091200 6m55s
  • The Longhorn manager is unable to schedule the replica.

    $ kubectl logs -n longhorn-system longhorn-manager-bf65b | grep "pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0"
    time="2024-02-19T10:12:56Z" level=error msg="There's no available disk for replica pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0-r-dcb129fd, size 53687091200" func="scheduler.(*ReplicaScheduler).ScheduleReplica" file="replica_scheduler.go:95"
    time="2024-02-19T10:12:56Z" level=warning msg="Failed to schedule replica" func="controller.(*VolumeController).reconcileVolumeCondition" file="volume_controller.go:1694" accessMode=rwo controller=longhorn-volume frontend=blockdev migratable=false node=harv41 owner=harv41 replica=pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0-r-dcb129fd state= volume=pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0
    ...
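
The longhorn-manager pod name (the bf65b suffix above) varies per node. A sketch for collecting the scheduling errors from all manager pods; the helper name and the `app=longhorn-manager` label selector are assumptions based on the standard Longhorn deployment:

```shell
# Grep every longhorn-manager pod's log for messages about a given volume.
find_replica_schedule_errors() {
  pvc=$1
  for p in $(kubectl get pods -n longhorn-system -l app=longhorn-manager \
      -o jsonpath='{.items[*].metadata.name}'); do
    kubectl logs -n longhorn-system "$p" | grep "$pvc"
  done
}
# Usage:
# find_replica_schedule_errors pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0
```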

Workaround

  1. Disable the rancher-monitoring addon if you have already enabled it.

    All pods in cattle-monitoring-system are deleted, but the PVCs are retained. For more information, see Addons.

    $ kubectl get pods -n cattle-monitoring-system
    No resources found in cattle-monitoring-system namespace.
    $ kubectl get pvc -n cattle-monitoring-system
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    alertmanager-rancher-monitoring-alertmanager-db-alertmanager-rancher-monitoring-alertmanager-0 Bound pvc-cea6316e-f74f-4771-870b-49edb5442819 5Gi RWO harvester-longhorn 14m
    prometheus-rancher-monitoring-prometheus-db-prometheus-rancher-monitoring-prometheus-0 Bound pvc-bbe8760d-926c-484a-851c-b8ec29ae05c0 50Gi RWO harvester-longhorn 14m
  2. Delete the Prometheus PVC, but retain the Alertmanager PVC.

    $ kubectl delete pvc -n cattle-monitoring-system prometheus-rancher-monitoring-prometheus-db-prometheus-rancher-monitoring-prometheus-0
    persistentvolumeclaim "prometheus-rancher-monitoring-prometheus-db-prometheus-rancher-monitoring-prometheus-0" deleted
    $ kubectl get pvc -n cattle-monitoring-system
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    alertmanager-rancher-monitoring-alertmanager-db-alertmanager-rancher-monitoring-alertmanager-0 Bound pvc-cea6316e-f74f-4771-870b-49edb5442819 5Gi RWO harvester-longhorn 16m
  3. On the Addons screen of the Harvester UI, select (menu icon) and then select Edit YAML.

    Monitoring - Figure 6

  4. As indicated below, change the two occurrences of the number 50 to 30 under prometheusSpec, and then save. Prometheus will then use a 30 GiB volume to store data.

    Monitoring - Figure 7

    Alternatively, you can use kubectl to edit the object.

    kubectl edit addons.harvesterhci.io -n cattle-monitoring-system rancher-monitoring

    retentionSize: 50GiB  # Change 50 to 30
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 50Gi  # Change 50 to 30
          storageClassName: harvester-longhorn
  5. Enable the rancher-monitoring addon and wait for a few minutes.

  6. All pods are successfully deployed, and the rancher-monitoring feature is available.

    $ kubectl get pods -n cattle-monitoring-system
    NAME READY STATUS RESTARTS AGE
    alertmanager-rancher-monitoring-alertmanager-0 2/2 Running 0 3m52s
    helm-install-rancher-monitoring-s55tq 0/1 Completed 0 4m17s
    prometheus-rancher-monitoring-prometheus-0 3/3 Running 0 3m51s
    rancher-monitoring-grafana-d6f466988-hkv6f 4/4 Running 0 3m55s
    rancher-monitoring-kube-state-metrics-7659b76cc4-ght8x 1/1 Running 0 3m55s
    rancher-monitoring-operator-595476bc84-r96bp 1/1 Running 0 3m55s
    rancher-monitoring-prometheus-adapter-55dc9ccd5d-vtssc 1/1 Running 0 3m55s
    rancher-monitoring-prometheus-node-exporter-lgb88 1/1 Running 0 3m55s

rancher-monitoring-crd ManagedChart State is Modified

Issue Description

In certain situations, the state of the rancher-monitoring-crd ManagedChart object changes to Modified (with the message ...rancher-monitoring-crd-manager missing...).

Example:

  $ kubectl get managedchart rancher-monitoring-crd -n fleet-local -o yaml
  apiVersion: management.cattle.io/v3
  kind: ManagedChart
  ...
  spec:
    chart: rancher-monitoring-crd
    defaultNamespace: cattle-monitoring-system
    paused: false
    releaseName: rancher-monitoring-crd
    repoName: harvester-charts
    targets:
    - clusterName: local
      clusterSelector:
        matchExpressions:
        - key: provisioning.cattle.io/unmanaged-system-agent
          operator: DoesNotExist
    version: 102.0.0+up40.1.2
  ...
  status:
    conditions:
    - lastUpdateTime: "2024-02-22T14:03:11Z"
      message: Modified(1) [Cluster fleet-local/local]; clusterrole.rbac.authorization.k8s.io
        rancher-monitoring-crd-manager missing; clusterrolebinding.rbac.authorization.k8s.io
        rancher-monitoring-crd-manager missing; configmap.v1 cattle-monitoring-system/rancher-monitoring-crd-manifest
        missing; serviceaccount.v1 cattle-monitoring-system/rancher-monitoring-crd-manager
        missing
      status: "False"
      type: Ready
    - lastUpdateTime: "2024-02-22T14:03:11Z"
      status: "True"
      type: Processed
    - lastUpdateTime: "2024-04-02T07:45:26Z"
      status: "True"
      type: Defined
    display:
      readyClusters: 0/1
      state: Modified
  ...

The ManagedChart object has a downstream Bundle object, which carries similar information.

Example:

  $ kubectl get bundles -A
  NAMESPACE NAME BUNDLEDEPLOYMENTS-READY STATUS
  fleet-local fleet-agent-local 1/1
  fleet-local local-managed-system-agent 1/1
  fleet-local mcc-harvester 1/1
  fleet-local mcc-harvester-crd 1/1
  fleet-local mcc-local-managed-system-upgrade-controller 1/1
  fleet-local mcc-rancher-logging-crd 1/1
  fleet-local mcc-rancher-monitoring-crd 0/1 Modified(1) [Cluster fleet-local/local]; clusterrole.rbac.authorization.k8s.io rancher-monitoring-crd-manager missing; clusterrolebinding.rbac.authorization.k8s.io rancher-monitoring-crd-manager missing; configmap.v1 cattle-monitoring-system/rancher-monitoring-crd-manifest missing; serviceaccount.v1 cattle-monitoring-system/rancher-monitoring-crd-manager missing

When the issue exists and you start an upgrade, Harvester may return the following error message: admission webhook "validator.harvesterhci.io" denied the request: managed chart rancher-monitoring-crd is not ready, please wait for it to be ready.
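
Before retrying an upgrade, you can check the summary state directly. A sketch; the helper name is illustrative, and the `status.display.state` field path is inferred from the status output shown above:

```shell
# Print the summary state (e.g. Modified, Ready) of a ManagedChart object.
managedchart_state() {
  kubectl get managedchart -n fleet-local "$1" \
    -o jsonpath='{.status.display.state}'
}
# Usage:
# managedchart_state rancher-monitoring-crd
```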

Also, when you search for the objects marked as missing, you will find that they exist in the cluster.

Example:

  $ kubectl get clusterrole rancher-monitoring-crd-manager -o yaml
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    annotations:
      meta.helm.sh/release-name: rancher-monitoring-crd
      meta.helm.sh/release-namespace: cattle-monitoring-system
    creationTimestamp: "2023-01-09T11:04:33Z"
    labels:
      app: rancher-monitoring-crd-manager
      app.kubernetes.io/managed-by: Helm
    name: rancher-monitoring-crd-manager
    ...
  rules:
  - apiGroups:
    - apiextensions.k8s.io
    resources:
    - customresourcedefinitions
    verbs:
    - create
    - get
    - patch
    - delete
  $ kubectl get clusterrolebinding rancher-monitoring-crd-manager -o yaml
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    annotations:
      meta.helm.sh/release-name: rancher-monitoring-crd
      meta.helm.sh/release-namespace: cattle-monitoring-system
    creationTimestamp: "2023-01-09T11:04:33Z"
    labels:
      app: rancher-monitoring-crd-manager
      app.kubernetes.io/managed-by: Helm
    name: rancher-monitoring-crd-manager
    ...
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: rancher-monitoring-crd-manager
  subjects:
  - kind: ServiceAccount
    name: rancher-monitoring-crd-manager
    namespace: cattle-monitoring-system
  $ kubectl get configmap -n cattle-monitoring-system rancher-monitoring-crd-manifest -o yaml
  apiVersion: v1
  data:
    crd-manifest.tgz.b64: ...
  kind: ConfigMap
  metadata:
    annotations:
      meta.helm.sh/release-name: rancher-monitoring-crd
      meta.helm.sh/release-namespace: cattle-monitoring-system
    creationTimestamp: "2023-01-09T11:04:33Z"
    labels:
      app.kubernetes.io/managed-by: Helm
    name: rancher-monitoring-crd-manifest
    namespace: cattle-monitoring-system
    ...
  $ kubectl get serviceaccount -n cattle-monitoring-system rancher-monitoring-crd-manager -o yaml
  apiVersion: v1
  kind: ServiceAccount
  metadata:
    annotations:
      meta.helm.sh/release-name: rancher-monitoring-crd
      meta.helm.sh/release-namespace: cattle-monitoring-system
    creationTimestamp: "2023-01-09T11:04:33Z"
    labels:
      app: rancher-monitoring-crd-manager
      app.kubernetes.io/managed-by: Helm
    name: rancher-monitoring-crd-manager
    namespace: cattle-monitoring-system
  ...

Root Cause

The objects marked as missing lack the annotations and labels that the ManagedChart object requires.

Example:

  # One of the manually recreated objects:
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    annotations:
      meta.helm.sh/release-name: rancher-monitoring-crd
      meta.helm.sh/release-namespace: cattle-monitoring-system
      objectset.rio.cattle.io/id: default-mcc-rancher-monitoring-crd-cattle-fleet-local-system # This required item is not in the above object.
    creationTimestamp: "2024-04-03T10:23:55Z"
    labels:
      app: rancher-monitoring-crd-manager
      app.kubernetes.io/managed-by: Helm
      objectset.rio.cattle.io/hash: 2da503261617e9ea2da822d2da7cdcfccad847a9 # This required item is not in the above object.
    name: rancher-monitoring-crd-manager
    ...
  rules:
  - apiGroups:
    - apiextensions.k8s.io
    resources:
    - customresourcedefinitions
    verbs:
    - create
    - get
    - patch
    - delete
    - update
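
A quick way to check whether a given object already carries the Fleet object-set markers discussed above; empty or "none found" output means they are missing. The `fleet_markers` helper name is illustrative:

```shell
# List any objectset.rio.cattle.io annotations/labels present on an object.
fleet_markers() {
  kubectl get "$@" -o yaml | grep 'objectset.rio.cattle.io' || echo "none found"
}
# Usage:
# fleet_markers clusterrole rancher-monitoring-crd-manager
# fleet_markers serviceaccount rancher-monitoring-crd-manager -n cattle-monitoring-system
```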

Workaround

  1. Patch the ClusterRole object rancher-monitoring-crd-manager to add the update verb.

    $ cat > patchrules.yaml << EOF
    rules:
    - apiGroups:
      - apiextensions.k8s.io
      resources:
      - customresourcedefinitions
      verbs:
      - create
      - get
      - patch
      - delete
      - update
    EOF
    $ kubectl patch ClusterRole rancher-monitoring-crd-manager --patch-file ./patchrules.yaml --type merge
    $ rm ./patchrules.yaml
  2. Patch the objects marked as missing to add the required annotations and labels.

    $ cat > patchhash.yaml << EOF
    metadata:
      annotations:
        objectset.rio.cattle.io/id: default-mcc-rancher-monitoring-crd-cattle-fleet-local-system
      labels:
        objectset.rio.cattle.io/hash: 2da503261617e9ea2da822d2da7cdcfccad847a9
    EOF
    $ kubectl patch ClusterRole rancher-monitoring-crd-manager --patch-file ./patchhash.yaml --type merge
    $ kubectl patch ClusterRoleBinding rancher-monitoring-crd-manager --patch-file ./patchhash.yaml --type merge
    $ kubectl patch ServiceAccount rancher-monitoring-crd-manager -n cattle-monitoring-system --patch-file ./patchhash.yaml --type merge
    $ kubectl patch ConfigMap rancher-monitoring-crd-manifest -n cattle-monitoring-system --patch-file ./patchhash.yaml --type merge
    $ rm ./patchhash.yaml
  3. Check the rancher-monitoring-crd ManagedChart object.

    After a few seconds, the status of the rancher-monitoring-crd ManagedChart object changes to Ready.

    $ kubectl get managedchart -n fleet-local rancher-monitoring-crd -oyaml
    apiVersion: management.cattle.io/v3
    kind: ManagedChart
    metadata:
      ...
      name: rancher-monitoring-crd
      namespace: fleet-local
      ...
    status:
      conditions:
      - lastUpdateTime: "2024-04-22T21:41:44Z"
        status: "True"
        type: Ready
      ...

    Also, error indicators are no longer displayed for the downstream objects.

    $ kubectl get bundles -A
    NAMESPACE NAME BUNDLEDEPLOYMENTS-READY STATUS
    fleet-local fleet-agent-local 1/1
    fleet-local local-managed-system-agent 1/1
    fleet-local mcc-harvester 1/1
    fleet-local mcc-harvester-crd 1/1
    fleet-local mcc-local-managed-system-upgrade-controller 1/1
    fleet-local mcc-rancher-logging-crd 1/1
    fleet-local mcc-rancher-monitoring-crd 1/1
  4. (Optional) Retry the upgrade (if previously unsuccessful because of this issue).

Related issue: https://github.com/harvester/harvester/issues/5505