Deploy a TiDB Cluster across Multiple Kubernetes Clusters

Deploying a TiDB cluster across multiple Kubernetes clusters means deploying one TiDB cluster on multiple interconnected Kubernetes clusters. Each component of the cluster is distributed across the Kubernetes clusters to achieve disaster recovery among them. An interconnected network of Kubernetes clusters means that Pod IP addresses are reachable in any cluster and between clusters, and that Pod FQDN records can be resolved by querying the DNS service in any cluster and between clusters.

Prerequisites

You need to configure the Kubernetes network and DNS so that the Kubernetes cluster meets the following conditions:

  • The TiDB components on each Kubernetes cluster can access the Pod IP of all TiDB components in and between clusters.
  • The TiDB components on each Kubernetes cluster can look up the Pod FQDN of all TiDB components in and between clusters.

To build multiple connected EKS or GKE clusters, refer to Build Multiple Interconnected AWS EKS Clusters or Build Multiple Interconnected GCP GKE Clusters.
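
You can verify these conditions before deployment. The following is a minimal sketch that checks whether a Service FQDN under the other cluster's domain is resolvable from a temporary Pod; ${cluster_domain_2} stands for the other cluster's domain (the placeholder convention is introduced in the next sections), and busybox is only an example image:

  # From the first Kubernetes cluster, resolve a well-known Service of the
  # second cluster through that cluster's domain.
  kubectl run -it --rm dns-check --image=busybox --restart=Never -- \
      nslookup kubernetes.default.svc.${cluster_domain_2}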

Supported scenarios

Currently supported scenarios:

  • Deploy a new TiDB cluster across multiple Kubernetes clusters.
  • Deploy new TiDB clusters that enable this feature on other Kubernetes clusters and have them join the initial TiDB cluster.

Experimentally supported scenarios:

  • Enable this feature for a cluster that already has data. If you need to perform this action in a production environment, it is recommended to do so through data migration.

Unsupported scenarios:

  • Interconnecting two clusters that both already have data. If you need to do this, perform the action through data migration.

Deploy a cluster across multiple Kubernetes clusters

Before you deploy a TiDB cluster across multiple Kubernetes clusters, you need to first deploy the Kubernetes clusters required for this operation. The following deployment assumes that you have completed Kubernetes deployment.

The following takes the deployment of one TiDB cluster across two Kubernetes clusters as an example. One TidbCluster is deployed in each Kubernetes cluster.

In the following sections, ${tc_name_1} and ${tc_name_2} refer to the name of TidbCluster that will be deployed in each Kubernetes cluster. ${namespace_1} and ${namespace_2} refer to the namespace of TidbCluster. ${cluster_domain_1} and ${cluster_domain_2} refer to the Cluster Domain of each Kubernetes cluster.
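
For convenience, you can export these placeholders as shell variables before running the commands below. The values here are examples only; replace them with your own cluster information:

  # Example values only; adjust to your deployment.
  export tc_name_1="tc1"
  export tc_name_2="tc2"
  export namespace_1="pingcap"
  export namespace_2="pingcap"
  export cluster_domain_1="cluster1.com"
  export cluster_domain_2="cluster2.com"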

Step 1. Deploy the initial TidbCluster

Create and deploy the initial TidbCluster.

  cat << EOF | kubectl apply -n ${namespace_1} -f -
  apiVersion: pingcap.com/v1alpha1
  kind: TidbCluster
  metadata:
    name: "${tc_name_1}"
  spec:
    version: v5.4.0
    timezone: UTC
    pvReclaimPolicy: Delete
    enableDynamicConfiguration: true
    configUpdateStrategy: RollingUpdate
    clusterDomain: "${cluster_domain_1}"
    acrossK8s: true
    discovery: {}
    pd:
      baseImage: pingcap/pd
      maxFailoverCount: 0
      replicas: 1
      requests:
        storage: "10Gi"
      config: {}
    tikv:
      baseImage: pingcap/tikv
      maxFailoverCount: 0
      replicas: 1
      requests:
        storage: "10Gi"
      config: {}
    tidb:
      baseImage: pingcap/tidb
      maxFailoverCount: 0
      replicas: 1
      service:
        type: ClusterIP
      config: {}
  EOF

The descriptions of the related fields are as follows:

  • spec.acrossK8s: Specifies whether the TiDB cluster is deployed across Kubernetes clusters. In this example, this field must be set to true.

  • spec.clusterDomain: If this field is set, the Pod FQDN which contains the cluster domain is used as the address for inter-component access.

    Take Pod ${tc_name}-pd-0 as an example: Pods in other Kubernetes clusters access this Pod using the ${tc_name}-pd-0.${tc_name}-pd-peer.${ns}.svc.${cluster_domain} address.

    If the cluster domain is required when Pods access the Pod FQDN of another Kubernetes cluster, you must set this field. One way to check a cluster's domain is sketched below.
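
    A minimal sketch of checking a Kubernetes cluster's domain, assuming the cluster's DNS service populates the Pod search domains:

      # The "search" entries in resolv.conf end with the cluster domain, for
      # example "svc.cluster1.com" implies the cluster domain "cluster1.com".
      kubectl run -it --rm domain-check --image=busybox --restart=Never -- \
          cat /etc/resolv.conf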

Step 2. Deploy the new TidbCluster to join the TiDB cluster

After the initial cluster completes the deployment, you can deploy the new TidbCluster to join the TiDB cluster. You can create a new TidbCluster to join any existing TidbCluster.

  cat << EOF | kubectl apply -n ${namespace_2} -f -
  apiVersion: pingcap.com/v1alpha1
  kind: TidbCluster
  metadata:
    name: "${tc_name_2}"
  spec:
    version: v5.4.0
    timezone: UTC
    pvReclaimPolicy: Delete
    enableDynamicConfiguration: true
    configUpdateStrategy: RollingUpdate
    clusterDomain: "${cluster_domain_2}"
    acrossK8s: true
    cluster:
      name: "${tc_name_1}"
      namespace: "${namespace_1}"
      clusterDomain: "${cluster_domain_1}"
    discovery: {}
    pd:
      baseImage: pingcap/pd
      maxFailoverCount: 0
      replicas: 1
      requests:
        storage: "10Gi"
      config: {}
    tikv:
      baseImage: pingcap/tikv
      maxFailoverCount: 0
      replicas: 1
      requests:
        storage: "10Gi"
      config: {}
    tidb:
      baseImage: pingcap/tidb
      maxFailoverCount: 0
      replicas: 1
      service:
        type: ClusterIP
      config: {}
  EOF
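
After the second TidbCluster is created, you can optionally confirm that it has joined the initial TiDB cluster, for example by listing the PD members from a PD Pod of the initial cluster. This is a minimal sketch, assuming pd-ctl is available at /pd-ctl in the pingcap/pd image:

  # PD Pods from both TidbCluster objects should appear in the member list.
  kubectl exec -n ${namespace_1} ${tc_name_1}-pd-0 -- \
      /pd-ctl -u http://127.0.0.1:2379 member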

Deploy the TLS-enabled TiDB cluster across multiple Kubernetes clusters

You can follow the steps below to enable TLS between TiDB components for TiDB clusters deployed across multiple Kubernetes clusters.

The following takes the deployment of a TiDB cluster across two Kubernetes clusters as an example. One TidbCluster is deployed in each Kubernetes cluster.

In the following sections, ${tc_name_1} and ${tc_name_2} refer to the name of TidbCluster that will be deployed in each Kubernetes cluster. ${namespace_1} and ${namespace_2} refer to the namespace of TidbCluster. ${cluster_domain_1} and ${cluster_domain_2} refer to the Cluster Domain of each Kubernetes cluster.

Step 1. Issue the root certificate

Use cfssl

If you use cfssl, issuing the CA certificate follows the same process as the general one. You need to save the CA certificate created the first time and use it when you issue certificates for TiDB components later.

In other words, when you create component certificates for a new cluster, you do not need to create the CA certificate again. Complete steps 1-4 in Enabling TLS between TiDB components once to issue the CA certificate. After that, start from step 5 to issue certificates for the components of the other clusters; the CA step is sketched below.
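
For reference, the CA issue step with cfssl looks roughly as follows, assuming ca-csr.json is prepared as described in that document:

  # Generate the shared CA once; reuse ca.pem and ca-key.pem when signing
  # component certificates for every Kubernetes cluster.
  cfssl gencert -initca ca-csr.json | cfssljson -bare ca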

Use cert-manager

If you use cert-manager, you only need to create a CA Issuer and a CA Certificate in the initial cluster, and export the CA Secret to other new clusters that want to join.

For other clusters, you only need to create a component certificate Issuer (referred to as ${cluster_name}-tidb-issuer in the TLS document) and configure the Issuer to use this CA. The detailed process is as follows:

  1. Create a CA Issuer and a CA Certificate in the initial Kubernetes cluster.

    Run the following command:

    cat <<EOF | kubectl apply -f -
    apiVersion: cert-manager.io/v1
    kind: Issuer
    metadata:
      name: ${tc_name_1}-selfsigned-ca-issuer
      namespace: ${namespace_1}
    spec:
      selfSigned: {}
    ---
    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: ${tc_name_1}-ca
      namespace: ${namespace_1}
    spec:
      secretName: ${tc_name_1}-ca-secret
      commonName: "TiDB"
      isCA: true
      duration: 87600h # 10yrs
      renewBefore: 720h # 30d
      issuerRef:
        name: ${tc_name_1}-selfsigned-ca-issuer
        kind: Issuer
    EOF
  2. Export the CA and delete irrelevant information.

    First, you need to export the Secret that stores the CA. The name of the Secret can be obtained from .spec.secretName of the Certificate YAML file in the first step.

    kubectl get secret ${tc_name_1}-ca-secret -n ${namespace_1} -o yaml > ca.yaml

    Delete irrelevant information in the Secret YAML file (such as the metadata fields automatically added by Kubernetes). After the deletion, the YAML file is as follows (the information in data is omitted). Note that metadata.name is changed to ${tc_name_2}-ca-secret so that the Issuer created later in the new cluster can reference this Secret:

    apiVersion: v1
    data:
      ca.crt: LS0...LQo=
      tls.crt: LS0t....LQo=
      tls.key: LS0t...tCg==
    kind: Secret
    metadata:
      name: ${tc_name_2}-ca-secret
    type: kubernetes.io/tls
  3. Import the exported CA to other clusters.

    You need to configure the namespace so that related components can access the CA certificate:

    kubectl apply -f ca.yaml -n ${namespace_2}
  4. Create a component certificate Issuer in all Kubernetes clusters and configure it to use this CA.

    1. In the initial Kubernetes cluster, create an Issuer that issues certificates between TiDB components.

      Run the following command:

      cat << EOF | kubectl apply -f -
      apiVersion: cert-manager.io/v1
      kind: Issuer
      metadata:
        name: ${tc_name_1}-tidb-issuer
        namespace: ${namespace_1}
      spec:
        ca:
          secretName: ${tc_name_1}-ca-secret
      EOF
    2. In other Kubernetes clusters, create an Issuer that issues certificates between TiDB components.

      Run the following command:

      cat << EOF | kubectl apply -f -
      apiVersion: cert-manager.io/v1
      kind: Issuer
      metadata:
        name: ${tc_name_2}-tidb-issuer
        namespace: ${namespace_2}
      spec:
        ca:
          secretName: ${tc_name_2}-ca-secret
      EOF
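
After the Issuers are created, you can optionally check that cert-manager reports them as Ready in their respective clusters. Each command below runs against the kubeconfig context of the corresponding Kubernetes cluster:

  # Run in the initial Kubernetes cluster.
  kubectl get issuer ${tc_name_1}-tidb-issuer -n ${namespace_1}
  # Run in the other Kubernetes cluster.
  kubectl get issuer ${tc_name_2}-tidb-issuer -n ${namespace_2}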

Step 2. Issue certificates for the TiDB components of each Kubernetes cluster

You need to issue a component certificate for each TiDB component on each Kubernetes cluster. When issuing a component certificate, you need to add host records ending with .${cluster_domain} to the hosts list. For example, the record for the initial TidbCluster is ${tc_name_1}-pd.${namespace_1}.svc.${cluster_domain_1}.

Use the cfssl system to issue certificates for TiDB components

The following example shows how to use cfssl to create a certificate used by PD. Run the following command to create the pd-server.json file for the initial TidbCluster.

  cat << EOF > pd-server.json
  {
      "CN": "TiDB",
      "hosts": [
          "127.0.0.1",
          "::1",
          "${tc_name_1}-pd",
          "${tc_name_1}-pd.${namespace_1}",
          "${tc_name_1}-pd.${namespace_1}.svc",
          "${tc_name_1}-pd.${namespace_1}.svc.${cluster_domain_1}",
          "${tc_name_1}-pd-peer",
          "${tc_name_1}-pd-peer.${namespace_1}",
          "${tc_name_1}-pd-peer.${namespace_1}.svc",
          "${tc_name_1}-pd-peer.${namespace_1}.svc.${cluster_domain_1}",
          "*.${tc_name_1}-pd-peer",
          "*.${tc_name_1}-pd-peer.${namespace_1}",
          "*.${tc_name_1}-pd-peer.${namespace_1}.svc",
          "*.${tc_name_1}-pd-peer.${namespace_1}.svc.${cluster_domain_1}"
      ],
      "key": {
          "algo": "ecdsa",
          "size": 256
      },
      "names": [
          {
              "C": "US",
              "L": "CA",
              "ST": "San Francisco"
          }
      ]
  }
  EOF
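
After creating pd-server.json, sign the PD certificate with the shared CA and store it in a Secret in the initial Kubernetes cluster. The following is a sketch that assumes the ca.pem, ca-key.pem, and ca-config.json files (with an internal profile) were created by following Enable TLS between TiDB Components:

  # Sign the PD server certificate with the shared CA.
  cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
      -config=ca-config.json -profile=internal \
      pd-server.json | cfssljson -bare pd-server

  # Create the Secret for the PD component in the initial cluster.
  kubectl create secret generic ${tc_name_1}-pd-cluster-secret \
      --namespace=${namespace_1} \
      --from-file=tls.crt=pd-server.pem \
      --from-file=tls.key=pd-server-key.pem \
      --from-file=ca.crt=ca.pem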

Use the cert-manager system to issue certificates for TiDB components

The following example shows how to use cert-manager to create a certificate used by PD for the initial TidbCluster. The Certificate is shown below.

  cat << EOF | kubectl apply -f -
  apiVersion: cert-manager.io/v1
  kind: Certificate
  metadata:
    name: ${tc_name_1}-pd-cluster-secret
    namespace: ${namespace_1}
  spec:
    secretName: ${tc_name_1}-pd-cluster-secret
    duration: 8760h # 365d
    renewBefore: 360h # 15d
    subject:
      organizations:
        - PingCAP
    commonName: "TiDB"
    usages:
      - server auth
      - client auth
    dnsNames:
      - "${tc_name_1}-pd"
      - "${tc_name_1}-pd.${namespace_1}"
      - "${tc_name_1}-pd.${namespace_1}.svc"
      - "${tc_name_1}-pd.${namespace_1}.svc.${cluster_domain_1}"
      - "${tc_name_1}-pd-peer"
      - "${tc_name_1}-pd-peer.${namespace_1}"
      - "${tc_name_1}-pd-peer.${namespace_1}.svc"
      - "${tc_name_1}-pd-peer.${namespace_1}.svc.${cluster_domain_1}"
      - "*.${tc_name_1}-pd-peer"
      - "*.${tc_name_1}-pd-peer.${namespace_1}"
      - "*.${tc_name_1}-pd-peer.${namespace_1}.svc"
      - "*.${tc_name_1}-pd-peer.${namespace_1}.svc.${cluster_domain_1}"
    ipAddresses:
      - 127.0.0.1
      - ::1
    issuerRef:
      name: ${tc_name_1}-tidb-issuer
      kind: Issuer
      group: cert-manager.io
  EOF
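
After cert-manager issues the certificate, the Secret named in spec.secretName appears in the namespace. A quick check, assuming the names used above:

  kubectl get certificate ${tc_name_1}-pd-cluster-secret -n ${namespace_1}
  kubectl get secret ${tc_name_1}-pd-cluster-secret -n ${namespace_1}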

Refer to the TLS-related documents (such as Enable TLS between TiDB Components and Enable TLS for the MySQL Client), issue the corresponding certificates for each component, and create the Secrets in the corresponding Kubernetes clusters.

Step 3. Deploy the initial TidbCluster

Run the following commands to deploy the initial TidbCluster. The following YAML file enables the TLS feature and configures cert-allowed-cn so that each component verifies certificates whose Common Name (CN) is TiDB and that are issued by the CA.

  cat << EOF | kubectl apply -n ${namespace_1} -f -
  apiVersion: pingcap.com/v1alpha1
  kind: TidbCluster
  metadata:
    name: "${tc_name_1}"
  spec:
    version: v5.4.0
    timezone: UTC
    tlsCluster:
      enabled: true
    pvReclaimPolicy: Delete
    enableDynamicConfiguration: true
    configUpdateStrategy: RollingUpdate
    clusterDomain: "${cluster_domain_1}"
    acrossK8s: true
    discovery: {}
    pd:
      baseImage: pingcap/pd
      maxFailoverCount: 0
      replicas: 1
      requests:
        storage: "10Gi"
      config:
        security:
          cert-allowed-cn:
            - TiDB
    tikv:
      baseImage: pingcap/tikv
      maxFailoverCount: 0
      replicas: 1
      requests:
        storage: "10Gi"
      config:
        security:
          cert-allowed-cn:
            - TiDB
    tidb:
      baseImage: pingcap/tidb
      maxFailoverCount: 0
      replicas: 1
      service:
        type: ClusterIP
      tlsClient:
        enabled: true
      config:
        security:
          cert-allowed-cn:
            - TiDB
  EOF

Step 4. Deploy a new TidbCluster to join the TiDB cluster

After the initial cluster completes the deployment, you can deploy the new TidbCluster to join the TiDB cluster. You can create a new TidbCluster to join any existing TidbCluster.

  cat << EOF | kubectl apply -n ${namespace_2} -f -
  apiVersion: pingcap.com/v1alpha1
  kind: TidbCluster
  metadata:
    name: "${tc_name_2}"
  spec:
    version: v5.4.0
    timezone: UTC
    tlsCluster:
      enabled: true
    pvReclaimPolicy: Delete
    enableDynamicConfiguration: true
    configUpdateStrategy: RollingUpdate
    clusterDomain: "${cluster_domain_2}"
    acrossK8s: true
    cluster:
      name: "${tc_name_1}"
      namespace: "${namespace_1}"
      clusterDomain: "${cluster_domain_1}"
    discovery: {}
    pd:
      baseImage: pingcap/pd
      maxFailoverCount: 0
      replicas: 1
      requests:
        storage: "10Gi"
      config:
        security:
          cert-allowed-cn:
            - TiDB
    tikv:
      baseImage: pingcap/tikv
      maxFailoverCount: 0
      replicas: 1
      requests:
        storage: "10Gi"
      config:
        security:
          cert-allowed-cn:
            - TiDB
    tidb:
      baseImage: pingcap/tidb
      maxFailoverCount: 0
      replicas: 1
      service:
        type: ClusterIP
      tlsClient:
        enabled: true
      config:
        security:
          cert-allowed-cn:
            - TiDB
  EOF
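
After both TidbCluster objects are deployed, you can check that all component Pods become Running in each Kubernetes cluster, for example:

  kubectl get pods -l app.kubernetes.io/instance=${tc_name_1} -n ${namespace_1}
  kubectl get pods -l app.kubernetes.io/instance=${tc_name_2} -n ${namespace_2}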

Upgrade TiDB Cluster

For a TiDB cluster deployed across Kubernetes clusters, to perform a rolling upgrade of each component's Pods, take the following steps in sequence to modify the version configuration of each component in the TidbCluster spec for each Kubernetes cluster. A sketch of one upgrade round follows this list.

  1. Upgrade PD versions for all Kubernetes clusters.

    1. Modify the spec.pd.version field in the spec for the initial TidbCluster.

      apiVersion: pingcap.com/v1alpha1
      kind: TidbCluster
      # ...
      spec:
        pd:
          version: ${version}
    2. Watch the status of PD Pods and wait for PD Pods in the initial TidbCluster to finish recreation and become Running.

    3. Repeat the first two substeps to upgrade all PD Pods in other TidbCluster.

  2. Take step 1 as an example, perform the following upgrade operations in sequence:

    1. If TiFlash is deployed in clusters, upgrade the TiFlash versions for all the Kubernetes clusters that have TiFlash deployed.
    2. Upgrade TiKV versions for all Kubernetes clusters.
    3. If Pump is deployed in clusters, upgrade the Pump versions for all the Kubernetes clusters that have Pump deployed.
    4. Upgrade TiDB versions for all Kubernetes clusters.
    5. If TiCDC is deployed in clusters, upgrade the TiCDC versions for all the Kubernetes clusters that have TiCDC deployed.
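
The following sketch shows one upgrade round (PD in the initial TidbCluster); ${version} stands for the target version, and the label selector assumes the standard labels that TiDB Operator applies to component Pods:

  # Update the PD version of the initial TidbCluster.
  kubectl patch tc ${tc_name_1} -n ${namespace_1} --type merge \
      -p "{\"spec\":{\"pd\":{\"version\":\"${version}\"}}}"

  # Watch the PD Pods until they are recreated and become Running.
  kubectl get pods -n ${namespace_1} \
      -l app.kubernetes.io/instance=${tc_name_1},app.kubernetes.io/component=pd -w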

Exit and reclaim a TidbCluster that has joined a cross-Kubernetes cluster deployment

When you need to make a TidbCluster exit from the TiDB cluster deployed across Kubernetes clusters and reclaim its resources, you can perform the operation by scaling in that cluster. In this scenario, the following scale-in requirement must be met.

  • After scaling in, the number of TiKV replicas remaining in the whole TiDB cluster must be greater than the max-replicas value set in PD. By default, max-replicas is 3, so more than three TiKV replicas must remain. You can check the current setting as sketched below.
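
    A minimal sketch of checking max-replicas through pd-ctl in a PD Pod of a remaining cluster, assuming pd-ctl is available at /pd-ctl in the pingcap/pd image (add TLS options if the cluster enables TLS):

      # Shows the replication config, including max-replicas.
      kubectl exec -n ${namespace_1} ${tc_name_1}-pd-0 -- \
          /pd-ctl -u http://127.0.0.1:2379 config show replication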

Take the second TidbCluster created in the previous section as an example. First, set the number of replicas of PD, TiKV, and TiDB to 0. If you have enabled other components such as TiFlash, TiCDC, and Pump, set their replicas to 0 as well:

  kubectl patch tc ${tc_name_2} -n ${namespace_2} --type merge -p '{"spec":{"pd":{"replicas":0},"tikv":{"replicas":0},"tidb":{"replicas":0}}}'

Wait for the status of the second TidbCluster to become Ready; at that point, the related components have been scaled in to 0 replicas. Check the Pods of the second TidbCluster:

  kubectl get pods -l app.kubernetes.io/instance=${tc_name_2} -n ${namespace_2}

The Pod list should show No resources found. At this time, all Pods have been scaled in, and the second TidbCluster has exited the TiDB cluster. Check the cluster status of the second TidbCluster:

  kubectl get tc ${tc_name_2} -n ${namespace_2}

The result shows that the second TidbCluster is in the Ready status. At this time, you can delete the object and reclaim related resources.

  kubectl delete tc ${tc_name_2} -n ${namespace_2}

Through the above steps, the joined TidbCluster exits the TiDB cluster and its resources are reclaimed.

Enable the feature for a cluster with existing data and make it the initial TiDB cluster

Warning:

Currently, this is an experimental feature and might cause data loss. Please use it carefully.

A cluster with existing data refers to a deployed TiDB cluster with the configuration spec.acrossK8s: false.

Depending on whether the Kubernetes clusters use the same Cluster Domain, the steps differ.

If all the Kubernetes clusters have the same Cluster Domain, you only need to update the spec.acrossK8s field of the TidbCluster. Run the following command:

  kubectl patch tidbcluster cluster1 --type merge -p '{"spec":{"acrossK8s": true}}'

After the modification, wait for the TiDB cluster to complete the rolling update.

If the Kubernetes clusters have different Cluster Domains, you need to update the spec.clusterDomain and spec.acrossK8s fields. Take the following steps:

  1. Update the spec.clusterDomain and spec.acrossK8s fields:

    Configure spec.clusterDomain according to the Cluster Domain of your Kubernetes cluster:

    Warning:

    Currently, you must set clusterDomain to the correct value. After you modify the configuration, you cannot modify it again.

    kubectl patch tidbcluster cluster1 --type merge -p '{"spec":{"clusterDomain":"cluster1.com", "acrossK8s": true}}'

    After completing the modification, the TiDB cluster performs the rolling update.

  2. Update the PeerURL information of PD:

    After completing the rolling update, you need to use port-forward to expose the PD API, and then use the API to update the PeerURL of each PD member.

    1. Use port-forward to expose the PD API:

      kubectl port-forward pods/cluster1-pd-0 2380:2380 2379:2379 -n pingcap
    2. Access the PD API to obtain the member information. Note that after you run port-forward, the terminal session is occupied. You need to perform the following operations in another terminal session:

      curl http://127.0.0.1:2379/v2/members

      Note:

      If the cluster enables TLS, you need to configure the certificate when using the curl command. For example:

      curl --cacert /var/lib/pd-tls/ca.crt --cert /var/lib/pd-tls/tls.crt --key /var/lib/pd-tls/tls.key https://127.0.0.1:2379/v2/members

      After running the command, the output is as follows:

      {"members":[{"id":"6ed0312dc663b885","name":"cluster1-pd-0.cluster1-pd-peer.pingcap.svc.cluster1.com","peerURLs":["http://cluster1-pd-0.cluster1-pd-peer.pingcap.svc:2380"],"clientURLs":["http://cluster1-pd-0.cluster1-pd-peer.pingcap.svc.cluster1.com:2379"]},{"id":"bd9acd3d57e24a32","name":"cluster1-pd-1.cluster1-pd-peer.pingcap.svc.cluster1.com","peerURLs":["http://cluster1-pd-1.cluster1-pd-peer.pingcap.svc:2380"],"clientURLs":["http://cluster1-pd-1.cluster1-pd-peer.pingcap.svc.cluster1.com:2379"]},{"id":"e04e42cccef60246","name":"cluster1-pd-2.cluster1-pd-peer.pingcap.svc.cluster1.com","peerURLs":["http://cluster1-pd-2.cluster1-pd-peer.pingcap.svc:2380"],"clientURLs":["http://cluster1-pd-2.cluster1-pd-peer.pingcap.svc.cluster1.com:2379"]}]}
    3. Record the id of each PD instance, and use the id to update the peerURL of each member in turn:

      member_ID="6ed0312dc663b885"
      member_peer_url="http://cluster1-pd-0.cluster1-pd-peer.pingcap.svc.cluster1.com:2380"
      curl http://127.0.0.1:2379/v2/members/${member_ID} -XPUT \
          -H "Content-Type: application/json" -d "{\"peerURLs\":[\"${member_peer_url}\"]}"
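
      Repeat this for every member ID returned in the previous substep. To confirm the change, query the member list again (this assumes the port-forward session is still running):

      # Every peerURL should now contain the cluster domain.
      curl http://127.0.0.1:2379/v2/members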

After completing the above steps, this TidbCluster can be used as the initial TidbCluster for deploying a TiDB cluster across Kubernetes clusters. You can refer to the section Deploy a new TidbCluster to join the TiDB cluster to deploy the other TidbCluster clusters.

For more examples and development information, refer to multi-cluster.