Cluster Deployment

It’s highly recommended to deploy the GreptimeDB cluster in Kubernetes. The following prerequisites are required:

  • Kubernetes (>= 1.18)

    For testing purposes, you can use Kind or Minikube to create a local Kubernetes cluster (see the example after this list).

  • Helm v3

  • kubectl
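
If you don’t have a cluster at hand, one quick way to create a disposable one for testing (this assumes the kind CLI is installed locally) is:

  # Creates a single-node Kubernetes cluster named "kind" running in Docker
  kind create cluster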

Step 1: Deploy the GreptimeDB Operator

Add the chart repository with the following commands:

  helm repo add greptime https://greptimeteam.github.io/helm-charts/
  helm repo update
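
You can optionally confirm that the GreptimeDB charts are now visible from the repository:

  helm search repo greptime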

Create the greptimedb-admin namespace and deploy the GreptimeDB Operator into it:

  kubectl create ns greptimedb-admin
  helm upgrade --install greptimedb-operator greptime/greptimedb-operator -n greptimedb-admin
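
Before moving on, it’s worth checking that the operator pod is up and running:

  kubectl get pods -n greptimedb-admin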

Step 2: Deploy the etcd Cluster

The GreptimeDB cluster needs an etcd cluster as the backend storage for the metasrv. We recommend using the Bitnami etcd chart to deploy it:

  kubectl create ns metasrv-store
  helm upgrade --install etcd oci://registry-1.docker.io/bitnamicharts/etcd \
    --set replicaCount=3 \
    --set auth.rbac.create=false \
    --set auth.rbac.token.enabled=false \
    -n metasrv-store
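
The etcd pods can take a moment to become ready. Assuming the chart keeps its default naming (a StatefulSet named after the Helm release, etcd here), you can wait for the rollout with something like:

  # The StatefulSet name "etcd" follows from the Helm release name above
  kubectl -n metasrv-store rollout status statefulset/etcd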

When the etcd cluster is ready, you can use the following command to check the cluster health:

  kubectl -n metasrv-store \
    exec etcd-0 -- etcdctl \
    --endpoints etcd-0.etcd-headless.metasrv-store:2379,etcd-1.etcd-headless.metasrv-store:2379,etcd-2.etcd-headless.metasrv-store:2379 \
    endpoint status

Step 3: Deploy the Kafka Cluster

We recommend using the strimzi-kafka-operator to deploy a Kafka cluster running in KRaft mode.

Create the kafka namespace and install the strimzi-kafka-operator:

  kubectl create namespace kafka
  kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
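
The install manifest creates the operator Deployment (named strimzi-cluster-operator by default); you can wait for it to become available with something like:

  # Assumes the default Deployment name created by the Strimzi install manifest
  kubectl -n kafka wait deployment/strimzi-cluster-operator --for=condition=Available --timeout=300s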

When the operator is ready, use the following spec to create the Kafka cluster:

  apiVersion: kafka.strimzi.io/v1beta2
  kind: KafkaNodePool
  metadata:
    name: dual-role
    labels:
      strimzi.io/cluster: kafka-wal
  spec:
    replicas: 3
    roles:
      - controller
      - broker
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 20Gi
          deleteClaim: false
  ---
  apiVersion: kafka.strimzi.io/v1beta2
  kind: Kafka
  metadata:
    name: kafka-wal
    annotations:
      strimzi.io/node-pools: enabled
      strimzi.io/kraft: enabled
  spec:
    kafka:
      version: 3.7.0
      metadataVersion: 3.7-IV4
      listeners:
        - name: plain
          port: 9092
          type: internal
          tls: false
        - name: tls
          port: 9093
          type: internal
          tls: true
      config:
        offsets.topic.replication.factor: 3
        transaction.state.log.replication.factor: 3
        transaction.state.log.min.isr: 2
        default.replication.factor: 3
        min.insync.replicas: 2
    entityOperator:
      topicOperator: {}
      userOperator: {}

Save the spec as kafka-wal.yaml and apply it to the Kubernetes cluster:

  kubectl apply -f kafka-wal.yaml -n kafka
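
Provisioning the brokers can take a few minutes. One way to block until the Kafka resource reports Ready is:

  kubectl -n kafka wait kafka/kafka-wal --for=condition=Ready --timeout=300s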

After the Kafka cluster is ready, check the status:

  kubectl get kafka -n kafka

The expected output will be:

  NAME        DESIRED KAFKA REPLICAS   DESIRED ZK REPLICAS   READY   METADATA STATE   WARNINGS
  kafka-wal                                                  True    KRaft

Step 4: Deploy the GreptimeDB Cluster with Remote WAL Settings

Create a GreptimeDB cluster with remote WAL settings:

cat <<EOF | kubectl apply -f -
apiVersion: greptime.io/v1alpha1
kind: GreptimeDBCluster
metadata:
  name: my-cluster
  namespace: default
spec:
  base:
    main:
      image: greptime/greptimedb:latest
  frontend:
    replicas: 1
  meta:
    replicas: 1
    etcdEndpoints:
      - "etcd.metasrv-store:2379"
  datanode:
    replicas: 3
  remoteWal:
    kafka:
      brokerEndpoints:
        - "kafka-wal-kafka-bootstrap.kafka:9092"
EOF

When the GreptimeDB cluster is ready, you can check the cluster status:

  kubectl get gtc my-cluster -n default

The expected output will be:

  NAME         FRONTEND   DATANODE   META   PHASE     VERSION   AGE
  my-cluster   1          3          1      Running   latest    5m30s
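
You can also list the frontend, meta, and datanode pods to make sure they are all running:

  kubectl get pods -n default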

Step 5: Write and Query Data

Let’s connect to the cluster using the MySQL protocol. Use kubectl to port-forward traffic on port 4002:

  kubectl port-forward svc/my-cluster-frontend 4002:4002 -n default

Open another terminal and connect to the cluster with the mysql client:

  mysql -h 127.0.0.1 -P 4002

Create a distributed table:

  CREATE TABLE dist_table(
      ts TIMESTAMP DEFAULT current_timestamp(),
      n INT,
      row_id INT,
      PRIMARY KEY(n),
      TIME INDEX (ts)
  )
  PARTITION ON COLUMNS (n) (
      n < 5,
      n >= 5 AND n < 9,
      n >= 9
  )
  engine=mito;

Write some data:

  INSERT INTO dist_table(n, row_id) VALUES (1, 1);
  INSERT INTO dist_table(n, row_id) VALUES (2, 2);
  INSERT INTO dist_table(n, row_id) VALUES (3, 3);
  INSERT INTO dist_table(n, row_id) VALUES (4, 4);
  INSERT INTO dist_table(n, row_id) VALUES (5, 5);
  INSERT INTO dist_table(n, row_id) VALUES (6, 6);
  INSERT INTO dist_table(n, row_id) VALUES (7, 7);
  INSERT INTO dist_table(n, row_id) VALUES (8, 8);
  INSERT INTO dist_table(n, row_id) VALUES (9, 9);
  INSERT INTO dist_table(n, row_id) VALUES (10, 10);
  INSERT INTO dist_table(n, row_id) VALUES (11, 11);
  INSERT INTO dist_table(n, row_id) VALUES (12, 12);

And query the data:

  SELECT * FROM dist_table;