Creating a Cluster
Once you’ve installed the M3DB operator and read over the requirements, you can start creating some M3DB clusters!
Basic Cluster
The following creates an M3DB cluster spread across 3 zones, with each M3DB instance being able to store up to 350gb of data using your Kubernetes cluster’s default storage class. For examples of different cluster topologies, such as zonal clusters, see the docs on node affinity.
Etcd
Create an etcd cluster with persistent volumes:
kubectl apply -f https://raw.githubusercontent.com/m3db/m3db-operator/v0.10.0/example/etcd/etcd-pd.yaml
We recommend modifying the storageClassName
in the manifest to one that matches your cloud provider’s fastest remote storage option, such as pd-ssd
on GCP.
M3DB
apiVersion: operator.m3db.io/v1alpha1
kind: M3DBCluster
metadata:
name: persistent-cluster
spec:
image: quay.io/m3db/m3dbnode:latest
replicationFactor: 3
numberOfShards: 256
isolationGroups:
- name: group1
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-a>
- name: group2
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-b>
- name: group3
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-c>
etcdEndpoints:
- http://etcd-0.etcd:2379
- http://etcd-1.etcd:2379
- http://etcd-2.etcd:2379
podIdentityConfig:
sources: []
namespaces:
- name: metrics-10s:2d
preset: 10s:2d
dataDirVolumeClaimTemplate:
metadata:
name: m3db-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 350Gi
limits:
storage: 350Gi
Ephemeral Cluster
WARNING: This setup is not intended for production-grade clusters, but rather for “kicking the tires” with the operator and M3DB. It is intended to work across almost any Kubernetes environment, and as such has as few dependencies as possible (namely persistent storage). See below for instructions on creating a more durable cluster.
Etcd
Create an etcd cluster in the same namespace your M3DB cluster will be created in. If you don’t have persistent storage available, this will create a cluster that will not use persistent storage and will likely become unavailable if any of the pods die:
kubectl apply -f https://raw.githubusercontent.com/m3db/m3db-operator/v0.10.0/example/etcd/etcd-basic.yaml
# Verify etcd health once pods available
kubectl exec etcd-0 -- env ETCDCTL_API=3 etcdctl endpoint health
# 127.0.0.1:2379 is healthy: successfully committed proposal: took = 2.94668ms
If you have remote storage available and would like to jump straight to using it, apply the following manifest for etcd instead:
kubectl apply -f https://raw.githubusercontent.com/m3db/m3db-operator/v0.10.0/example/etcd/etcd-pd.yaml
M3DB
Once etcd is available, you can create an M3DB cluster. An example of a very basic M3DB cluster definition is as follows:
apiVersion: operator.m3db.io/v1alpha1
kind: M3DBCluster
metadata:
name: simple-cluster
spec:
image: quay.io/m3db/m3dbnode:latest
replicationFactor: 3
numberOfShards: 256
etcdEndpoints:
- http://etcd-0.etcd:2379
- http://etcd-1.etcd:2379
- http://etcd-2.etcd:2379
isolationGroups:
- name: group1
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-a>
- name: group2
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-b>
- name: group3
numInstances: 1
nodeAffinityTerms:
- key: failure-domain.beta.kubernetes.io/zone
values:
- <zone-c>
podIdentityConfig:
sources:
- PodUID
namespaces:
- name: metrics-10s:2d
preset: 10s:2d
This will create a highly available cluster with RF=3 spread evenly across the three given zones within a region. A pod’s UID will be used for its identity. The cluster will have 1 namespace that stores metrics for 2 days at 10s resolution.
Next, apply your manifest:
$ kubectl apply -f example/simple-cluster.yaml
m3dbcluster.operator.m3db.io/simple-cluster created
Shortly after all pods are created you should see the cluster ready!
$ kubectl get po -l operator.m3db.io/app=m3db
NAME READY STATUS RESTARTS AGE
simple-cluster-rep0-0 1/1 Running 0 1m
simple-cluster-rep1-0 1/1 Running 0 56s
simple-cluster-rep2-0 1/1 Running 0 37s
We can verify that the cluster has finished streaming data by peers by checking that an instance has bootstrapped:
$ kubectl exec simple-cluster-rep2-0 -- curl -sSf localhost:9002/health
{"ok":true,"status":"up","bootstrapped":true}