- Operational Guides
- Replication and Deployment in Zones
- Metrics
- Upgrading M3
- Resource Limits and Preventing Abusive Reads/Writes
- Tuning Availability, Consistency, and Durability
- Placement
- Placement Configuration
- Namespace Configuration
- Bootstrapping & Crash Recovery
- Docker & Kernel Configuration
- etcd
- Mapping Rules
- Background Repairs
- Replication between clusters
- Fileset Migrations
Operational Guides
Replication and Deployment in Zones
M3DB supports deploying across multiple zones in a region as well as deploying to a single zone with rack-level isolation. It can also be deployed across multiple regions for a global view of data, though both latency and bandwidth costs may increase as a result. In addition, M3DB has support for automatically replicating data between isolated M3DB clusters (potentially running in different zones / regions). More details can be found in the Replication between clusters operational guide.
Metrics
It is best to use Prometheus to monitor M3DB, M3 Coordinator and M3 Query using the Grafana dashboards.
Logs
Logs are printed to process output in JSON by default for semi-structured log processing.
Tracing
M3DB is integrated with opentracing to provide insight into query performance and errors.
Jaeger
To enable Jaeger as the tracing backend, set tracing.backend to "jaeger" (see also our sample local config):
tracing:
  backend: jaeger # enables jaeger with default configs
  jaeger:
    # optional configuration for jaeger; see
    # https://github.
Upgrading M3
Overview
This guide explains how to upgrade M3 from one version to another (e.g. from 0.14.0 to 0.15.0). This includes upgrading:
- m3dbnode
- m3coordinator
- m3query
- m3aggregator
m3dbnode
Graphs to monitor
While upgrading M3DB nodes, it's important to monitor the status of bootstrapping the individual nodes. This can be monitored using the M3DB Node Details dashboard. Typically, the Bootstrapped graph under Background Tasks and the graphs within the CPU and Memory Utilization sections give a good understanding of how well bootstrapping is going.
Resource Limits and Preventing Abusive Reads/Writes
This operational guide provides an overview of how to set resource limits on M3 components to prevent abusive reads/writes from impacting the availability or performance of M3 in a production environment.
M3DB
Configuring limits
The best way to get started protecting M3DB nodes is to set a few resource limits on the top-level limits config stanza for M3DB. The primary limit is on total bytes recently read from disk across all queries, since this most directly causes memory pressure.
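As a rough sketch of what this looks like (the stanza location, field name maxRecentlyQueriedSeriesDiskBytesRead, and the values shown are taken from the M3 docs but may differ between versions, so treat them as illustrative):

```yaml
db:
  limits:
    # Caps bytes recently read from disk across all queries over the
    # lookback window, protecting nodes from memory pressure caused by
    # abusive reads. Field name and values are illustrative.
    maxRecentlyQueriedSeriesDiskBytesRead:
      value: 715827882   # ~683 MiB per lookback period
      lookback: 15s
```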
Tuning Availability, Consistency, and Durability
M3DB is designed as a High Availability (HA) system because it doesn't use a consensus protocol like Raft or Paxos to enforce strong consistency guarantees. However, even within the category of HA systems, there is a broad spectrum of consistency and durability guarantees that a database can provide. To address as many use cases as possible, M3DB can be tuned to achieve the desired balance between performance, availability, durability, and consistency.
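One of the main tuning knobs is the client read/write consistency level. A minimal sketch of the relevant client config, assuming the level names documented for M3 (verify against your version):

```yaml
client:
  # Writes must be acknowledged by a majority of replicas before success.
  writeConsistencyLevel: majority
  # Reads return as soon as a majority of replicas respond, without
  # requiring their results to agree.
  readConsistencyLevel: unstrict_majority
```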
Placement
Note: The words placement and topology are used interchangeably throughout the M3DB documentation and codebase. An M3DB cluster has exactly one placement. That placement maps the cluster's shard replicas to nodes. A cluster also has zero or more namespaces (analogous to tables in other databases), and each node serves every namespace for the shards it owns. In other words, if the cluster topology states that node A owns shards 1, 2, and 3, then node A will own shards 1, 2, and 3 for all configured namespaces in the cluster.
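As a hypothetical, abbreviated illustration of that mapping, the placement can be fetched from the coordinator; the endpoint and the response shape shown in the comments are a sketch, not the exact output:

```bash
# Fetch the cluster's placement (abbreviated, illustrative response).
curl http://localhost:7201/api/v1/services/m3db/placement
# {
#   "placement": {
#     "numShards": 64,
#     "replicaFactor": 3,
#     "instances": {
#       "node_a": {
#         "isolationGroup": "rack-a",
#         "endpoint": "node_a:9000",
#         "shards": [ {"id": 1, "state": "AVAILABLE"}, ... ]
#       }
#     }
#   }
# }
```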
Placement Configuration
M3DB was designed from the ground up to be a distributed (clustered) database that is availability-zone or rack aware (by using isolation groups). Clusters will seamlessly scale with your data: you can start with a small number of nodes and grow to several hundred nodes with no downtime or expensive migrations. Before reading the rest of this document, we recommend familiarizing yourself with the M3DB placement documentation.
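For orientation, initializing a placement goes through the coordinator API. The endpoint path and instance fields below follow the public M3 docs but should be treated as a sketch; hostnames, ports and the isolation group are placeholders:

```bash
# Initialize a single-node placement (a production cluster would list one
# instance per isolation group and use a higher replication factor).
curl -X POST http://localhost:7201/api/v1/services/m3db/placement/init -d '{
  "num_shards": 64,
  "replication_factor": 1,
  "instances": [
    {
      "id": "m3db001",
      "isolation_group": "us-east1-a",
      "zone": "embedded",
      "weight": 100,
      "endpoint": "m3db001:9000",
      "hostname": "m3db001",
      "port": 9000
    }
  ]
}'
```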
Namespace Configuration
Namespaces in M3DB are analogous to tables in other databases. Each namespace has a unique name as well as distinct configuration with regard to data retention and block size. For more information about namespaces and the technical details of their implementation, read our storage engine documentation.
Namespace Operations
The operations below include sample cURLs, but you can always review the API documentation by navigating to http://<M3_COORDINATOR_HOST_NAME>:<CONFIGURED_PORT (default 7201)>/api/v1/openapi or our online API documentation.
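As one such sample cURL, adding a namespace looks roughly like the following; the request shape mirrors the documented namespace API, but option names may differ across versions, so verify against the OpenAPI docs above:

```bash
# Create an unaggregated namespace with 48h retention and 2h blocks.
curl -X POST http://localhost:7201/api/v1/services/m3db/namespace -d '{
  "name": "default_unaggregated",
  "options": {
    "bootstrapEnabled": true,
    "flushEnabled": true,
    "writesToCommitLog": true,
    "cleanupEnabled": true,
    "snapshotEnabled": true,
    "repairEnabled": false,
    "retentionOptions": {
      "retentionPeriodDuration": "48h",
      "blockSizeDuration": "2h",
      "bufferFutureDuration": "10m",
      "bufferPastDuration": "10m"
    },
    "indexOptions": {
      "enabled": true,
      "blockSizeDuration": "2h"
    }
  }
}'
```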
Bootstrapping & Crash Recovery
Introduction
We recommend reading the placement operational guide before reading the rest of this document. When an M3DB node is turned on (or goes through a placement change) it needs to go through a bootstrapping process to determine the integrity of the data that it has, replay writes from the commit log, and/or stream missing data from its peers. In most cases, as long as you're running with the default and recommended bootstrapper configuration of filesystem,commitlog,peers,uninitialized_topology, you should not need to worry about the bootstrapping process at all; M3DB will take care of doing the right thing such that you don't lose data and consistency guarantees are met.
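That bootstrapper ordering is set in the M3DB configuration. A minimal sketch, assuming the stanza layout used in the docs (the exact location may vary between versions):

```yaml
db:
  bootstrap:
    # Order matters: try local filesets first, then the commit log,
    # then stream from peers, and finally handle a brand-new topology.
    bootstrappers:
      - filesystem
      - commitlog
      - peers
      - uninitialized_topology
```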
Docker & Kernel Configuration
This document lists the kernel tweaks M3DB needs to run well. If you are running on Kubernetes, you may use our sysctl-setter DaemonSet that will set these values for you. Please read the comment in that manifest to understand the implications of applying it.
Running with Docker
When running M3DB inside Docker, it is recommended to add the SYS_RESOURCE capability to the container (using the --cap-add argument to docker run) so that it can raise its file limits:
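The command that followed is not reproduced here; a hedged sketch is shown below. The image name and sysctl values are illustrative, and the authoritative values live in the linked DaemonSet manifest and the M3 docs:

```bash
# Give the container permission to raise its own resource limits.
docker run --cap-add SYS_RESOURCE quay.io/m3db/m3dbnode:latest

# Typical kernel tweaks applied by the sysctl-setter DaemonSet
# (values shown are examples, not prescriptions):
sysctl -w vm.max_map_count=3000000
sysctl -w vm.swappiness=1
sysctl -w fs.file-max=3000000
sysctl -w fs.nr_open=3000000
```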
etcd
The M3 stack leverages etcd as a distributed key-value store to:
- Update cluster configuration in real time
- Manage placements for our distributed / sharded tiers like M3DB and M3Aggregator
- Perform leader-election in M3Aggregator
- and much more!
Overview
M3DB ships with support for running embedded etcd (called seed nodes), and while this is convenient for testing and development, we don't recommend running this setup in production. Both M3 and etcd are complex distributed systems, and trying to operate both within the same binary is challenging and dangerous for production workloads.
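When pointing M3DB at an external etcd cluster, the config references the etcd endpoints directly. A hedged sketch follows; the stanza path and field names vary between M3 versions, and the hostnames are placeholders:

```yaml
db:
  discovery:
    config:
      service:
        env: default_env
        zone: embedded
        service: m3db
        cacheDir: /var/lib/m3kv
        etcdClusters:
          - zone: embedded
            endpoints:
              # Placeholder endpoints for an external 3-node etcd cluster.
              - http://etcd01:2379
              - http://etcd02:2379
              - http://etcd03:2379
```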
Mapping Rules
Mapping rules are used to configure the storage policy for metrics. The storage policy determines how long to store metrics and at what resolution to keep them. For example, a storage policy of 1m:48h tells M3 to keep the metrics for 48 hours at a 1-minute resolution. Mapping rules can be configured in the m3coordinator configuration file under the downsample > rules > mappingRules stanza. We will use the following as an example.
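The original example is not reproduced above; a representative mappingRules stanza might look like the following, where the rule name, filter and aggregation are illustrative:

```yaml
downsample:
  rules:
    mappingRules:
      - name: "mysql metrics"     # illustrative rule name
        filter: "app:mysql*"      # matches metrics tagged app=mysql*
        aggregations: ["Last"]
        storagePolicies:
          - resolution: 1m
            retention: 48h        # keep 1m-resolution data for 48h
```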
Background Repairs
Note: This feature is in beta and only available for use with M3DB when run with the inverted index off. It can be run with the inverted index on; however, metrics will not be re-indexed if they are repaired, so they will be invisible to queries on that node.
Overview
Background repairs enable M3DB to eventually reach a consistent state in which all nodes have an identical view of the data. An M3DB cluster can be configured to repair itself in the background.
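Repairs are enabled in the M3DB config. A minimal sketch, assuming the repair stanza and field names from the docs (throttle and checkInterval values are illustrative and may differ by version):

```yaml
db:
  repair:
    enabled: true
    # How aggressively repairs run; values shown are illustrative.
    throttle: 2m
    checkInterval: 1m
```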
Replication between clusters
M3DB clusters can be configured to passively replicate data from other clusters. This feature is most commonly used when operators wish to run two (or more) regional clusters that function independently while passively replicating data from each other in an eventually consistent manner. The cross-cluster replication feature is built on top of the background repairs feature. As a result, it has all the same caveats and limitations. Specifically, it does not currently work with clusters that use M3DB's indexing feature, and the replication delay between two clusters will be at least (block size + bufferPast) for data written at the beginning of a block for a given namespace.
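Passive replication is configured per remote cluster. A hedged sketch, assuming the replication stanza shape described in the docs; the cluster name, environment and etcd endpoint are placeholders:

```yaml
db:
  replication:
    clusters:
      - name: "cluster-b"          # placeholder name for the remote cluster
        repairEnabled: true
        client:
          config:
            service:
              env: cluster-b-env
              zone: embedded
              service: m3db
              etcdClusters:
                - zone: embedded
                  endpoints:
                    - http://cluster-b-etcd01:2379
```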
Fileset Migrations
Occasionally, changes will be made to the format of fileset files on disk. When those changes need to be applied to already existing filesets, a fileset migration is required. Migrating existing filesets is beneficial so that improvements made in newer releases can be applied to all filesets, not just newly created ones.
Migration Process
Migrations are executed during the initial stages of the bootstrap. When enabled, the filesystem bootstrapper will scan for filesets that should be migrated and migrate any filesets found.
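Migrations are opt-in via the filesystem bootstrapper config. A sketch follows; the targetMigrationVersion key and its value are based on the 1.0-era docs and may differ in your version:

```yaml
db:
  bootstrap:
    filesystem:
      migration:
        # Migrate any filesets older than this version during bootstrap.
        targetMigrationVersion: "1.1"
        # Number of concurrent migration workers (illustrative).
        concurrency: 4
```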