Monitor bootstrap
Terminology:
cluster
: a set of monitors
quorum
: an active set of monitors consisting of a majority of the cluster
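As a small illustration of the "majority" in the quorum definition, the sketch below computes the smallest number of monitors that can form a quorum for a given cluster size. The function name quorum_size is hypothetical and is not part of any Ceph tool.

```shell
# Smallest number of monitors that is a strict majority of a cluster
# of size $1 (hypothetical helper, for illustration only).
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

quorum_size 3   # prints 2
quorum_size 5   # prints 3
```

Note that a cluster of four monitors needs three for a quorum, the same as a cluster of five.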
In order to initialize a new monitor, it must always be fed:
a logical name
secret keys
a cluster fsid (uuid)
In addition, a monitor needs to know two things:
what address to bind to
who its peers are (if any)
There is a range of ways to do both.
Logical id
The logical id should be unique across the cluster. It will be appended to mon. to logically describe the monitor in the Ceph cluster. For example, if the logical id is foo, the monitor's name will be mon.foo.
For most users, there is no more than one monitor per host, which makes the short hostname the logical choice.
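The naming rule can be sketched as follows. The helper mon_name is hypothetical, and deriving the short hostname with uname -n and cut is an assumption about the host; on many systems hostname -s gives the same result.

```shell
# Build the monitor name by appending a logical id to "mon."
# (hypothetical helper, for illustration only).
mon_name() {
    printf 'mon.%s\n' "$1"
}

mon_name foo    # prints mon.foo

# Use the short hostname (everything before the first dot) as the id.
short_host=$(uname -n | cut -d. -f1)
mon_name "$short_host"
```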
Secret keys
The mon. secret key is stored in a keyring file in the mon data directory. It can be generated with a command like:
- ceph-authtool --create-keyring /path/to/keyring --gen-key -n mon.
When creating a new monitor cluster, the keyring should also contain a client.admin key that can be used to administer the system:
- ceph-authtool /path/to/keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow'
The resulting keyring is fed to ceph-mon --mkfs with the --keyring <keyring> command-line argument.
Cluster fsid
The cluster fsid is a normal uuid, like that generated by the uuidgen command. It can be provided to the monitor in two ways:
via the --fsid <uuid> command-line argument (or config file option)
via a monmap provided to the new monitor via the --monmap <path> command-line argument.
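Because the fsid is an ordinary uuid, it can be sanity-checked before it is handed to the monitor. The sketch below is only an illustration: is_uuid is a hypothetical helper, the fsid value is a placeholder, and the final line prints a command rather than running it.

```shell
# Succeed only if $1 has the canonical 8-4-4-4-12 hexadecimal uuid layout
# (hypothetical helper, for illustration only).
is_uuid() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

fsid="01234567-89ab-cdef-0123-456789abcdef"   # placeholder; normally from uuidgen
if is_uuid "$fsid"; then
    # Print the command instead of running it.
    echo "ceph-mon --mkfs -i <name> --fsid $fsid --keyring <initial_keyring>"
fi
```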
Monitor address
The monitor address can be provided in several ways.
via the --public-addr <ip[:port]> command-line option (or config file option)
via the --public-network <cidr> command-line option (or config file option)
via the monmap provided via --monmap <path>, if it includes a monitor with our name
via the bootstrap monmap (provided via --inject-monmap <path> or generated from --mon-host <list>) if it includes a monitor with no name (noname-<something>) and an address configured on the local host.
Peers
The monitor peers are provided in several ways:
via the initial monmap, provided via --monmap <filename>
via the bootstrap monmap generated from --mon-host <list>
via the bootstrap monmap generated from [mon.*] sections with the deprecated mon addr options in the config file (note that this method is not recommended and does not support binding to both v1 and v2 protocol addresses)
dynamically via the admin socket
However, these methods are not completely interchangeable because of the complexity of creating a new monitor cluster without danger of races.
Cluster creation
There are three basic approaches to creating a cluster:
Create a new cluster by specifying the monitor names and addresses ahead of time.
Create a new cluster by specifying the monitor names ahead of time, and dynamically setting the addresses as ceph-mon daemons configure themselves.
Create a new cluster by specifying the monitor addresses ahead of time.
Names and addresses
Generate a monmap using monmaptool with the names and addresses of the initial monitors. The generated monmap will also include a cluster fsid. Feed that monmap to each monitor daemon:
- ceph-mon --mkfs -i <name> --monmap <initial_monmap> --keyring <initial_keyring>
When the daemons start, they will know exactly who they and their peers are.
Addresses only
The initial monitor addresses can be specified with the mon host configuration value, either via a config file or the command-line argument. This method has the advantage that a single global config file for the cluster can have a line like:
- mon host = a.foo.com, b.foo.com, c.foo.com
and will also serve to inform any ceph clients or daemons who the monitors are.
The ceph-mon daemons will need to be fed the initial keyring and cluster fsid to initialize themselves:
- ceph-mon --mkfs -i <name> --fsid <uuid> --keyring <initial_keyring>
When the daemons first start up, they will share their names with each other and form a new cluster.
Names only
In dynamic “cloud” environments, the cluster creator may not (yet) know what the addresses of the monitors are going to be. Instead, they may want machines to configure and start themselves in parallel and, as they come up, form a new cluster on their own. The problem is that the monitor cluster relies on strict majorities to keep itself consistent, and in order to “create” a new cluster, it needs to know what the initial set of monitors will be.
This can be done with the mon initial members config option, which should list the ids of the initial monitors that are allowed to create the cluster:
- mon initial members = foo, bar, baz
The monitors can then be initialized by providing the other pieces of information (the keyring, cluster fsid, and a way of determining their own address). For example:
- ceph-mon --mkfs -i <name> --mon-initial-members 'foo,bar,baz' --keyring <initial_keyring> --public-addr <ip>
When these daemons are started, they will know their own address, but not their peers. They can learn those addresses via the admin socket:
- ceph daemon mon.<id> add_bootstrap_peer_hint <peer ip>
Once they learn enough of their peers from the initial member set, they will be able to create the cluster.
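When several peer addresses become known at once, the hint command above can be repeated per peer. The sketch below only prints the commands it would run, so it is safe to try anywhere; peer_hint_commands is a hypothetical helper, and the monitor id and IP addresses are placeholders.

```shell
# Print one add_bootstrap_peer_hint command per peer address.
# $1 is the local monitor id; the remaining arguments are peer IPs.
# (Hypothetical helper; it prints commands rather than running them.)
peer_hint_commands() {
    mon_id=$1
    shift
    for peer_ip in "$@"; do
        printf 'ceph daemon mon.%s add_bootstrap_peer_hint %s\n' "$mon_id" "$peer_ip"
    done
}

peer_hint_commands foo 10.0.0.2 10.0.0.3
```

To actually issue the hints, the printed lines could be reviewed and then piped to sh on the host where the monitor's admin socket lives.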
Cluster expansion
Cluster expansion is slightly less demanding than creation, because the creation of the initial quorum is not an issue and there is no worry about creating separate, independent clusters.
New nodes can be forced to join an existing cluster in two ways:
by providing no initial monitor peer addresses, and feeding them dynamically.
by specifying the mon initial members config option to prevent the new nodes from forming a new, independent cluster, and feeding some existing monitors via any available method.
Initially peerless expansion
Create a new monitor and give it no peer addresses other than its own. For example:
- ceph-mon --mkfs -i <myid> --fsid <fsid> --keyring <mon secret key> --public-addr <ip>
Once the daemon starts, you can give it one or more peer addresses (preferably a bare IP address with no port; the mon will set the addr types and ports for you) to join with:
- ceph daemon mon.<id> add_bootstrap_peer_hint <peer ip>
Alternatively, you can explicitly specify the addrvec_t with:
- ceph daemon mon.<id> add_bootstrap_peer_hintv <peer addrvec>
For example:
- ceph daemon mon.new add_bootstrap_peer_hintv v2:1.2.3.4:3300,v1:1.2.3.4:6789
This monitor will never participate in cluster creation; it can only join an existing cluster.
Note that the address(es) specified should match exactly the addresses the new monitor is binding to. If, for example, the new mon binds to only a v2 address but a v2 and v1 address are provided, there is some possibility of confusion in the mons.
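For a peer that binds both protocol versions, the address vector can be assembled from a bare IP. This sketch assumes the ports shown in the example above (3300 for v2, 6789 for v1) and should not be used for a monitor that binds only one of them; addrvec_for is a hypothetical helper.

```shell
# Build a v2+v1 address vector for a bare IPv4 address, using the ports
# from the add_bootstrap_peer_hintv example (hypothetical helper).
addrvec_for() {
    printf 'v2:%s:3300,v1:%s:6789\n' "$1" "$1"
}

addrvec_for 1.2.3.4   # prints v2:1.2.3.4:3300,v1:1.2.3.4:6789
```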
Expanding with initial members
You can feed the new monitor some peer addresses initially and still prevent it from forming a new, independent cluster by also setting mon initial members. For example:
- ceph-mon --mkfs -i <myid> --fsid <fsid> --keyring <mon secret key> --public-addr <ip> --mon-host foo,bar,baz
When the daemon is started, mon initial members must be set via the command line or config file:
- ceph-mon -i <myid> --mon-initial-members foo,bar,baz
to prevent any risk of split-brain.