Bringing up a Kafka-based Ordering Service
Caveat emptor
This document assumes that the reader knows how to set up a Kafka cluster and a ZooKeeper ensemble, and keep them secure for general usage by preventing unauthorized access. The sole purpose of this guide is to identify the steps you need to take so as to have a set of Hyperledger Fabric ordering service nodes (OSNs) use your Kafka cluster and provide an ordering service to your blockchain network.
For information about the role orderers play in a network and in a transaction flow, checkout our The Ordering Service documentation.
For information on how to set up an ordering node, check out our Planning for an ordering service documentation.
For information about configuring Raft ordering services, check out Configuring and operating a Raft ordering service.
Big picture
Each channel maps to a separate single-partition topic in Kafka. When an OSN receives transactions via the Broadcast
RPC, it checks to make sure that the broadcasting client has permissions to write on the channel, then relays (i.e. produces) those transactions to the appropriate partition in Kafka. This partition is also consumed by the OSN which groups the received transactions into blocks locally, persists them in its local ledger, and serves them to receiving clients via the Deliver
RPC. For low-level details, refer to the document that describes how we came to this design. Figure 8 is a schematic representation of the process described above.
Steps
Let K
and Z
be the number of nodes in the Kafka cluster and the ZooKeeper ensemble respectively:
At a minimum,
K
should be set to 4. (As we will explain in Step 4 below, this is the minimum number of nodes necessary in order to exhibit crash fault tolerance, i.e. with 4 brokers, you can have 1 broker go down, all channels will continue to be writeable and readable, and new channels can be created.)Z
will either be 3, 5, or 7. It has to be an odd number to avoid split-brain scenarios, and larger than 1 in order to avoid single point of failures. Anything beyond 7 ZooKeeper servers is considered overkill.
Then proceed as follows:
Orderers: Encode the Kafka-related information in the network’s genesis block. If you are using
configtxgen
, editconfigtx.yaml
. Alternatively, pick a preset profile for the system channel’s genesis block— so that:Orderer.OrdererType
is set tokafka
.Orderer.Kafka.Brokers
contains the address of at least two of the Kafka brokers in your cluster inIP:port
notation. The list does not need to be exhaustive. (These are your bootstrap brokers.)
Orderers: Set the maximum block size. Each block will have at most
Orderer.AbsoluteMaxBytes
bytes (not including headers), a value that you can set inconfigtx.yaml
. Let the value you pick here beA
and make note of it —– it will affect how you configure your Kafka brokers in Step 6.Orderers: Create the genesis block. Use
configtxgen
. The settings you picked in Steps 3 and 4 above are system-wide settings, i.e. they apply across the network for all the OSNs. Make note of the genesis block’s location.Kafka cluster: Configure your Kafka brokers appropriately. Ensure that every Kafka broker has these keys configured:
unclean.leader.election.enable = false
— Data consistency is key in a blockchain environment. We cannot have a channel leader chosen outside of the in-sync replica set, or we run the risk of overwriting the offsets that the previous leader produced, and —as a result— rewrite the blockchain that the orderers produce.min.insync.replicas = M
— Where you pick a valueM
such that1 < M < N
(seedefault.replication.factor
below). Data is considered committed when it is written to at leastM
replicas (which are then considered in-sync and belong to the in-sync replica set, or ISR). In any other case, the write operation returns an error. Then:If up to
N-M
replicas —out of theN
that the channel data is written to become unavailable, operations proceed normally.If more replicas become unavailable, Kafka cannot maintain an ISR set of
M,
so it stops accepting writes. Reads work without issues. The channel becomes writeable again whenM
replicas get in-sync.
default.replication.factor = N
— Where you pick a valueN
such thatN < K
. A replication factor ofN
means that each channel will have its data replicated toN
brokers. These are the candidates for the ISR set of a channel. As we noted in themin.insync.replicas section
above, not all of these brokers have to be available all the time.N
should be set strictly smaller toK
because channel creations cannot go forward if less thanN
brokers are up. So if you setN = K
, a single broker going down means that no new channels can be created on the blockchain network — the crash fault tolerance of the ordering service is non-existent.Based on what we’ve described above, the minimum allowed values for
M
andN
are 2 and 3 respectively. This configuration allows for the creation of new channels to go forward, and for all channels to continue to be writeable.message.max.bytes
andreplica.fetch.max.bytes
should be set to a value larger thanA
, the value you picked inOrderer.AbsoluteMaxBytes
in Step 4 above. Add some buffer to account for headers —– 1 MiB is more than enough. The following condition applies:Orderer.AbsoluteMaxBytes < replica.fetch.max.bytes <= message.max.bytes
(For completeness, we note that
message.max.bytes
should be strictly smaller tosocket.request.max.bytes
which is set by default to 100 MiB. If you wish to have blocks larger than 100 MiB you will need to edit the hard-coded value inbrokerConfig.Producer.MaxMessageBytes
infabric/orderer/kafka/config.go
and rebuild the binary from source. This is not advisable.)log.retention.ms = -1
. Until the ordering service adds support for pruning of the Kafka logs, you should disable time-based retention and prevent segments from expiring. (Size-based retention — seelog.retention.bytes
— is disabled by default in Kafka at the time of this writing, so there’s no need to set it explicitly.)
Orderers: Point each OSN to the genesis block. Edit
General.BootstrapFile
inorderer.yaml
so that it points to the genesis block created in Step 5 above. While at it, ensure all other keys in that YAML file are set appropriately.Orderers: Adjust polling intervals and timeouts. (Optional step.)
The
Kafka.Retry
section in theorderer.yaml
file allows you to adjust the frequency of the metadata/producer/consumer requests, as well as the socket timeouts. (These are all settings you would expect to see in a Kafka producer or consumer.)Additionally, when a new channel is created, or when an existing channel is reloaded (in case of a just-restarted orderer), the orderer interacts with the Kafka cluster in the following ways:
It creates a Kafka producer (writer) for the Kafka partition that corresponds to the channel. . It uses that producer to post a no-op
CONNECT
message to that partition. . It creates a Kafka consumer (reader) for that partition.If any of these steps fail, you can adjust the frequency with which they are repeated. Specifically they will be re-attempted every
Kafka.Retry.ShortInterval
for a total ofKafka.Retry.ShortTotal
, and then everyKafka.Retry.LongInterval
for a total ofKafka.Retry.LongTotal
until they succeed. Note that the orderer will be unable to write to or read from a channel until all of the steps above have been completed successfully.
Set up the OSNs and Kafka cluster so that they communicate over SSL. (Optional step, but highly recommended.) Refer to the Confluent guide for the Kafka cluster side of the equation, and set the keys under
Kafka.TLS
inorderer.yaml
on every OSN accordingly.Bring up the nodes in the following order: ZooKeeper ensemble, Kafka cluster, ordering service nodes.
Additional considerations
Preferred message size. In Step 4 above (see Steps section) you can also set the preferred size of blocks by setting the
Orderer.Batchsize.PreferredMaxBytes
key. Kafka offers higher throughput when dealing with relatively small messages; aim for a value no bigger than 1 MiB.Using environment variables to override settings. When using the sample Kafka and Zookeeper Docker images provided with Fabric (see
images/kafka
andimages/zookeeper
respectively), you can override a Kafka broker or a ZooKeeper server’s settings by using environment variables. Replace the dots of the configuration key with underscores. For example,KAFKA_UNCLEAN_LEADER_ELECTION_ENABLE=false
will allow you to override the default value ofunclean.leader.election.enable
. The same applies to the OSNs for their local configuration, i.e. what can be set inorderer.yaml
. For exampleORDERER_KAFKA_RETRY_SHORTINTERVAL=1s
allows you to override the default value forOrderer.Kafka.Retry.ShortInterval
.
Kafka Protocol Version Compatibility
Fabric uses the sarama client library and vendors a version of it that supports Kafka 0.10 to 1.0, yet is still known to work with older versions.
Using the Kafka.Version
key in orderer.yaml
, you can configure which version of the Kafka protocol is used to communicate with the Kafka cluster’s brokers. Kafka brokers are backward compatible with older protocol versions. Because of a Kafka broker’s backward compatibility with older protocol versions, upgrading your Kafka brokers to a new version does not require an update of the Kafka.Version
key value, but the Kafka cluster might suffer a performance penalty while using an older protocol version.
Debugging
Set environment variable FABRIC_LOGGING_SPEC
to DEBUG
and set Kafka.Verbose
to true
in orderer.yaml
.