RocketMQ Message Queue Operations

In the previous two sections, we discussed RabbitMQ.

However, real-world production experience shows that once a system's instantaneous traffic reaches a certain scale, neither of the products discussed earlier remains a good first choice for a messaging system.

RabbitMQ works fine for enterprise applications, but its ability to absorb message backlogs is very poor: once a traffic spike causes messages to pile up, the entire RabbitMQ cluster can hang and refuse new messages, and in extreme cases the cluster may even crash and lose messages.

RocketMQ is a high-performance distributed message queue. It borrows Kafka's design ideas and inherits its high throughput, while focusing on improving latency, which makes it better suited to real-time message processing. According to the official benchmark, RocketMQ can process about 70,000 messages per second on mainstream servers, and this figure stays roughly stable as the number of Topics (i.e. queues) grows, making it more stable than Kafka in this respect[^1].

In this section we will discuss RocketMQ operations. There are generally two deployment options:

  • Deploy RocketMQ on multiple physical machines. The advantage is reliable performance; see the official cluster deployment documentation.
  • Deploy RocketMQ on a container cluster. The advantage is easier operations; this is the approach this section focuses on.

A RocketMQ server can take on one of two roles:

  • NameServer: manages metadata and acts as the connection entry point for Brokers and clients
  • Broker: the server that actually handles the message queues

In this section we will build a 4-node RocketMQ cluster: 2 NameServers and 2 Brokers.

First, create 4 PersistentVolumes for the 4 nodes above:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv011
    spec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      capacity:
        storage: 20Gi
      hostPath:
        path: /data/pv011/
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv012
    spec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      capacity:
        storage: 20Gi
      hostPath:
        path: /data/pv012/
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv021
    spec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      capacity:
        storage: 20Gi
      hostPath:
        path: /data/pv021/
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv022
    spec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      capacity:
        storage: 20Gi
      hostPath:
        path: /data/pv022/

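Since these PersistentVolumes use hostPath, the corresponding directories must already exist on the node that will run the pods. A minimal sketch, assuming a single-node test cluster such as minikube (adjust for your environment):

    # Create the hostPath directories on the node (single-node/minikube assumption)
    minikube ssh "sudo mkdir -p /data/pv011 /data/pv012 /data/pv021 /data/pv022"
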
Apply the 4 persistent volumes:

    kubectl apply -f ./pvs.yaml
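
As a quick sanity check, list the volumes; all four should report STATUS Available (they switch to Bound once a claim uses them):

    # Verify that the four PersistentVolumes were registered
    kubectl get pv pv011 pv012 pv021 pv022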

Next, let's look at the NameServer deployment:

    apiVersion: v1
    kind: Service
    metadata:
      name: rn
    spec:
      ports:
      - port: 9876
      selector:
        app: rocketmq-nameserver
      clusterIP: None
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: rocketmq-nameserver
    spec:
      selector:
        matchLabels:
          app: rocketmq-nameserver
      serviceName: "rn"
      replicas: 2
      template:
        metadata:
          labels:
            app: rocketmq-nameserver
        spec:
          restartPolicy: Always
          hostname: rocketmq-nameserver
          containers:
          - name: rocketmq-nameserver-ct
            imagePullPolicy: Never
            image: coder4/rocketmq:4.2.0
            ports:
            - containerPort: 9876
            volumeMounts:
            - mountPath: /opt/rocketmq_home
              name: rocketmq-nameserver-pvc
            env:
            - name: NAME_SERVER
              value: "true"
      volumeClaimTemplates:
      - metadata:
          name: rocketmq-nameserver-pvc
        spec:
          storageClassName: standard
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 20Gi

As shown above:

  • We use the custom image coder4/rocketmq, which bundles the NameServer and the Broker and supports cluster deployment.
  • A StatefulSet deploys two independent NameServers, whose hostnames are rocketmq-nameserver-0 and rocketmq-nameserver-1.
  • The volume is mounted at /opt/rocketmq_home, which will contain the data and log subdirectories.

Start it up; after a short while the pods are running:

    kubectl apply -f ./nameserver-service.yaml
    kubectl get pods
    NAME                    READY   STATUS    RESTARTS   AGE
    rocketmq-nameserver-0   1/1     Running   0          2m
    rocketmq-nameserver-1   1/1     Running   0          2m
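
Before moving on to the Brokers, it is worth checking that each NameServer actually started and that its per-pod DNS name resolves inside the cluster. A minimal sketch, assuming the image forwards the NameServer's startup output to stdout and using a busybox:1.28 pod for nslookup (both are assumptions, not part of the setup above):

    # The NameServer prints a "boot success" line when it starts
    kubectl logs rocketmq-nameserver-0 | grep "boot success"
    # Resolve the per-pod DNS name exposed by the headless service "rn"
    kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- \
      nslookup rocketmq-nameserver-0.rn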

Next, let's look at the Broker. The official project provides several high-availability schemes for Brokers; here we use the "two masters, no slave" (2m-noslave) mode. For other modes, refer to the official documentation.

    apiVersion: v1
    kind: Service
    metadata:
      name: rb
    spec:
      ports:
      - name: p9
        port: 10909
      - name: p11
        port: 10911
      selector:
        app: rocketmq-brokerserver
      clusterIP: None
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: rocketmq-brokerserver
    spec:
      selector:
        matchLabels:
          app: rocketmq-brokerserver
      serviceName: "rb"
      replicas: 2
      template:
        metadata:
          labels:
            app: rocketmq-brokerserver
        spec:
          restartPolicy: Always
          hostname: rocketmq-brokerserver
          containers:
          - name: rocketmq-brokerserver-ct
            imagePullPolicy: Never
            image: coder4/rocketmq:4.2.0
            ports:
            - containerPort: 10909
            - containerPort: 10911
            volumeMounts:
            - mountPath: /opt/rocketmq_home
              name: rocketmq-brokerserver-pvc
            env:
            - name: NAME_SERVER_LIST
              value: "rocketmq-nameserver-0.rn:9876;rocketmq-nameserver-1.rn:9876"
            - name: BROKER_SERVER
              value: "true"
            - name: BROKER_CLUSTER_CONF
              value: "2m-noslave"
      volumeClaimTemplates:
      - metadata:
          name: rocketmq-brokerserver-pvc
        spec:
          storageClassName: standard
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 20Gi

The broker server configuration above differs from the nameserver one in the following ways:

  • The environment variable NAME_SERVER_LIST sets the list of NameServers in the cluster, i.e. the two instances started earlier.
  • BROKER_SERVER indicates that the container runs in broker server mode.
  • BROKER_CLUSTER_CONF selects the cluster configuration template, i.e. the "two masters, no slave" mode mentioned above (see the sketch after this list).
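
For reference, 2m-noslave corresponds to the configuration templates shipped under conf/2m-noslave/ in the official RocketMQ distribution, which the custom image presumably picks up. Roughly, broker-a.properties looks like the sketch below (broker-b.properties is identical except for brokerName; exact contents may differ between releases):

    cat conf/2m-noslave/broker-a.properties
    # typical contents (sketch):
    brokerClusterName=DefaultCluster
    brokerName=broker-a
    brokerId=0
    deleteWhen=04
    fileReservedTime=48
    brokerRole=ASYNC_MASTER
    flushDiskType=ASYNC_FLUSH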

Now start the broker servers as well; after a short wait they come up:

    kubectl apply -f ./broker-service.yaml
    kubectl get pods
    NAME                      READY   STATUS    RESTARTS   AGE
    rocketmq-brokerserver-0   1/1     Running   0          59s
    rocketmq-brokerserver-1   0/1     Running   0          49s
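
As with the NameServers, each broker's logs can be checked once its pod is ready; a minimal check, again assuming the image forwards the broker's startup output to stdout:

    # Each broker prints a line like "The broker[broker-a, ...] boot success" on startup
    kubectl logs rocketmq-brokerserver-0 | grep "boot success"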

Let's use RocketMQ's bundled mqadmin tool to check the broker cluster status. Both brokers show up, which means the cluster deployment succeeded:

    ./mqadmin clusterList -n "rocketmq-nameserver-0.rn:9876;rocketmq-nameserver-1.rn:9876"
    #Cluster Name   #Broker Name  #BID  #Addr              #Version         #InTPS(LOAD)  #OutTPS(LOAD)  #PCWait(ms)  #Hour      #SPACE
    DefaultCluster  broker-a      0     172.17.0.15:10911  V4_2_0_SNAPSHOT  0.00(0,0ms)   0.00(0,0ms)    0            425363.14  -1.0000
    DefaultCluster  broker-b      0     172.17.0.16:10911  V4_2_0_SNAPSHOT  0.00(0,0ms)   0.00(0,0ms)    0            425363.14  -1.0000
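
As an optional follow-up check, we can create a topic through mqadmin and confirm that it is visible via the name servers. The topic name TestTopic below is just an illustrative choice:

    # Create a test topic on both masters of DefaultCluster
    ./mqadmin updateTopic -n "rocketmq-nameserver-0.rn:9876;rocketmq-nameserver-1.rn:9876" \
      -c DefaultCluster -t TestTopic
    # List topics known to the name servers; TestTopic should appear
    ./mqadmin topicList -n "rocketmq-nameserver-0.rn:9876;rocketmq-nameserver-1.rn:9876"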

With that, the RocketMQ cluster deployment is complete.

[^1]: Kafka vs. RocketMQ: Multiple Topic Stress Test Results