Alarm

EMQX Broker has built-in monitoring and alarm functions. It currently monitors CPU usage, system and process memory usage, the number of processes, rule engine resource status, and cluster partition and healing, and it can raise alarms for each of them. Both the activation and the deactivation of an alarm generate an alarm log entry, and the Broker publishes an MQTT message to the topic $SYS/brokers/<Node>/alarms/activate or $SYS/brokers/<Node>/alarms/deactivate. Users can subscribe to the topics $SYS/brokers/+/alarms/activate and $SYS/brokers/+/alarms/deactivate to receive alarm notifications.
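As a rough illustration of how the `+` wildcard in these subscriptions lines up with the per-node topics, the sketch below matches an incoming topic against the two filters and extracts the node name. It uses only the Python standard library. The function names (`topic_matches`, `parse_alarm_topic`) and the node name in the usage note are assumptions made for this example, and the matcher is a simplified version of MQTT topic matching, not the Broker's own implementation.

```python
from typing import Optional, Tuple

# The two subscription filters named in the text above.
ALARM_FILTERS = (
    "$SYS/brokers/+/alarms/activate",
    "$SYS/brokers/+/alarms/deactivate",
)


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT matching: '+' matches one level, '#' matches the rest."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard covers everything that remains
        if i >= len(topic_levels):
            return False  # the topic is shorter than the filter
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    return len(filter_levels) == len(topic_levels)


def parse_alarm_topic(topic: str) -> Optional[Tuple[str, str]]:
    """Return (node, event) for an alarm topic, or None for any other topic."""
    for topic_filter in ALARM_FILTERS:
        if topic_matches(topic_filter, topic):
            levels = topic.split("/")
            return levels[2], levels[4]  # <Node> and activate/deactivate
    return None
```

For example, `parse_alarm_topic("$SYS/brokers/emqx@127.0.0.1/alarms/activate")` yields the node name and the event `"activate"`, while a topic outside the alarm namespace yields `None`.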

The payload of the alarm notification message is in JSON format and contains the following fields:

| Field | Type | Description |
| --- | --- | --- |
| name | string | Alarm name |
| details | object | Alarm details |
| message | string | Human-readable alarm instructions |
| activate_at | integer | The time to activate the alarm, UNIX timestamp in microseconds |
| deactivate_at | integer / string | The time to deactivate the alarm, the UNIX timestamp in microseconds. The value of this field for the activated alarm is infinity |
| activated | boolean | Whether the alarm is activated |

Taking the alarm of high system memory usage as an example, you will receive a message in the following format:
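The sketch below shows one way a subscriber might decode such a payload. The field names come from the table above. The sample values (alarm name, details, message text, timestamp) are invented for illustration and are not captured Broker output. The `Alarm` dataclass and `parse_alarm` helper are hypothetical names introduced here.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Hypothetical payload: field names follow the table above, values are invented.
SAMPLE_PAYLOAD = """
{
  "name": "high_system_memory_usage",
  "details": {"high_watermark": 70},
  "message": "System memory usage is higher than 70%",
  "activate_at": 1600000000000000,
  "deactivate_at": "infinity",
  "activated": true
}
"""


@dataclass
class Alarm:
    name: str
    details: Dict[str, Any]
    message: str
    activate_at: datetime
    deactivate_at: Optional[datetime]  # None while the alarm is still activated
    activated: bool


def _from_micros(value: int) -> datetime:
    """Convert a UNIX timestamp in microseconds to an aware UTC datetime."""
    return datetime.fromtimestamp(value / 1_000_000, tz=timezone.utc)


def parse_alarm(payload: str) -> Alarm:
    raw = json.loads(payload)
    deactivate_at = raw["deactivate_at"]
    return Alarm(
        name=raw["name"],
        details=raw["details"],
        message=raw["message"],
        activate_at=_from_micros(raw["activate_at"]),
        # An activated alarm carries the string "infinity" instead of a timestamp.
        deactivate_at=None if deactivate_at == "infinity" else _from_micros(deactivate_at),
        activated=raw["activated"],
    )


alarm = parse_alarm(SAMPLE_PAYLOAD)
```

Mapping `infinity` to `None` keeps `deactivate_at` a single type on the consumer side, which is a design choice of this sketch rather than anything the Broker prescribes.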

An alarm is not raised repeatedly. For example, once the high CPU usage alarm has been activated, the same alarm will not be raised again while it remains active. The alarm is deactivated automatically when the monitored item returns to normal, and users can also deactivate it manually if they do not care about it. Users can view current alarms (activated alarms) and historical alarms (deactivated alarms) on the Dashboard, and they can also query and manage alarms through the HTTP API provided by EMQX Broker.
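The lifecycle described above can be sketched as a small in-memory registry: activating an already active alarm is a no-op, and deactivating moves the alarm from the current set to the history. This is a minimal model for illustration only. The `AlarmRegistry` class and its method names are assumptions and do not reflect the Broker's internal code or its HTTP API.

```python
from typing import Dict, List


class AlarmRegistry:
    """Toy model of current (activated) and historical (deactivated) alarms."""

    def __init__(self) -> None:
        self.active: Dict[str, dict] = {}
        self.history: List[dict] = []

    def activate(self, name: str, now_us: int) -> bool:
        """Activate an alarm; return False if it is already active."""
        if name in self.active:
            return False  # the same alarm is not raised twice
        self.active[name] = {
            "name": name,
            "activate_at": now_us,
            "deactivate_at": "infinity",
            "activated": True,
        }
        return True

    def deactivate(self, name: str, now_us: int) -> bool:
        """Deactivate an alarm (automatic or manual); return False if not active."""
        alarm = self.active.pop(name, None)
        if alarm is None:
            return False
        alarm["deactivate_at"] = now_us
        alarm["activated"] = False
        self.history.append(alarm)
        return True
```

With this model, a second `activate("high_cpu_usage", ...)` call returns `False` until `deactivate` has been called for that name, after which the alarm can be raised again.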

EMQX Broker allows users to adjust the alarm function to a certain extent to meet their needs. The following configuration items are currently available:

| Configuration item | Type | Default value | Description |
| --- | --- | --- | --- |
| os_mon.cpu_check_interval | duration | 60s | Check interval for CPU usage |
| os_mon.cpu_high_watermark | percent | 80% | The high watermark of the CPU usage, which is the threshold to activate the alarm |
| os_mon.cpu_low_watermark | percent | 60% | The low watermark of the CPU usage, which is the threshold to deactivate the alarm |
| os_mon.mem_check_interval | duration | 60s | Check interval for memory usage |
| os_mon.sysmem_high_watermark | percent | 70% | The high watermark of the system memory usage. The alarm is activated when the total memory occupied by the application reaches this value |
| os_mon.procmem_high_watermark | percent | 5% | The high watermark of the process memory usage. The alarm is activated when the memory occupied by a single process reaches this value |
| vm_mon.check_interval | duration | 30s | Check interval for the number of processes |
| vm_mon.process_high_watermark | percent | 80% | The high watermark of the process occupancy rate. The alarm is activated when the ratio of the number of created processes to the maximum number limit reaches this value |
| vm_mon.process_low_watermark | percent | 60% | The low watermark of the process occupancy rate. The alarm is deactivated when the ratio of the number of created processes to the maximum number limit drops to this value |
| alarm.actions | string | log,publish | The actions triggered when an alarm is activated. Only log and publish are currently supported, which write a log entry and publish a system message respectively |
| alarm.size_limit | integer | 1000 | The maximum number of deactivated alarms that are kept. After the limit is reached, the oldest alarms are cleared first (FIFO) |
| alarm.validity_period | duration | 24h | The maximum storage time of deactivated alarms. Expired alarms are cleared |
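To make the interaction between alarm.size_limit and alarm.validity_period concrete, the sketch below models the store of deactivated alarms as a queue that drops the oldest entries first when it is full and discards entries older than the validity period. The defaults mirror the table. The `DeactivatedAlarms` class is a hypothetical illustration of the retention rules, not the Broker's storage code.

```python
from collections import deque
from typing import Deque, List, Tuple

SIZE_LIMIT = 1000                    # alarm.size_limit default
VALIDITY_PERIOD_S = 24 * 60 * 60     # alarm.validity_period default (24h)


class DeactivatedAlarms:
    """Toy retention model: FIFO eviction plus age-based expiry."""

    def __init__(self, size_limit: int = SIZE_LIMIT,
                 validity_period_s: float = VALIDITY_PERIOD_S) -> None:
        self.size_limit = size_limit
        self.validity_period_s = validity_period_s
        # (deactivation time in seconds, alarm name), oldest entry on the left.
        self._items: Deque[Tuple[float, str]] = deque()

    def add(self, name: str, deactivated_at: float) -> None:
        """Store a deactivated alarm, evicting the oldest ones if over the limit."""
        self._items.append((deactivated_at, name))
        while len(self._items) > self.size_limit:
            self._items.popleft()

    def expire(self, now: float) -> None:
        """Drop alarms that were deactivated longer ago than the validity period."""
        while self._items and now - self._items[0][0] > self.validity_period_s:
            self._items.popleft()

    def names(self) -> List[str]:
        return [name for _, name in self._items]
```

The sketch assumes alarms are added in order of deactivation time, which is what lets both rules operate on the left end of the queue.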

EMQX Enterprise raises an alarm when the license expires in less than 30 days, or when the number of connections exceeds the high watermark. Users can adjust the high and low watermarks for the number of connections according to their actual situation.

| Configuration item | Type | Default value | Description |
| --- | --- | --- | --- |
| license.connection_high_watermark_alarm | percent | 80% | The high watermark of the license's maximum connections, as a percentage of alive connections / max connections. The alarm is activated when this threshold is reached |
| license.connection_low_watermark_alarm | percent | 75% | The low watermark of the license's maximum connections, as a percentage of alive connections / max connections. The alarm is deactivated when the ratio goes below this threshold |
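The gap between the high and low watermarks gives the alarm hysteresis, so it does not flap when the ratio hovers around a single threshold. The sketch below illustrates that behavior using the default 80% and 75% values from the table. The `WatermarkAlarm` class, its `update` method, and the sample connection counts are assumptions made for this example, not part of EMQX.

```python
from typing import Optional


class WatermarkAlarm:
    """Two-threshold alarm: activate at `high`, deactivate below `low`."""

    def __init__(self, high: float, low: float) -> None:
        if not 0 <= low <= high <= 1:
            raise ValueError("expected 0 <= low <= high <= 1")
        self.high = high
        self.low = low
        self.active = False

    def update(self, ratio: float) -> Optional[str]:
        """Feed one sample; return 'activate', 'deactivate', or None if unchanged."""
        if not self.active and ratio >= self.high:
            self.active = True
            return "activate"
        if self.active and ratio < self.low:
            self.active = False
            return "deactivate"
        return None


# Hypothetical samples: alive connections against a license limit of 1000.
MAX_CONNECTIONS = 1000
alarm = WatermarkAlarm(high=0.80, low=0.75)
events = [alarm.update(alive / MAX_CONNECTIONS) for alive in (700, 800, 900, 760, 740)]
```

In this run the alarm activates at 800 connections, stays active at 900 and 760 because the ratio is still at or above 75%, and deactivates only at 740.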