Monitoring bare-metal events with the Bare Metal Event Relay
Bare Metal Event Relay is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.
About bare-metal events
Use the Bare Metal Event Relay to subscribe applications that run in your OKD cluster to events that are generated on the underlying bare-metal host. The Redfish service publishes events on a node and transmits them on an advanced message queue to subscribed applications.
Bare-metal events are based on the open Redfish standard that is developed under the guidance of the Distributed Management Task Force (DMTF). Redfish provides a secure industry-standard protocol with a REST API. The protocol is used for the management of distributed, converged or software-defined resources and infrastructure.
Hardware-related events published through Redfish include:
Breaches of temperature limits
Server status
Fan status
Begin using bare-metal events by deploying the Bare Metal Event Relay Operator and subscribing your application to the service. The Bare Metal Event Relay Operator installs and manages the lifecycle of the Redfish bare-metal event service.
The Bare Metal Event Relay works only with Redfish-capable devices on single-node clusters provisioned on bare-metal infrastructure.
How bare-metal events work
The Bare Metal Event Relay enables applications running on bare-metal clusters to respond quickly to Redfish hardware changes and failures such as breaches of temperature thresholds, fan failure, disk loss, power outages, and memory failure. These hardware events are delivered over a reliable low-latency transport channel based on Advanced Message Queuing Protocol (AMQP). The latency of the messaging service is between 10 and 20 milliseconds.
The Bare Metal Event Relay provides a publish-subscribe service for the hardware events, where multiple applications can use REST APIs to subscribe and consume the events. The Bare Metal Event Relay supports hardware that complies with Redfish OpenAPI v1.8 or higher.
Bare Metal Event Relay data flow
The following figure illustrates an example of bare-metal events data flow:
Figure 1. Bare Metal Event Relay data flow
Operator-managed pod
The Operator uses the HardwareEvent custom resource (CR) to manage the pod that contains the Bare Metal Event Relay and its components.
Bare Metal Event Relay
At startup, the Bare Metal Event Relay queries the Redfish API and downloads all the message registries, including custom registries. The Bare Metal Event Relay then begins to receive subscribed events from the Redfish hardware.
The Bare Metal Event Relay enables applications running on bare-metal clusters to respond quickly to Redfish hardware changes and failures such as breaches of temperature thresholds, fan failure, disk loss, power outages, and memory failure. The events are reported using the HardwareEvent CR.
Cloud native event
Cloud native events (CNE) is a REST API specification for defining the format of event data.
CNCF CloudEvents
CloudEvents is a vendor-neutral specification developed by the Cloud Native Computing Foundation (CNCF) for defining the format of event data.
AMQP dispatch router
The dispatch router is responsible for the message delivery service between publisher and subscriber. AMQP 1.0 qpid is an open standard that supports reliable, high-performance, fully-symmetrical messaging over the internet.
Cloud event proxy sidecar
The cloud event proxy sidecar container image is based on the ORAN API specification and provides a publish-subscribe event framework for hardware events.
Redfish message parsing service
In addition to handling Redfish events, the Bare Metal Event Relay provides message parsing for events without a Message property. The proxy downloads all the Redfish message registries, including vendor-specific registries, from the hardware when it starts. If an event does not contain a Message property, the proxy uses the Redfish message registries to construct the Message and Resolution properties and adds them to the event before passing the event to the cloud events framework. This service allows Redfish events to have a smaller message size and lower transmission latency.
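The parsing step described above can be sketched in Python. This is a minimal illustration, not the Bare Metal Event Relay implementation: the function name fill_message and the shape of the registries mapping (keyed by registry prefix) are assumptions made for this example. The MessageId layout (prefix, version, message key) and the %1, %2 argument placeholders follow the Redfish message registry conventions.

```python
import re


def fill_message(event, registries):
    """Return a copy of a Redfish event record with Message and Resolution filled in.

    `registries` is a hypothetical mapping of registry prefix to a downloaded
    registry document that contains a "Messages" object.
    """
    filled = dict(event)
    if filled.get("Message"):
        return filled  # The event already carries a Message; nothing to do.

    # A MessageId looks like <RegistryPrefix>.<Major>.<Minor>.<MessageKey>.
    parts = filled.get("MessageId", "").split(".")
    if len(parts) < 2:
        return filled
    prefix, key = parts[0], parts[-1]

    entry = registries.get(prefix, {}).get("Messages", {}).get(key)
    if entry is None:
        return filled  # Unknown message: pass the event through unchanged.

    args = filled.get("MessageArgs", [])

    def substitute(match):
        # Placeholders are 1-based: %1 is the first entry in MessageArgs.
        index = int(match.group(1)) - 1
        return str(args[index]) if 0 <= index < len(args) else match.group(0)

    filled["Message"] = re.sub(r"%(\d+)", substitute, entry.get("Message", ""))
    if "Resolution" in entry:
        filled["Resolution"] = entry["Resolution"]
    return filled
```

Because the Message text is rebuilt on the receiving side, the hardware only needs to send the MessageId and its arguments, which is where the smaller message size comes from.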
Installing the Bare Metal Event Relay using the CLI
As a cluster administrator, you can install the Bare Metal Event Relay Operator by using the CLI.
Prerequisites
A cluster that is installed on bare-metal hardware with nodes that have a Redfish-enabled Baseboard Management Controller (BMC).
Install the OpenShift CLI (oc).
Log in as a user with cluster-admin privileges.
Procedure
Create a namespace for the Bare Metal Event Relay.
Save the following YAML in the bare-metal-events-namespace.yaml file:
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-bare-metal-events
  labels:
    name: openshift-bare-metal-events
    openshift.io/cluster-monitoring: "true"
Create the Namespace CR:
$ oc create -f bare-metal-events-namespace.yaml
Create an Operator group for the Bare Metal Event Relay Operator.
Save the following YAML in the bare-metal-events-operatorgroup.yaml file:
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: bare-metal-event-relay-group
  namespace: openshift-bare-metal-events
spec:
  targetNamespaces:
  - openshift-bare-metal-events
Create the OperatorGroup CR:
$ oc create -f bare-metal-events-operatorgroup.yaml
Subscribe to the Bare Metal Event Relay.
Save the following YAML in the bare-metal-events-sub.yaml file:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: bare-metal-event-relay-subscription
  namespace: openshift-bare-metal-events
spec:
  channel: "stable"
  name: bare-metal-event-relay
  source: redhat-operators
  sourceNamespace: openshift-marketplace
Create the Subscription CR:
$ oc create -f bare-metal-events-sub.yaml
Verification
To verify that the Bare Metal Event Relay Operator is installed, run the following command:
$ oc get csv -n openshift-bare-metal-events -o custom-columns=Name:.metadata.name,Phase:.status.phase
Example output
Name Phase
bare-metal-event-relay.4.10.0-202206301927 Succeeded
Installing the Bare Metal Event Relay using the web console
As a cluster administrator, you can install the Bare Metal Event Relay Operator using the web console.
Prerequisites
A cluster that is installed on bare-metal hardware with nodes that have a Redfish-enabled Baseboard Management Controller (BMC).
Log in as a user with cluster-admin privileges.
Procedure
Install the Bare Metal Event Relay using the OKD web console:
In the OKD web console, click Operators → OperatorHub.
Choose Bare Metal Event Relay from the list of available Operators, and then click Install.
On the Install Operator page, select or create a Namespace, select openshift-bare-metal-events, and then click Install.
Verification
Optional: You can verify that the Operator installed successfully by performing the following check:
Switch to the Operators → Installed Operators page.
Ensure that Bare Metal Event Relay is listed in the project with a Status of InstallSucceeded.
During installation an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
If the Operator does not appear as installed, troubleshoot further:
Go to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
Go to the Workloads → Pods page and check the logs for pods in the project namespace.
Installing the AMQ messaging bus
To pass Redfish bare-metal event notifications between publisher and subscriber on a node, you must install and configure an AMQ messaging bus to run locally on the node. You do this by installing the AMQ Interconnect Operator for use in the cluster.
Prerequisites
Install the OKD CLI (oc).
Log in as a user with cluster-admin privileges.
Procedure
- Install the AMQ Interconnect Operator to its own amq-interconnect namespace. See Installing the AMQ Interconnect Operator.
Verification
Verify that the AMQ Interconnect Operator is available and the required pods are running:
$ oc get pods -n amq-interconnect
Example output
NAME READY STATUS RESTARTS AGE
amq-interconnect-645db76c76-k8ghs 1/1 Running 0 23h
interconnect-operator-5cb5fc7cc-4v7qm 1/1 Running 0 23h
Verify that the required bare-metal-event-relay bare-metal event producer pod is running in the openshift-bare-metal-events namespace:
$ oc get pods -n openshift-bare-metal-events
Example output
NAME READY STATUS RESTARTS AGE
hw-event-proxy-operator-controller-manager-74d5649b7c-dzgtl 2/2 Running 0 25s
Subscribing to Redfish BMC bare-metal events for a cluster node
As a cluster administrator, you can subscribe to Redfish BMC events generated on a node in your cluster by creating a BMCEventSubscription custom resource (CR) for the node, a HardwareEvent CR for the event, and a Secret CR for the BMC.
Subscribing to bare-metal events
You can configure the baseboard management controller (BMC) to send bare-metal events to subscribed applications running in an OKD cluster. Example Redfish bare-metal events include an increase in device temperature, or removal of a device. You subscribe applications to bare-metal events using a REST API.
You can only create a |
Use the |
Perform the following procedure to subscribe to bare-metal events for the node using a BMCEventSubscription CR.
Prerequisites
Install the OpenShift CLI (oc).
Log in as a user with cluster-admin privileges.
Get the user name and password for the BMC.
Deploy a bare-metal node with a Redfish-enabled Baseboard Management Controller (BMC) in your cluster, and enable Redfish events on the BMC.
Enabling Redfish events on specific hardware is outside the scope of this information. For more information about enabling Redfish events for your specific hardware, consult the BMC manufacturer documentation.
Procedure
Confirm that the node hardware has the Redfish EventService enabled by running the following curl command:
$ curl https://<bmc_ip_address>/redfish/v1/EventService --insecure -H 'Content-Type: application/json' -u "<bmc_username>:<password>"
where:
bmc_ip_address is the IP address of the BMC where the Redfish events are generated.
Example output
{
"@odata.context": "/redfish/v1/$metadata#EventService.EventService",
"@odata.id": "/redfish/v1/EventService",
"@odata.type": "#EventService.v1_0_2.EventService",
"Actions": {
"#EventService.SubmitTestEvent": {
"EventType@Redfish.AllowableValues": ["StatusChange", "ResourceUpdated", "ResourceAdded", "ResourceRemoved", "Alert"],
"target": "/redfish/v1/EventService/Actions/EventService.SubmitTestEvent"
}
},
"DeliveryRetryAttempts": 3,
"DeliveryRetryIntervalSeconds": 30,
"Description": "Event Service represents the properties for the service",
"EventTypesForSubscription": ["StatusChange", "ResourceUpdated", "ResourceAdded", "ResourceRemoved", "Alert"],
"EventTypesForSubscription@odata.count": 5,
"Id": "EventService",
"Name": "Event Service",
"ServiceEnabled": true,
"Status": {
"Health": "OK",
"HealthRollup": "OK",
"State": "Enabled"
},
"Subscriptions": {
"@odata.id": "/redfish/v1/EventService/Subscriptions"
}
}
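If you script this check, the fields that matter in the response can be tested directly. The following Python sketch assumes that the response body has already been fetched as a string; the function name event_service_ready and the default of checking for the Alert event type are choices made for this example only.

```python
import json


def event_service_ready(body, wanted_type="Alert"):
    """Return True if an EventService response reports an enabled service
    that accepts subscriptions for the wanted event type."""
    doc = json.loads(body)
    enabled = doc.get("ServiceEnabled") is True
    state_ok = doc.get("Status", {}).get("State") == "Enabled"
    supported = wanted_type in doc.get("EventTypesForSubscription", [])
    return enabled and state_ok and supported
```

For the example output above, the function returns True, because ServiceEnabled is true, the State is Enabled, and Alert is listed in EventTypesForSubscription.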
Get the Bare Metal Event Relay service route for the cluster by running the following command:
$ oc get route -n openshift-bare-metal-events
Example output
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
hw-event-proxy hw-event-proxy-openshift-bare-metal-events.apps.compute-1.example.com hw-event-proxy-service 9087 edge None
Create a BMCEventSubscription resource to subscribe to the Redfish events:
Save the following YAML in the bmc_sub.yaml file:
apiVersion: metal3.io/v1alpha1
kind: BMCEventSubscription
metadata:
  name: sub-01
  namespace: openshift-machine-api
spec:
  hostName: <hostname> (1)
  destination: <proxy_service_url> (2)
  context: ''
1 Specifies the name or UUID of the worker node where the Redfish events are generated.
2 Specifies the bare-metal event proxy service, for example, https://hw-event-proxy-openshift-bare-metal-events.apps.compute-1.example.com/webhook.
Create the BMCEventSubscription CR:
$ oc create -f bmc_sub.yaml
Optional: To delete the BMC event subscription, run the following command:
$ oc delete -f bmc_sub.yaml
Optional: To manually create a Redfish event subscription without creating a BMCEventSubscription CR, run the following curl command, specifying the BMC username and password.
$ curl -i -k -X POST -H "Content-Type: application/json" -d '{"Destination": "https://<proxy_service_url>", "Protocol" : "Redfish", "EventTypes": ["Alert"], "Context": "root"}' -u <bmc_username>:<password> 'https://<bmc_ip_address>/redfish/v1/EventService/Subscriptions' -v
where:
proxy_service_url is the bare-metal event proxy service, for example, https://hw-event-proxy-openshift-bare-metal-events.apps.compute-1.example.com/webhook.
bmc_ip_address is the IP address of the BMC where the Redfish events are generated.
Example output
HTTP/1.1 201 Created
Server: AMI MegaRAC Redfish Service
Location: /redfish/v1/EventService/Subscriptions/1
Allow: GET, POST
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: X-Auth-Token
Access-Control-Allow-Headers: X-Auth-Token
Access-Control-Allow-Credentials: true
Cache-Control: no-cache, must-revalidate
Link: <http://redfish.dmtf.org/schemas/v1/EventDestination.v1_6_0.json>; rel=describedby
Link: <http://redfish.dmtf.org/schemas/v1/EventDestination.v1_6_0.json>
Link: </redfish/v1/EventService/Subscriptions>; path=
ETag: "1651135676"
Content-Type: application/json; charset=UTF-8
OData-Version: 4.0
Content-Length: 614
Date: Thu, 28 Apr 2022 08:47:57 GMT
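The same subscription request can be assembled programmatically. The following Python sketch builds the request with the standard library and does not send it. The function name build_subscription_request and its parameters are hypothetical; the payload fields and the endpoint path are taken from the curl command above.

```python
import base64
import json
import urllib.request


def build_subscription_request(bmc_address, username, password, destination,
                               event_types=("Alert",), context="root"):
    """Build (but do not send) a POST that creates a Redfish event subscription."""
    payload = {
        "Destination": destination,
        "Protocol": "Redfish",
        "EventTypes": list(event_types),
        "Context": context,
    }
    # HTTP basic authentication, equivalent to curl -u <bmc_username>:<password>.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{bmc_address}/redfish/v1/EventService/Subscriptions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before anything reaches the BMC.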
Querying Redfish bare-metal event subscriptions with curl
Some hardware vendors limit the number of Redfish hardware event subscriptions. You can query the number of Redfish event subscriptions by using curl.
Prerequisites
Get the user name and password for the BMC.
Deploy a bare-metal node with a Redfish-enabled Baseboard Management Controller (BMC) in your cluster, and enable Redfish hardware events on the BMC.
Procedure
Check the current subscriptions for the BMC by running the following curl command:
$ curl --globoff -H "Content-Type: application/json" -k -X GET --user <bmc_username>:<password> https://<bmc_ip_address>/redfish/v1/EventService/Subscriptions
where:
bmc_ip_address is the IP address of the BMC where the Redfish events are generated.
Example output
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 435 100 435 0 0 399 0 0:00:01 0:00:01 --:--:-- 399
{
"@odata.context": "/redfish/v1/$metadata#EventDestinationCollection.EventDestinationCollection",
"@odata.etag": "\"1651137375\"",
"@odata.id": "/redfish/v1/EventService/Subscriptions",
"@odata.type": "#EventDestinationCollection.EventDestinationCollection",
"Description": "Collection for Event Subscriptions",
"Members": [
{
"@odata.id": "/redfish/v1/EventService/Subscriptions/1"
}],
"Members@odata.count": 1,
"Name": "Event Subscriptions Collection"
}
In this example, a single subscription is configured: /redfish/v1/EventService/Subscriptions/1.
Optional: To remove the /redfish/v1/EventService/Subscriptions/1 subscription with curl, run the following command, specifying the BMC username and password:
$ curl --globoff -L -w "%{http_code} %{url_effective}\n" -k -u <bmc_username>:<password> -H "Content-Type: application/json" -d '{}' -X DELETE https://<bmc_ip_address>/redfish/v1/EventService/Subscriptions/1
where:
bmc_ip_address is the IP address of the BMC where the Redfish events are generated.
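When the subscription limit matters, the collection response can be inspected in a script before a new subscription is created. The following Python sketch assumes that the JSON body of the subscriptions query is available as a string. The function names and the vendor_limit parameter are hypothetical; the actual limit depends on the hardware vendor.

```python
import json


def list_subscriptions(body):
    """Return the @odata.id of every member in a subscriptions collection."""
    doc = json.loads(body)
    return [m["@odata.id"] for m in doc.get("Members", []) if "@odata.id" in m]


def has_capacity(body, vendor_limit):
    """Return True if fewer than vendor_limit subscriptions are configured."""
    return len(list_subscriptions(body)) < vendor_limit
```

The sketch counts the Members entries instead of reading Members@odata.count, so it does not depend on that annotation being present in the response.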
Creating the bare-metal event and Secret CRs
To start using bare-metal events, create the HardwareEvent custom resource (CR) for the host where the Redfish hardware is present. Hardware events and faults are reported in the hw-event-proxy logs.
Prerequisites
Install the OpenShift CLI (oc).
Log in as a user with cluster-admin privileges.
Install the Bare Metal Event Relay.
Create a BMCEventSubscription CR for the BMC Redfish hardware.
Multiple |
Procedure
Create the HardwareEvent custom resource (CR):
Save the following YAML in the hw-event.yaml file:
apiVersion: "event.redhat-cne.org/v1alpha1"
kind: "HardwareEvent"
metadata:
  name: "hardware-event"
spec:
  nodeSelector:
    node-role.kubernetes.io/hw-event: "" (1)
  transportHost: "amqp://amq-router-service-name.amq-namespace.svc.cluster.local" (2)
  logLevel: "debug" (3)
  msgParserTimeout: "10" (4)
1 Required. Use the nodeSelector field to target nodes with the specified label, for example, node-role.kubernetes.io/hw-event: "".
2 Required. AMQP host that delivers the events at the transport layer using AMQP.
3 Optional. The default value is debug. Sets the log level in hw-event-proxy logs. The following log levels are available: fatal, error, warning, info, debug, trace.
4 Optional. Sets the timeout value in milliseconds for the Message Parser. If a message parsing request is not responded to within the timeout duration, the original hardware event message is passed to the cloud native event framework. The default value is 10.
Create the HardwareEvent CR:
$ oc create -f hw-event.yaml
Create a BMC username and password Secret CR that enables the hardware events proxy to access the Redfish message registry for the bare-metal host.
Save the following YAML in the hw-event-bmc-secret.yaml file:
apiVersion: v1
kind: Secret
metadata:
  name: redfish-basic-auth
type: Opaque
stringData: (1)
  username: <bmc_username>
  password: <bmc_password>
  # BMC host DNS or IP address
  hostaddr: <bmc_host_ip_address>
1 Enter plain text values for the various items under stringData.
Create the Secret CR:
$ oc create -f hw-event-bmc-secret.yaml
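Before you apply a HardwareEvent CR, it can help to check the spec values against the field descriptions in this procedure. The following Python sketch is an illustrative pre-flight check only and is not part of the Operator. The function name check_hardware_event_spec is hypothetical, and the rules encode only what is stated here: nodeSelector and transportHost are required, logLevel must be one of the listed levels, and msgParserTimeout is a number of milliseconds.

```python
ALLOWED_LOG_LEVELS = ("fatal", "error", "warning", "info", "debug", "trace")


def check_hardware_event_spec(spec):
    """Return a list of problems found in a HardwareEvent spec mapping."""
    problems = []
    for field in ("nodeSelector", "transportHost"):
        if not spec.get(field):
            problems.append(f"{field} is required")
    # logLevel is optional and defaults to debug.
    level = spec.get("logLevel", "debug")
    if level not in ALLOWED_LOG_LEVELS:
        problems.append(f"logLevel {level!r} is not one of {', '.join(ALLOWED_LOG_LEVELS)}")
    # msgParserTimeout is optional and defaults to 10 milliseconds.
    timeout = str(spec.get("msgParserTimeout", "10"))
    if not timeout.isdigit():
        problems.append("msgParserTimeout must be a whole number of milliseconds")
    return problems
```

An empty list means that the spec is consistent with the documented fields. It does not guarantee that the cluster accepts the CR.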
Subscribing applications to bare-metal events REST API reference
Use the bare-metal events REST API to subscribe an application to the bare-metal events that are generated on the parent node.
Subscribe applications to Redfish events by using the resource address /cluster/node/<node_name>/redfish/event, where <node_name> is the cluster node running the application.
Deploy your cloud-event-consumer application container and cloud-event-proxy sidecar container in a separate application pod. The cloud-event-consumer application subscribes to the cloud-event-proxy container in the application pod.
Use the following API endpoints to subscribe the cloud-event-consumer application to Redfish events posted by the cloud-event-proxy container at http://localhost:8089/api/cloudNotifications/v1/ in the application pod:
/api/cloudNotifications/v1/subscriptions
  POST: Creates a new subscription
  GET: Retrieves a list of subscriptions
/api/cloudNotifications/v1/subscriptions/<subscription_id>
  GET: Returns details for the specified subscription ID
/api/cloudNotifications/v1/subscriptions/status/<subscription_id>
  PUT: Creates a new status ping request for the specified subscription ID
/api/cloudNotifications/v1/health
  GET: Returns the health status of the cloudNotifications API
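A consumer application typically builds these endpoint URLs from a single base address. The following Python sketch shows one way to do that. The helper names are hypothetical; the base address and the paths are the ones listed above.

```python
from urllib.parse import quote, urljoin

# Address of the cloud-event-proxy container inside the application pod.
API_ROOT = "http://localhost:8089/api/cloudNotifications/v1/"


def subscriptions_url(subscription_id=None):
    """URL of the subscription collection, or of one subscription if an ID is given."""
    url = urljoin(API_ROOT, "subscriptions")
    if subscription_id is not None:
        url += "/" + quote(subscription_id, safe="")
    return url


def status_url(subscription_id):
    """URL used to request a status ping for one subscription."""
    return urljoin(API_ROOT, "subscriptions/status/" + quote(subscription_id, safe=""))


def health_url():
    """URL of the API health check."""
    return urljoin(API_ROOT, "health")
```

The subscription ID is percent-encoded so that an unexpected character in the ID cannot change the path of the request.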
api/cloudNotifications/v1/subscriptions
HTTP method
GET api/cloudNotifications/v1/subscriptions
Description
Returns a list of subscriptions. If subscriptions exist, a 200 OK status code is returned along with the list of subscriptions.
Example API response
[
{
"id": "ca11ab76-86f9-428c-8d3a-666c24e34d32",
"endpointUri": "http://localhost:9089/api/cloudNotifications/v1/dummy",
"uriLocation": "http://localhost:8089/api/cloudNotifications/v1/subscriptions/ca11ab76-86f9-428c-8d3a-666c24e34d32",
"resource": "/cluster/node/openshift-worker-0.openshift.example.com/redfish/event"
}
]
HTTP method
POST api/cloudNotifications/v1/subscriptions
Description
Creates a new subscription. If a subscription is successfully created, or if it already exists, a 201 Created status code is returned.
Parameter | Type |
---|---|
subscription | data |
Example payload
{
"uriLocation": "http://localhost:8089/api/cloudNotifications/v1/subscriptions",
"resource": "/cluster/node/openshift-worker-0.openshift.example.com/redfish/event"
}
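A consumer application can assemble this payload from the node name. The following Python sketch builds the POST request with the standard library and does not send it. The function name build_consumer_subscription is hypothetical, and the payload field names mirror the example payload above.

```python
import json
import urllib.request


def build_consumer_subscription(node_name,
                                api_root="http://localhost:8089/api/cloudNotifications/v1/"):
    """Build (but do not send) the POST that subscribes to Redfish events for a node."""
    payload = {
        "uriLocation": api_root + "subscriptions",
        # Resource address format: /cluster/node/<node_name>/redfish/event
        "resource": f"/cluster/node/{node_name}/redfish/event",
    }
    return urllib.request.Request(
        url=api_root + "subscriptions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Calling build_consumer_subscription("openshift-worker-0.openshift.example.com") produces a request whose body matches the example payload.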
api/cloudNotifications/v1/subscriptions/<subscription_id>
HTTP method
GET api/cloudNotifications/v1/subscriptions/<subscription_id>
Description
Returns details for the subscription with ID <subscription_id>.
Parameter | Type |
---|---|
<subscription_id> | string |
Example API response
{
"id":"ca11ab76-86f9-428c-8d3a-666c24e34d32",
"endpointUri":"http://localhost:9089/api/cloudNotifications/v1/dummy",
"uriLocation":"http://localhost:8089/api/cloudNotifications/v1/subscriptions/ca11ab76-86f9-428c-8d3a-666c24e34d32",
"resource":"/cluster/node/openshift-worker-0.openshift.example.com/redfish/event"
}
api/cloudNotifications/v1/subscriptions/status/<subscription_id>
HTTP method
PUT api/cloudNotifications/v1/subscriptions/status/<subscription_id>
Description
Creates a new status ping request for the subscription with ID <subscription_id>. If a subscription is present, the status request is successful and a 202 Accepted status code is returned.
Parameter | Type |
---|---|
<subscription_id> | string |
Example API response
{"status":"ping sent"}
api/cloudNotifications/v1/health/
HTTP method
GET api/cloudNotifications/v1/health/
Description
Returns the health status for the cloudNotifications REST API.
Example API response
OK