Using Device Plug-ins
What Device Plug-ins Do
Device plug-ins allow you to use a particular device type (GPU, InfiniBand, or other similar computing resources that require vendor-specific initialization and setup) in your OKD pod without needing to write custom code. The device plug-in provides a consistent and portable solution to consume hardware devices across clusters. The device plug-in provides support for these devices through an extension mechanism, which makes these devices available to containers, provides health checks of these devices, and securely shares them.
OKD supports the device plug-in API, but the device plug-in containers are supported by individual vendors. |
A device plug-in is a gRPC service running on the nodes (external to atomic-openshift-node.service
) that is responsible for managing specific hardware resources. Any device plug-in must support following remote procedure calls (RPCs):
service DevicePlugin {
// GetDevicePluginOptions returns options to be communicated with Device
// Manager
rpc GetDevicePluginOptions(Empty) returns (DevicePluginOptions) {}
// ListAndWatch returns a stream of List of Devices
// Whenever a Device state change or a Device disappears, ListAndWatch
// returns the new list
rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}
// Allocate is called during container creation so that the Device
// Plug-in can run device specific operations and instruct Kubelet
// of the steps to make the Device available in the container
rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
// PreStartContainer is called, if indicated by Device Plug-in during
// registration phase, before each container start. Device plug-in
// can run device specific operations such as reseting the device
// before making devices available to the container
rpc PreStartContainer(PreStartContainerRequest) returns (PreStartContainerResponse) {}
}
Example Device Plug-ins
For easy device plug-in reference implementation, there is a stub device plug-in in the Device Manager code: vendor/k8s.io/kubernetes/pkg/kubelet/cm/deviceplugin/device_plugin_stub.go. |
Methods for Deploying a Device Plug-in
Daemonsets are the recommended approach for device plug-in deployments.
Upon start, the device plug-in will try to create a UNIX domain socket at /var/lib/kubelet/device-plugin/ on the node to serve RPCs from Device Manager.
Since device plug-ins need to manage hardware resources, access to the host file system, as well as socket creation, they must be run in a privileged security context.
More specific details regarding deployment steps can be found with each device plug-in implementation.