Persistent Storage
You are viewing documentation for a release that is no longer supported. The latest supported version of version 3 is [3.11]. For the most recent version 4, see [4]
You are viewing documentation for a release that is no longer supported. The latest supported version of version 3 is [3.11]. For the most recent version 4, see [4]
Overview
Managing storage is a distinct problem from managing compute resources. OKD uses the Kubernetes persistent volume (PV) framework to allow cluster administrators to provision persistent storage for a cluster. Developers can use persistent volume claims (PVCs) to request PV resources without having specific knowledge of the underlying storage infrastructure.
PVCs are specific to a project and are created and used by developers as a means to use a PV. PV resources on their own are not scoped to any single project; they can be shared across the entire OKD cluster and claimed from any project. After a PV is bound to a PVC, however, that PV cannot then be bound to additional PVCs. This has the effect of scoping a bound PV to a single namespace (that of the binding project).
PVs are defined by a PersistentVolume
API object, which represents a piece of existing, networked storage in the cluster that was provisioned by the cluster administrator. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plug-ins like Volumes
but have a lifecycle that is independent of any individual pod that uses the PV. PV objects capture the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
High availability of storage in the infrastructure is left to the underlying storage provider. |
PVCs are defined by a PersistentVolumeClaim
API object, which represents a request for storage by a developer. It is similar to a pod in that pods consume node resources and PVCs consume PV resources. For example, pods can request specific levels of resources (e.g., CPU and memory), while PVCs can request specific storage capacity and access modes (e.g, they can be mounted once read/write or many times read-only).
Lifecycle of a volume and claim
PVs are resources in the cluster. PVCs are requests for those resources and also act as claim checks to the resource. The interaction between PVs and PVCs have the following lifecycle.
Provision storage
In response to requests from a developer defined in a PVC, a cluster administrator configures one or more dynamic provisioners that provision storage and a matching PV.
Alternatively, a cluster administrator can create a number of PVs in advance that carry the details of the real storage that is available for use. PVs exist in the API and are available for use.
Bind claims
When you create a PVC, you request a specific amount of storage, specify the required access mode, and create a storage class to describe and classify the storage. The control loop in the master watches for new PVCs and binds the new PVC to an appropriate PV. If an appropriate PV does not exist, a provisioner for the storage class creates one.
The PV volume might exceed your requested volume. This is especially true with manually provisioned PVs. To minimize the excess, OKD binds to the smallest PV that matches all other criteria.
Claims remain unbound indefinitely if a matching volume does not exist or cannot be created with any available provisioner servicing a storage class. Claims are bound as matching volumes become available. For example, a cluster with many manually provisioned 50Gi volumes would not match a PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to the cluster.
Use pods and claimed PVs
Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for a pod. For those volumes that support multiple access modes, you must specify which mode applies when you use the claim as a volume in a pod.
After you have a claim and that claim is bound, the bound PV belongs to you for as long as you need it. You can schedule pods and access claimed PVs by including persistentVolumeClaim
in the pod’s volumes block. See below for syntax details.
PVC protection
PVC protection is enabled by default.
Release volumes
When you are finished with a volume, you can delete the PVC object from the API, which allows reclamation of the resource. The volume is considered “released” when the claim is deleted, but it is not yet available for another claim. The previous claimant’s data remains on the volume and must be handled according to policy.
Reclaim volumes
The reclaim policy of a PersistentVolume
tells the cluster what to do with the volume after it is released. Volumes reclaim policy can either be Retain
, Recycle
, or Delete
.
Retain
reclaim policy allows manual reclamation of the resource for those volume plug-ins that support it. Delete
reclaim policy deletes both the PersistentVolume
object from OKD and the associated storage asset in external infrastructure, such as AWS EBS, GCE PD, or Cinder volume.
Dynamically provisioned volumes have a default |
Recycle volumes
If supported by appropriate volume plug-in, recycling performs a basic scrub (rm -rf /thevolume/*
) on the volume and makes it available again for a claim.
The |
You can configure a custom recycler pod template by using the controller manager command line arguments as described in the ControllerArguments section. The custom recycler pod template must contain a volumes
specification, for example:
Custom recycler pod template example
apiVersion: v1
kind: Pod
metadata:
name: pv-recycler-
namespace: openshift-infra (1)
spec:
restartPolicy: Never
serviceAccount: pv-recycler-controller (2)
volumes:
- name: nfsvol
nfs:
server: any-server-it-will-be-replaced (3)
path: any-path-it-will-be-replaced (3)
containers:
- name: pv-recycler
image: "gcr.io/google_containers/busybox"
command: ["/bin/sh", "-c", "test -e /scrub && rm -rf /scrub/..?* /scrub/.[!.]* /scrub/* && test -z \"$(ls -A /scrub)\" || exit 1"]
volumeMounts:
- name: nfsvol
mountPath: /scrub
1 | Namespace where the recycler pod runs. openshift-infra is the recommended namespace, as it already has a pv-recycler-controller service account that can recycle volumes. |
2 | Name of service account that is allowed to mount NFS volumes. It must exist in the specified namespace. A pv-recycler-controller account is recommended, as it is automatically created in openshift-infra namespace and has all the required permissions. |
3 | The particular server and path values specified in the custom recycler pod template in the volumes part is replaced with the particular corresponding values from the PV that is being recycled. |
Persistent volumes
Each PV contains a spec
and status
, which is the specification and status of the volume, for example:
PV object definition example
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv0003
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Recycle
nfs:
path: /tmp
server: 172.17.0.2
Types of PVs
OKD supports the following PersistentVolume
plug-ins:
HostPath
Capacity
Generally, a PV has a specific storage capacity. This is set by using the PV’s capacity
attribute.
Currently, storage capacity is the only resource that can be set or requested. Future attributes may include IOPS, throughput, and so on.
Access modes
A PersistentVolume
can be mounted on a host in any way supported by the resource provider. Providers will have different capabilities and each PV’s access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read/write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV’s capabilities.
Claims are matched to volumes with similar access modes. The only two matching criteria are access modes and size. A claim’s access modes represent a request. Therefore, you might be granted more, but never less. For example, if a claim requests RWO, but the only volume available is an NFS PV (RWO+ROX+RWX), the claim would then match NFS because it supports RWO.
Direct matches are always attempted first. The volume’s modes must match or contain more modes than you requested. The size must be greater than or equal to what is expected. If two types of volumes (NFS and iSCSI, for example) have the same set of access modes, either of them can match a claim with those modes. There is no ordering between types of volumes and no way to choose one type over another.
All volumes with the same modes are grouped, and then sorted by size (smallest to largest). The binder gets the group with matching modes and iterates over each (in size order) until one size matches.
The following table lists the access modes:
Access Mode | CLI abbreviation | Description |
---|---|---|
ReadWriteOnce |
| The volume can be mounted as read-write by a single node. |
ReadOnlyMany |
| The volume can be mounted read-only by many nodes. |
ReadWriteMany |
| The volume can be mounted as read-write by many nodes. |
A volume’s For example, Ceph offers ReadWriteOnce access mode. You must mark the claims as iSCSI and Fibre Channel volumes do not currently have any fencing mechanisms. You must ensure the volumes are only used by one node at a time. In certain situations, such as draining a node, the volumes can be used simultaneously by two nodes. Before draining the node, first ensure the pods that use these volumes are deleted. |
The following table lists the access modes supported by different PVs:
Volume Plug-in | ReadWriteOnce | ReadOnlyMany | ReadWriteMany |
---|---|---|---|
AWS EBS | ✅ | - | - |
Azure File | ✅ | ✅ | ✅ |
Azure Disk | ✅ | - | - |
Ceph RBD | ✅ | ✅ | - |
Fibre Channel | ✅ | ✅ | - |
GCE Persistent Disk | ✅ | - | - |
GlusterFS | ✅ | ✅ | ✅ |
HostPath | ✅ | - | - |
iSCSI | ✅ | ✅ | - |
NFS | ✅ | ✅ | ✅ |
Openstack Cinder | ✅ | - | - |
VMWare vSphere | ✅ | - | - |
Local | ✅ | - | - |
Use a recreate deployment strategy for pods that rely on AWS EBS, GCE Persistent Disks, or Openstack Cinder PVs. |
Reclaim policy
The following table lists current reclaim policies:
Reclaim policy | Description |
---|---|
Retain | Manual reclamation |
Recycle | Basic scrub (e.g, |
Currently, only NFS and HostPath support the ‘Recycle’ reclaim policy. |
The |
Phase
Volumes can be found in one of the following phases:
Phase | Description |
---|---|
Available | A free resource not yet bound to a claim. |
Bound | The volume is bound to a claim. |
Released | The claim was deleted, but the resource is not yet reclaimed by the cluster. |
Failed | The volume has failed its automatic reclamation. |
The CLI shows the name of the PVC bound to the PV.
Mount options
You can specify mount options while mounting a persistent volume by using the annotation volume.beta.kubernetes.io/mount-options
.
For example:
Mount options example
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv0001
annotations:
volume.beta.kubernetes.io/mount-options: rw,nfsvers=4,noexec (1)
spec:
capacity:
storage: 1Gi
accessModes:
- ReadWriteOnce
nfs:
path: /tmp
server: 172.17.0.2
persistentVolumeReclaimPolicy: Recycle
claimRef:
name: claim1
namespace: default
1 | Specified mount options are used while mounting the PV to the disk. |
The following persistent volume types support mount options:
NFS
GlusterFS
Ceph RBD
OpenStack Cinder
AWS Elastic Block Store (EBS)
GCE Persistent Disk
iSCSI
Azure Disk
Azure File
VMWare vSphere
Fibre Channel and HostPath persistent volumes do not support mount options. |
Persistent volume claims
Each PVC contains a spec
and status
, which is the specification and status of the claim, for example:
PVC object definition example
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: myclaim
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 8Gi
storageClassName: gold
Storage classes
Claims can optionally request a specific storage class by specifying the storage class’s name in the storageClassName
attribute. Only PVs of the requested class, ones with the same storageClassName
as the PVC, can be bound to the PVC. The cluster administrator can configure dynamic provisioners to service one or more storage classes. The cluster administrator can create a PV on demand that matches the specifications in the PVC.
The cluster administrator can also set a default storage class for all PVCs. When a default storage class is configured, the PVC must explicitly ask for StorageClass
or storageClassName
annotations set to ""
to be bound to a PV without a storage class.
Access modes
Claims use the same conventions as volumes when requesting storage with specific access modes.
Resources
Claims, such as pods, can request specific quantities of a resource. In this case, the request is for storage. The same resource model applies to volumes and claims.
Claims as volumes
Pods access storage by using the claim as a volume. Claims must exist in the same namespace as the pod by using the claim. The cluster finds the claim in the pod’s namespace and uses it to get the PersistentVolume
backing the claim. The volume is mounted to the host and into the pod, for example:
Mount volume to the host and into the pod example
kind: Pod
apiVersion: v1
metadata:
name: mypod
spec:
containers:
- name: myfrontend
image: dockerfile/nginx
volumeMounts:
- mountPath: "/var/www/html"
name: mypd
volumes:
- name: mypd
persistentVolumeClaim:
claimName: myclaim
Block volume support
Block volume support is a Technology Preview feature and it is only available for manually provisioned PVs. |
You can statically provision raw block volumes by including API fields in your PV and PVC specifications.
To use block volume, you must first enable the BlockVolume
feature gate. To enable the feature gates for master(s), add feature-gates
to apiServerArguments
and controllerArguments
. To enable the feature gates for node(s), add feature-gates
to kubeletArguments
. For example:
kubeletArguments:
feature-gates:
- BlockVolume=true
PV example
apiVersion: v1
kind: PersistentVolume
metadata:
name: block-pv
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteOnce
volumeMode: Block (1)
persistentVolumeReclaimPolicy: Retain
fc:
targetWWNs: ["50060e801049cfd1"]
lun: 0
readOnly: false
1 | volumeMode field indicating that this PV is a raw block volume. |
PVC example
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: block-pvc
spec:
accessModes:
- ReadWriteOnce
volumeMode: Block (1)
resources:
requests:
storage: 10Gi
1 | volumeMode field indicating that a raw block persistent volume is requested. |
Pod specification example
apiVersion: v1
kind: Pod
metadata:
name: pod-with-block-volume
spec:
containers:
- name: fc-container
image: fedora:26
command: ["/bin/sh", "-c"]
args: [ "tail -f /dev/null" ]
volumeDevices: (1)
- name: data
devicePath: /dev/xvda (2)
volumes:
- name: data
persistentVolumeClaim:
claimName: block-pvc (3)
1 | volumeDevices (similar to volumeMounts ) is used for block devices and can only be used with PersistentVolumeClaim sources. |
2 | devicePath (similar to mountPath ) represents the path to the physical device. |
3 | The volume source must be of type persistentVolumeClaim and must match the name of the PVC as expected. |
Value | Default |
---|---|
Filesystem | Yes |
Block | No |
PV VolumeMode | PVC VolumeMode | Binding Result |
---|---|---|
Filesystem | Filesystem | Bind |
Unspecified | Unspecified | Bind |
Filesystem | Unspecified | Bind |
Unspecified | Filesystem | Bind |
Block | Block | Bind |
Unspecified | Block | No Bind |
Block | Unspecified | No Bind |
Filesystem | Block | No Bind |
Block | Filesystem | No Bind |
Unspecified values result in the default value of Filesystem. |