Handling a full Ceph file system
When a RADOS cluster reaches its mon_osd_full_ratio (default 95%) capacity, it is marked with the OSD full flag. This flag causes most normal RADOS clients to pause all operations until it is resolved (for example by adding more capacity to the cluster).
The file system has some special handling of the full flag, explained below.
Hammer and later
Since the hammer release, a full file system will lead to ENOSPC results from:
Data writes on the client
Metadata operations other than deletes and truncates
Because the full condition may not be encountered until data is flushed to disk (sometime after a write call has already returned successfully), the ENOSPC error may not be seen until the application calls fsync or fclose (or equivalent) on the file handle.
Calling fsync reliably indicates whether the data made it to disk, and returns an error if it did not. fclose will only return an error if buffered data happened to be flushed since the last write. A successful fclose does not guarantee that the data made it to disk, and in a full-space situation, buffered data may be discarded after an fclose if no space is available to persist it.
Warning
If an application appears to be misbehaving on a full file system, check that it is performing fsync() calls as necessary to ensure data is on disk before proceeding.
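The fsync advice above can be sketched in Python as follows. This is a minimal illustration, not code from Ceph: the helper name write_durably is hypothetical, and it assumes the file lives on a mounted file system where a full cluster surfaces as an OSError with errno ENOSPC.

```python
import errno
import os


def write_durably(path, data):
    """Write bytes to path; return True only once fsync has confirmed them.

    Returns False if the file system reports ENOSPC at any stage.
    Any other OSError propagates to the caller.
    (Hypothetical helper for illustration only.)
    """
    try:
        with open(path, "wb") as f:
            f.write(data)         # may appear to succeed: data is only buffered
            f.flush()             # hand Python's buffer to the file system client
            os.fsync(f.fileno())  # a deferred ENOSPC is reported here
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return False          # the data did not reach disk
        raise
    return True
```

A caller would treat a False result as "the data is not on disk" and avoid proceeding as if the write had succeeded, rather than relying on the write or close call alone.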
Data writes may be cancelled by the client if they are in flight at the time the OSD full flag is sent. Clients update the osd_epoch_barrier when releasing capabilities on files affected by cancelled operations, in order to ensure that these cancelled operations do not interfere with subsequent access to the data objects by the MDS or other clients. For more on the epoch barrier mechanism, see Background: Blacklisting and OSD epoch barrier.
Legacy (pre-hammer) behavior
In versions of Ceph earlier than hammer, the MDS would ignore the full status of the RADOS cluster, and any data writes from clients would stall until the cluster ceased to be full.
There are two dangerous conditions to watch for with this behaviour:
If a client had pending writes to a file, then it was not possible for the client to release the file to the MDS for deletion: this could lead to difficulty clearing space on a full file system.
If clients continued to create a large number of empty files, the resulting metadata writes from the MDS could lead to total exhaustion of space on the OSDs such that no further deletions could be performed.