Using Hadoop with CephFS
The Ceph file system can be used as a drop-in replacement for the Hadoop File System (HDFS). This page describes the installation and configuration process of using Ceph with Hadoop.
Dependencies
- CephFS Java Interface
- Hadoop CephFS Plugin
Important
Currently requires Hadoop 1.1.X stable series
Installation
There are three requirements for using CephFS with Hadoop. First, a running Ceph installation is required. The details of setting up a Ceph cluster and the file system are beyond the scope of this document. Please refer to the Ceph documentation for installing Ceph.
The remaining two requirements are a Hadoop installation and the Ceph file system Java packages, including the Java CephFS Hadoop plugin. The high-level steps are to add the dependencies to the Hadoop installation CLASSPATH, and to configure Hadoop to use the Ceph file system.
CephFS Java Packages
- CephFS Hadoop plugin (hadoop-cephfs.jar)
- CephFS Java library (libcephfs.jar)
Adding these dependencies to a Hadoop installation will depend on your particular deployment. In general the dependencies must be present on each node in the system that will be part of the Hadoop cluster, and must be in the CLASSPATH searched by Hadoop. Typical approaches are to place the additional jar files into the hadoop/lib directory, or to edit the HADOOP_CLASSPATH variable in hadoop-env.sh, as in the sketch below.
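A minimal hadoop-env.sh entry might look like the following; the jar locations are illustrative and depend on where the packages were installed on each node.

```
# conf/hadoop-env.sh -- illustrative paths; adjust to where the jars
# are installed on each node.
export HADOOP_CLASSPATH=/usr/share/java/libcephfs.jar:/usr/share/java/hadoop-cephfs.jar:$HADOOP_CLASSPATH
```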
The native Ceph file system client must be installed on each participating node in the Hadoop cluster.
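As an illustration, on a Debian-based system this might be done with the distribution packages shown below; package names vary by distribution and Ceph release, so treat these as placeholders.

```
# Illustrative only: package names vary by distribution and Ceph release.
sudo apt-get install libcephfs1 libcephfs-java libcephfs-jni
```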
Hadoop Configuration
This section describes the Hadoop configuration options used to control Ceph. These options are intended to be set in the Hadoop configuration file conf/core-site.xml.
Property | Value | Notes |
---|---|---|
fs.default.name | Ceph URI | ceph://[monaddr:port]/ |
ceph.conf.file | Local path to ceph.conf | /etc/ceph/ceph.conf |
ceph.conf.options | Comma-separated list of Ceph configuration key/value pairs | opt1=val1,opt2=val2 |
ceph.root.dir | Mount root directory | Default value: / |
ceph.mon.address | Monitor address | host:port |
ceph.auth.id | Ceph user id | Example: admin |
ceph.auth.keyfile | Ceph key file | |
ceph.auth.keyring | Ceph keyring file | |
ceph.object.size | Default file object size in bytes | Default value (64MB): 67108864 |
ceph.data.pools | List of Ceph data pools for storing files | Default value: default Ceph pool |
ceph.localize.reads | Allow reading from file replica objects | Default value: true |
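As an example, a minimal conf/core-site.xml using these options might look like the following. The monitor address, user id, and keyring path are placeholders for this sketch; substitute the values for your cluster.

```
<!-- conf/core-site.xml: the monitor address, user id, and keyring path
     below are placeholders. -->
<property>
  <name>fs.default.name</name>
  <value>ceph://192.168.0.1:6789/</value>
</property>
<property>
  <name>ceph.conf.file</name>
  <value>/etc/ceph/ceph.conf</value>
</property>
<property>
  <name>ceph.auth.id</name>
  <value>admin</value>
</property>
<property>
  <name>ceph.auth.keyring</name>
  <value>/etc/ceph/ceph.client.admin.keyring</value>
</property>
```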
Support For Per-file Custom Replication
The Hadoop file system interface allows users to specify a custom replication factor (e.g. 3 copies of each block) when creating a file. However, object replication factors in the Ceph file system are controlled on a per-pool basis, and by default a Ceph file system will contain only a single pre-configured pool. Thus, in order to support per-file replication with Hadoop over Ceph, additional storage pools with non-default replication factors must be created, and Hadoop must be configured to choose from these additional pools.
Additional data pools can be specified using the ceph.data.pools configuration option. The value of the option is a comma-separated list of pool names. The default Ceph pool will be used automatically if this configuration option is omitted or the value is empty. For example, the following configuration setting will consider the pools pool1, pool2, and pool5 when selecting a target pool to store a file.
```
<property>
  <name>ceph.data.pools</name>
  <value>pool1,pool2,pool5</value>
</property>
```
Hadoop will not create pools automatically. In order to create a new pool with a specific replication factor, use the ceph osd pool create command, and then set the size property on the pool using the ceph osd pool set command. For more information on creating and configuring pools, see the RADOS Pool documentation.
Once a pool has been created and configured, the metadata service must be told that the new pool may be used to store file data. A pool is made available for storing file system data using the ceph fs add_data_pool command.
First, create the pool. In this example we create the hadoop1 pool with replication factor 1.
```
ceph osd pool create hadoop1
ceph osd pool set hadoop1 size 1
```
Next, determine the pool id. This can be done by examining the output of the ceph osd dump command. For example, we can look for the newly created hadoop1 pool.
```
ceph osd dump | grep hadoop1
```
The output should resemble:
```
pool 3 'hadoop1' rep size 1 min_size 1 crush_rule 0...
```
Where 3 is the pool id. Next we will use the pool id reference to register the pool as a data pool for storing file system data.
```
ceph fs add_data_pool cephfs 3
```
The final step is to configure Hadoop to consider this data pool when selecting the target pool for new files.
```
<property>
  <name>ceph.data.pools</name>
  <value>hadoop1</value>
</property>
```
Pool Selection Rules
The following rules describe how Hadoop chooses a pool given a desired replication factor and the set of pools specified using the ceph.data.pools configuration option; a sketch of this selection logic follows the list.
- When no custom pools are specified, the default Ceph data pool is used.
- A custom pool with the same replication factor as the default Ceph data pool will override the default.
- A pool with a replication factor that matches the desired replication will be chosen if it exists.
- Otherwise, a pool with at least the desired replication factor will be chosen, or the maximum possible.
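The following is a minimal sketch of how such a selection procedure could look. It is not the plugin's actual implementation; the class, method, and parameter names are invented for this example (selectDataPool simply mirrors the log entry shown in the next section).

```
// Illustrative sketch of the pool selection rules above.
// NOT the actual Hadoop CephFS plugin code; names and types are invented.
import java.util.List;
import java.util.Map;

public class PoolSelectionSketch {

    /**
     * Choose a target pool for a new file.
     *
     * @param wanted          desired replication factor requested by Hadoop
     * @param configuredPools pools listed in ceph.data.pools (may be empty)
     * @param replication     replication factor of each known data pool
     * @param defaultPool     the default Ceph data pool
     */
    static String selectDataPool(int wanted,
                                 List<String> configuredPools,
                                 Map<String, Integer> replication,
                                 String defaultPool) {
        // Rule 1: no custom pools configured -> use the default Ceph data pool.
        if (configuredPools == null || configuredPools.isEmpty()) {
            return defaultPool;
        }
        // Rule 2 falls out naturally: when custom pools are configured,
        // they are preferred over the default pool.

        String exact = null;    // pool whose replication matches exactly (rule 3)
        String atLeast = null;  // smallest pool with replication >= wanted (rule 4)
        String maximum = null;  // pool with the largest replication (rule 4 fallback)

        for (String pool : configuredPools) {
            Integer repl = replication.get(pool);
            if (repl == null) {
                // Replication unknown, e.g. the pool is not registered as a
                // data pool; see "Debugging Pool Selection" below.
                continue;
            }
            if (exact == null && repl == wanted) {
                exact = pool;
            }
            if (repl >= wanted
                    && (atLeast == null || repl < replication.get(atLeast))) {
                atLeast = pool;
            }
            if (maximum == null || repl > replication.get(maximum)) {
                maximum = pool;
            }
        }

        if (exact != null) {
            return exact;
        }
        if (atLeast != null) {
            return atLeast;
        }
        return (maximum != null) ? maximum : defaultPool;
    }
}
```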
Debugging Pool Selection
Hadoop will produce a log file entry when it cannot determine the replication factor of a pool (e.g. it is not configured as a data pool). The log message will appear as follows:
```
Error looking up replication of pool: <pool name>
```
Hadoop will also produce a log entry when it wasn’t able to select an exact match for replication. This log entry will appear as follows:
```
selectDataPool path=<path> pool:repl=<name>:<value> wanted=<value>
```