Introduction to librados
The Ceph Storage Cluster provides the basic storage service that allows Ceph to uniquely deliver object, block, and file storage in one unified system. However, you are not limited to using the RESTful, block, or POSIX interfaces. Based upon RADOS, the librados API enables you to create your own interface to the Ceph Storage Cluster.
The librados API enables you to interact with the two types of daemons in the Ceph Storage Cluster:
- The Ceph Monitor, which maintains a master copy of the cluster map.
- The Ceph OSD Daemon (OSD), which stores data as objects on a storage node.
This guide provides a high-level introduction to using librados. Refer to Architecture for additional details of the Ceph Storage Cluster. To use the API, you need a running Ceph Storage Cluster. See Installation (Quick) for details.
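Step 1: Getting librados
Your client app must bind with librados to connect to the Ceph Storage Cluster. Before you write applications that use librados, install librados and any packages it depends on. The librados API itself is written in C++, with additional bindings for C, Python, Java and PHP.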
Getting librados for C/C++
To install librados development support files for C/C++ on Debian/Ubuntu distributions, execute the following:
- sudo apt-get install librados-dev
To install librados development support files for C/C++ on RHEL/CentOS distributions, execute the following:
- sudo yum install librados2-devel
Once you install librados for developers, you can find the required headers for C/C++ under /usr/include/rados.
- ls /usr/include/rados
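Getting librados for Python
The rados.py module provides librados support to Python applications. The librados-dev package for Debian/Ubuntu and the librados2-devel package for RHEL/CentOS install the python-rados package for you. You may also install python-rados directly.
To install librados development support files for Python on Debian/Ubuntu distributions, execute the following:
- sudo apt-get install python-rados
To install librados development support files for Python on RHEL/CentOS distributions, execute the following:
- sudo yum install python-rados
The module is installed under /usr/share/pyshared on Debian-style systems, and under /usr/lib/python*/site-packages on CentOS/RHEL systems.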
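As a quick way to confirm that the binding is importable, the following sketch asks the interpreter where a module would be loaded from. It is only an illustration: `find_module_path` is a hypothetical helper name, and it uses nothing beyond the standard library's `importlib.util.find_spec`. The module name `rados` comes from the text above.

```python
import importlib.util


def find_module_path(name):
    """Return the file a top-level module would load from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    path = find_module_path("rados")
    if path is None:
        print("rados module not found; install the python-rados package")
    else:
        print("rados module found at", path)
```

The reported path should fall under one of the install locations listed above, although the exact directory depends on the distribution and the Python version.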
Getting librados for Java
To install librados for Java, execute the following procedure:
- Install jna.jar. For Debian/Ubuntu, execute:
- sudo apt-get install libjna-java
For CentOS/RHEL, execute:
- sudo yum install jna
The JAR files are located in /usr/share/java.
- Clone the rados-java repository:
- git clone --recursive https://github.com/ceph/rados-java.git
- Build the rados-java repository:
- cd rados-java
- ant
The JAR file is located under rados-java/target.
- Copy the JAR for RADOS to a common location (e.g., /usr/share/java) and ensure that it and the JNA JAR are both in your JVM's classpath. For example:
- sudo cp target/rados-0.1.3.jar /usr/share/java/rados-0.1.3.jar
- sudo ln -s /usr/share/java/jna-3.2.7.jar /usr/lib/jvm/default-java/jre/lib/ext/jna-3.2.7.jar
- sudo ln -s /usr/share/java/rados-0.1.3.jar /usr/lib/jvm/default-java/jre/lib/ext/rados-0.1.3.jar
To build the documentation, execute the following:
- ant docs
Getting librados for PHP
To install the librados extension for PHP, execute the following procedure:
- Install php-dev. For Debian/Ubuntu, execute:
- sudo apt-get install php5-dev build-essential
For CentOS/RHEL, execute:
- sudo yum install php-devel
- Clone the phprados repository:
- git clone https://github.com/ceph/phprados.git
- Build phprados:
- cd phprados
- phpize
- ./configure
- make
- sudo make install
- Enable phprados by adding the following line to php.ini:
- extension=rados.so
Step 2: Configuring a Cluster Handle
A Ceph Client, via librados, interacts directly with OSDs to store and retrieve data. To interact with OSDs, the client app must invoke librados and connect to a Ceph Monitor. Once connected, librados retrieves the Cluster Map from the Ceph Monitor. When the client app wants to read or write data, it creates an I/O context and binds to a pool. The pool has an associated ruleset that defines how it will place data in the storage cluster. Via the I/O context, the client provides the object name to librados, which takes the object name and the cluster map (i.e., the topology of the cluster) and computes the placement group and OSD for locating the data. Then the client application can read or write data. The client app doesn’t need to learn about the topology of the cluster directly.
The Ceph Storage Cluster handle encapsulates the client configuration, including:
- The user ID for rados_create() or user name for rados_create2() (preferred).
- The cephx authentication key
- The monitor ID and IP address
- Logging levels
- Debugging levels
Thus, the first steps in using the cluster from your app are to 1) create a cluster handle that your app will use to connect to the storage cluster, and then 2) use that handle to connect. To connect to the cluster, the app must supply a monitor address, a username and an authentication key (cephx is enabled by default).
Tip
Talking to different Ceph Storage Clusters, or to the same cluster with different users, requires different cluster handles.
RADOS provides a number of ways for you to set the required values. For the monitor and encryption key settings, an easy way to handle them is to ensure that your Ceph configuration file contains a keyring path to a keyring file and at least one monitor address (e.g., mon host). For example:
- [global]
- mon host = 192.168.1.1
- keyring = /etc/ceph/ceph.client.admin.keyring
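Before handing a configuration file to librados, it can help to confirm that the two settings mentioned above are actually present. The sketch below does this with Python's standard `configparser`, which only approximates Ceph's own parser. The helper name `missing_settings` is hypothetical, the check looks only at the `[global]` section as in the example above, and treating spaces, dashes and underscores in option names as equivalent is an assumption made for this sketch.

```python
import configparser

REQUIRED = ("mon_host", "keyring")


def missing_settings(conf_text):
    """Return the required [global] options that are absent from conf_text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        return list(REQUIRED)
    # Assumption: "mon host", "mon-host" and "mon_host" name the same option.
    present = {key.replace(" ", "_").replace("-", "_") for key in parser["global"]}
    return [name for name in REQUIRED if name not in present]


SAMPLE = """
[global]
mon host = 192.168.1.1
keyring = /etc/ceph/ceph.client.admin.keyring
"""

if __name__ == "__main__":
    # An empty list means both settings were found.
    print(missing_settings(SAMPLE))
```

A keyring configured in a client-specific section would be reported as missing by this check, so the result is a hint rather than a verdict.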
Once you create the handle, you can read a Ceph configuration file to configure the handle. You can also pass arguments to your app and parse them with the function for parsing command line arguments (e.g., rados_conf_parse_argv()), or parse Ceph environment variables (e.g., rados_conf_parse_env()). Some wrappers may not implement convenience methods, so you may need to implement these capabilities. The following diagram provides a high-level flow for the initial connection.
Once connected, your app can invoke functions that affect the whole cluster with only the cluster handle. For example, once you have a cluster handle, you can:
- Get cluster statistics
- Perform pool operations (exists, create, list, delete)
- Get and set the configuration
One of the powerful features of Ceph is the ability to bind to different pools. Each pool may have a different number of placement groups, object replicas and replication strategies. For example, a pool could be set up as a “hot” pool that uses SSDs for frequently used objects or a “cold” pool that uses erasure coding.
The main difference in the various librados bindings is between C and the object-oriented bindings for C++, Java and Python. The object-oriented bindings use objects to represent cluster handles, IO Contexts, iterators, exceptions, etc.
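One practical consequence of that difference is error handling: the C examples in this guide test for a negative return value and pass its negation to strerror(), while the object-oriented bindings raise exceptions. The sketch below shows how a wrapper might bridge the two styles. It is not how any particular binding is implemented, and `check` is a hypothetical helper name.

```python
import errno
import os


def check(ret, action):
    """Raise OSError for a negative errno-style return code; otherwise return the value."""
    if ret < 0:
        code = -ret
        raise OSError(code, f"{action}: {os.strerror(code)}")
    return ret


if __name__ == "__main__":
    check(0, "connect")  # success: nothing is raised
    try:
        check(-errno.ENOENT, "read config file")
    except OSError as exc:
        print("caught:", exc)
```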
C Example
For C, creating a simple cluster handle using the admin user, configuring it and connecting to the cluster might look something like this:
- #include <stdio.h>
- #include <stdlib.h>
- #include <string.h>
- #include <rados/librados.h>
- int main (int argc, const char **argv)
- {
- /* Declare the cluster handle and required arguments. */
- rados_t cluster;
- char cluster_name[] = "ceph";
- char user_name[] = "client.admin";
- uint64_t flags = 0;
- /* Initialize the cluster handle with the "ceph" cluster name and the "client.admin" user */
- int err;
- err = rados_create2(&cluster, cluster_name, user_name, flags);
- if (err < 0) {
- fprintf(stderr, "%s: Couldn't create the cluster handle! %s\n", argv[0], strerror(-err));
- exit(EXIT_FAILURE);
- } else {
- printf("\nCreated a cluster handle.\n");
- }
- /* Read a Ceph configuration file to configure the cluster handle. */
- err = rados_conf_read_file(cluster, "/etc/ceph/ceph.conf");
- if (err < 0) {
- fprintf(stderr, "%s: cannot read config file: %s\n", argv[0], strerror(-err));
- exit(EXIT_FAILURE);
- } else {
- printf("\nRead the config file.\n");
- }
- /* Read command line arguments */
- err = rados_conf_parse_argv(cluster, argc, argv);
- if (err < 0) {
- fprintf(stderr, "%s: cannot parse command line arguments: %s\n", argv[0], strerror(-err));
- exit(EXIT_FAILURE);
- } else {
- printf("\nRead the command line arguments.\n");
- }
- /* Connect to the cluster */
- err = rados_connect(cluster);
- if (err < 0) {
- fprintf(stderr, "%s: cannot connect to cluster: %s\n", argv[0], strerror(-err));
- exit(EXIT_FAILURE);
- } else {
- printf("\nConnected to the cluster.\n");
- }
- }
Compile your client and link to librados using -lrados. For example:
- gcc ceph-client.c -lrados -o ceph-client
C++ Example
The Ceph project provides a C++ example in the ceph/examples/librados directory. For C++, a simple cluster handle using the admin user requires you to initialize a librados::Rados cluster handle object:
- #include <iostream>
- #include <string>
- #include <rados/librados.hpp>
- int main(int argc, const char **argv)
- {
- int ret = 0;
- /* Declare the cluster handle and required variables. */
- librados::Rados cluster;
- char cluster_name[] = "ceph";
- char user_name[] = "client.admin";
- uint64_t flags = 0;
- /* Initialize the cluster handle with the "ceph" cluster name and "client.admin" user */
- {
- ret = cluster.init2(user_name, cluster_name, flags);
- if (ret < 0) {
- std::cerr << "Couldn't initialize the cluster handle! error " << ret << std::endl;
- ret = EXIT_FAILURE;
- return 1;
- } else {
- std::cout << "Created a cluster handle." << std::endl;
- }
- }
- /* Read a Ceph configuration file to configure the cluster handle. */
- {
- ret = cluster.conf_read_file("/etc/ceph/ceph.conf");
- if (ret < 0) {
- std::cerr << "Couldn't read the Ceph configuration file! error " << ret << std::endl;
- ret = EXIT_FAILURE;
- return 1;
- } else {
- std::cout << "Read the Ceph configuration file." << std::endl;
- }
- }
- /* Read command line arguments */
- {
- ret = cluster.conf_parse_argv(argc, argv);
- if (ret < 0) {
- std::cerr << "Couldn't parse command line options! error " << ret << std::endl;
- ret = EXIT_FAILURE;
- return 1;
- } else {
- std::cout << "Parsed command line options." << std::endl;
- }
- }
- /* Connect to the cluster */
- {
- ret = cluster.connect();
- if (ret < 0) {
- std::cerr << "Couldn't connect to cluster! error " << ret << std::endl;
- ret = EXIT_FAILURE;
- return 1;
- } else {
- std::cout << "Connected to the cluster." << std::endl;
- }
- }
- return 0;
- }
Compile the source; then, link librados using -lrados. For example:
- g++ -g -c ceph-client.cc -o ceph-client.o
- g++ -g ceph-client.o -lrados -o ceph-client
Python Example
Python uses the admin id and the ceph cluster name by default, and will read the standard ceph.conf file if the conffile parameter is set to the empty string. The Python binding converts C++ errors into exceptions.
- import rados
- try:
-     cluster = rados.Rados(conffile='')
- except TypeError as e:
-     print('Argument validation error:', e)
-     raise
- print("Created cluster handle.")
- try:
-     cluster.connect()
- except Exception as e:
-     print("Connection error:", e)
-     raise
- else:
-     # Runs only when connect() succeeded.
-     print("Connected to the cluster.")
Execute the example to verify that it connects to your cluster.
- python ceph-client.py
Java Example
Java requires you to specify the user ID (admin) or user name (client.admin), and uses the ceph cluster name by default. The Java binding converts C++-based errors into exceptions.
- import com.ceph.rados.Rados;
- import com.ceph.rados.RadosException;
- import java.io.File;
- public class CephClient {
- public static void main (String args[]){
- try {
- Rados cluster = new Rados("admin");
- System.out.println("Created cluster handle.");
- File f = new File("/etc/ceph/ceph.conf");
- cluster.confReadFile(f);
- System.out.println("Read the configuration file.");
- cluster.connect();
- System.out.println("Connected to the cluster.");
- } catch (RadosException e) {
- System.out.println(e.getMessage() + ": " + e.getReturnValue());
- }
- }
- }
Compile the source; then, run it. If you have copied the JAR to /usr/share/java and symlinked it from your ext directory, you won’t need to specify the classpath. For example:
- javac CephClient.java
- java CephClient
PHP Example
With the RADOS extension enabled in PHP, creating a new cluster handle is straightforward:
- <?php
- $r = rados_create();
- rados_conf_read_file($r, '/etc/ceph/ceph.conf');
- if (!rados_connect($r)) {
- echo "Failed to connect to Ceph cluster";
- } else {
- echo "Successfully connected to Ceph cluster";
- }
Save this as rados.php and run the code:
- php rados.php
Step 3: Creating an I/O Context
Once your app has a cluster handle and a connection to a Ceph Storage Cluster, you may create an I/O Context and begin reading and writing data. An I/O Context binds the connection to a specific pool. The user must have appropriate CAPS permissions to access the specified pool. For example, a user with read access but not write access will only be able to read data. I/O Context functionality includes:
- Write/read data and extended attributes
- List and iterate over objects and extended attributes
- Snapshot pools, list snapshots, etc.
RADOS enables you to interact both synchronously and asynchronously. Once your app has an I/O Context, read/write operations only require you to know the object/xattr name. The CRUSH algorithm encapsulated in librados uses the cluster map to identify the appropriate OSD. OSD daemons handle the replication, as described in Smart Daemons Enable Hyperscale. The librados library also maps objects to placement groups, as described in Calculating PG IDs.
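The key idea in that mapping is that a client can compute an object's placement group from the object name alone, without consulting a lookup table. The toy sketch below illustrates only that idea. It is not Ceph's algorithm: `zlib.crc32` and the plain modulo are stand-ins chosen for this illustration, the CRUSH step that maps a placement group to OSDs is omitted, and `toy_pg_id` is a hypothetical name.

```python
import zlib


def toy_pg_id(pool_id, object_name, pg_num):
    """Map an object name to a 'pool.pg' label using a stand-in hash (not Ceph's real one)."""
    if pg_num <= 0:
        raise ValueError("pg_num must be positive")
    # Stand-in for the hash Ceph applies to the object name.
    name_hash = zlib.crc32(object_name.encode("utf-8"))
    # Simplification: reduce the hash to a placement group with a plain modulo.
    pg = name_hash % pg_num
    # Label the result as <pool id>.<placement group in hex>.
    return f"{pool_id}.{pg:x}"


if __name__ == "__main__":
    for name in ("hw", "bm"):
        print(name, "->", toy_pg_id(4, name, 128))
```

Because the result depends only on the name, the pool and the placement group count, every client that holds the same cluster map arrives at the same answer independently.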
The following examples use the default data pool. However, you may also use the API to list pools, ensure they exist, or create and delete pools. For the write operations, the examples illustrate how to use synchronous mode. For the read operations, the examples illustrate how to use asynchronous mode.
Important
Use caution when deleting pools with this API. If you delete a pool, the pool and ALL DATA in the pool will be lost.
C Example
- #include <stdio.h>
- #include <stdlib.h>
- #include <string.h>
- #include <rados/librados.h>
- int main (int argc, const char **argv)
- {
- /*
- * Continued from previous C example, where cluster handle and
- * connection are established. First declare an I/O Context.
- */
- rados_ioctx_t io;
- char *poolname = "data";
- err = rados_ioctx_create(cluster, poolname, &io);
- if (err < 0) {
- fprintf(stderr, "%s: cannot open rados pool %s: %s\n", argv[0], poolname, strerror(-err));
- rados_shutdown(cluster);
- exit(EXIT_FAILURE);
- } else {
- printf("\nCreated I/O context.\n");
- }
- /* Write data to the cluster synchronously. */
- err = rados_write(io, "hw", "Hello World!", 12, 0);
- if (err < 0) {
- fprintf(stderr, "%s: Cannot write object \"hw\" to pool %s: %s\n", argv[0], poolname, strerror(-err));
- rados_ioctx_destroy(io);
- rados_shutdown(cluster);
- exit(1);
- } else {
- printf("\nWrote \"Hello World\" to object \"hw\".\n");
- }
- char xattr[] = "en_US";
- err = rados_setxattr(io, "hw", "lang", xattr, 5);
- if (err < 0) {
- fprintf(stderr, "%s: Cannot write xattr to pool %s: %s\n", argv[0], poolname, strerror(-err));
- rados_ioctx_destroy(io);
- rados_shutdown(cluster);
- exit(1);
- } else {
- printf("\nWrote \"en_US\" to xattr \"lang\" for object \"hw\".\n");
- }
- /*
- * Read data from the cluster asynchronously.
- * First, set up asynchronous I/O completion.
- */
- rados_completion_t comp;
- err = rados_aio_create_completion(NULL, NULL, NULL, &comp);
- if (err < 0) {
- fprintf(stderr, "%s: Could not create aio completion: %s\n", argv[0], strerror(-err));
- rados_ioctx_destroy(io);
- rados_shutdown(cluster);
- exit(1);
- } else {
- printf("\nCreated AIO completion.\n");
- }
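- /* Next, start the read with rados_aio_read. The buffer is zero-filled so the result is NUL-terminated. */
- char read_res[100] = {0};
- err = rados_aio_read(io, "hw", comp, read_res, 12, 0);
- if (err < 0) {
- fprintf(stderr, "%s: Cannot read object. %s %s\n", argv[0], poolname, strerror(-err));
- rados_aio_release(comp);
- rados_ioctx_destroy(io);
- rados_shutdown(cluster);
- exit(1);
- }
- /* Wait for the operation to complete; read_res is not valid before this returns. */
- rados_aio_wait_for_complete(comp);
- err = rados_aio_get_return_value(comp);
- /* Release the asynchronous I/O completion handle to avoid memory leaks. */
- rados_aio_release(comp);
- if (err < 0) {
- fprintf(stderr, "%s: Cannot read object. %s %s\n", argv[0], poolname, strerror(-err));
- rados_ioctx_destroy(io);
- rados_shutdown(cluster);
- exit(1);
- } else {
- printf("\nRead object \"hw\". The contents are:\n %s \n", read_res);
- }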
- char xattr_res[100] = {0};
- err = rados_getxattr(io, "hw", "lang", xattr_res, 5);
- if (err < 0) {
- fprintf(stderr, "%s: Cannot read xattr. %s %s\n", argv[0], poolname, strerror(-err));
- rados_ioctx_destroy(io);
- rados_shutdown(cluster);
- exit(1);
- } else {
- printf("\nRead xattr \"lang\" for object \"hw\". The contents are:\n %s \n", xattr_res);
- }
- err = rados_rmxattr(io, "hw", "lang");
- if (err < 0) {
- fprintf(stderr, "%s: Cannot remove xattr. %s %s\n", argv[0], poolname, strerror(-err));
- rados_ioctx_destroy(io);
- rados_shutdown(cluster);
- exit(1);
- } else {
- printf("\nRemoved xattr \"lang\" for object \"hw\".\n");
- }
- err = rados_remove(io, "hw");
- if (err < 0) {
- fprintf(stderr, "%s: Cannot remove object. %s %s\n", argv[0], poolname, strerror(-err));
- rados_ioctx_destroy(io);
- rados_shutdown(cluster);
- exit(1);
- } else {
- printf("\nRemoved object \"hw\".\n");
- }
- }
C++ Example
- #include <iostream>
- #include <string>
- #include <rados/librados.hpp>
- int main(int argc, const char **argv)
- {
- /* Continued from previous C++ example, where cluster handle and
- * connection are established. First declare an I/O Context.
- */
- librados::IoCtx io_ctx;
- const char *pool_name = "data";
- {
- ret = cluster.ioctx_create(pool_name, io_ctx);
- if (ret < 0) {
- std::cerr << "Couldn't set up ioctx! error " << ret << std::endl;
- exit(EXIT_FAILURE);
- } else {
- std::cout << "Created an ioctx for the pool." << std::endl;
- }
- }
- /* Write an object synchronously. */
- {
- librados::bufferlist bl;
- bl.append("Hello World!");
- ret = io_ctx.write_full("hw", bl);
- if (ret < 0) {
- std::cerr << "Couldn't write object! error " << ret << std::endl;
- exit(EXIT_FAILURE);
- } else {
- std::cout << "Wrote new object 'hw' " << std::endl;
- }
- }
- /*
- * Add an xattr to the object.
- */
- {
- librados::bufferlist lang_bl;
- lang_bl.append("en_US");
- ret = io_ctx.setxattr("hw", "lang", lang_bl);
- if (ret < 0) {
- std::cerr << "failed to set xattr version entry! error "
- << ret << std::endl;
- exit(EXIT_FAILURE);
- } else {
- std::cout << "Set the xattr 'lang' on our object!" << std::endl;
- }
- }
- /*
- * Read the object back asynchronously.
- */
- {
- librados::bufferlist read_buf;
- int read_len = 4194304;
- //Create I/O Completion.
- librados::AioCompletion *read_completion = librados::Rados::aio_create_completion();
- //Send read request.
- ret = io_ctx.aio_read("hw", read_completion, &read_buf, read_len, 0);
- if (ret < 0) {
- std::cerr << "Couldn't start read object! error " << ret << std::endl;
- exit(EXIT_FAILURE);
- }
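- // Wait for the request to complete, and check that it succeeded.
- read_completion->wait_for_complete();
- ret = read_completion->get_return_value();
- // Release the completion now that its result has been collected.
- read_completion->release();
- if (ret < 0) {
- std::cerr << "Couldn't read object! error " << ret << std::endl;
- exit(EXIT_FAILURE);
- } else {
- // The buffer is not guaranteed to be NUL-terminated, so build the string from its length.
- std::cout << "Read object hw asynchronously with contents.\n"
- << std::string(read_buf.c_str(), read_buf.length()) << std::endl;
- }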
- }
- /*
- * Read the xattr.
- */
- {
- librados::bufferlist lang_res;
- ret = io_ctx.getxattr("hw", "lang", lang_res);
- if (ret < 0) {
- std::cerr << "failed to get xattr version entry! error "
- << ret << std::endl;
- exit(EXIT_FAILURE);
- } else {
- std::cout << "Got the xattr 'lang' from object hw! "
- << std::string(lang_res.c_str(), lang_res.length()) << std::endl;
- }
- }
- /*
- * Remove the xattr.
- */
- {
- ret = io_ctx.rmxattr("hw", "lang");
- if (ret < 0) {
- std::cerr << "Failed to remove xattr! error "
- << ret << std::endl;
- exit(EXIT_FAILURE);
- } else {
- std::cout << "Removed the xattr 'lang' from our object!" << std::endl;
- }
- }
- /*
- * Remove the object.
- */
- {
- ret = io_ctx.remove("hw");
- if (ret < 0) {
- std::cerr << "Couldn't remove object! error " << ret << std::endl;
- exit(EXIT_FAILURE);
- } else {
- std::cout << "Removed object 'hw'." << std::endl;
- }
- }
- }
Python Example
- # Continued from the previous Python example, where `cluster` is connected.
- print("\n\nI/O Context and Object Operations")
- print("=================================")
- print("\nCreating a context for the 'data' pool")
- if not cluster.pool_exists('data'):
-     raise RuntimeError('No data pool exists')
- ioctx = cluster.open_ioctx('data')
- # Under Python 3 the binding expects object data and xattr values as bytes.
- print("\nWriting object 'hw' with contents 'Hello World!' to pool 'data'.")
- ioctx.write("hw", b"Hello World!")
- print("Writing XATTR 'lang' with value 'en_US' to object 'hw'")
- ioctx.set_xattr("hw", "lang", b"en_US")
- print("\nWriting object 'bm' with contents 'Bonjour tout le monde!' to pool 'data'.")
- ioctx.write("bm", b"Bonjour tout le monde!")
- print("Writing XATTR 'lang' with value 'fr_FR' to object 'bm'")
- ioctx.set_xattr("bm", "lang", b"fr_FR")
- print("\nContents of object 'hw'\n------------------------")
- print(ioctx.read("hw").decode())
- print("\n\nGetting XATTR 'lang' from object 'hw'")
- print(ioctx.get_xattr("hw", "lang").decode())
- print("\nContents of object 'bm'\n------------------------")
- print(ioctx.read("bm").decode())
- print("Getting XATTR 'lang' from object 'bm'")
- print(ioctx.get_xattr("bm", "lang").decode())
- print("\nRemoving object 'hw'")
- ioctx.remove_object("hw")
- print("Removing object 'bm'")
- ioctx.remove_object("bm")
Java Example
- import com.ceph.rados.Rados;
- import com.ceph.rados.RadosException;
- import java.io.File;
- import com.ceph.rados.IoCTX;
- public class CephClient {
- public static void main (String args[]){
- try {
- Rados cluster = new Rados("admin");
- System.out.println("Created cluster handle.");
- File f = new File("/etc/ceph/ceph.conf");
- cluster.confReadFile(f);
- System.out.println("Read the configuration file.");
- cluster.connect();
- System.out.println("Connected to the cluster.");
- IoCTX io = cluster.ioCtxCreate("data");
- String oidone = "hw";
- String contentone = "Hello World!";
- io.write(oidone, contentone);
- String oidtwo = "bm";
- String contenttwo = "Bonjour tout le monde!";
- io.write(oidtwo, contenttwo);
- String[] objects = io.listObjects();
- for (String object: objects)
- System.out.println(object);
- io.remove(oidone);
- io.remove(oidtwo);
- cluster.ioCtxDestroy(io);
- } catch (RadosException e) {
- System.out.println(e.getMessage() + ": " + e.getReturnValue());
- }
- }
- }
PHP Example
- <?php
- $io = rados_ioctx_create($r, "mypool");
- rados_write_full($io, "oidOne", "mycontents");
- rados_remove($io, "oidOne");
- rados_ioctx_destroy($io);
Step 4: Closing Sessions
Once your app finishes with the I/O Context and cluster handle, the app should close the connection and shut down the handle. For asynchronous I/O, the app should also ensure that pending asynchronous operations have completed.
C Example
- rados_ioctx_destroy(io);
- rados_shutdown(cluster);
C++ Example
- io_ctx.close();
- cluster.shutdown();
Python Example
- print("\nClosing the connection.")
- ioctx.close()
- print("Shutting down the handle.")
- cluster.shutdown()
PHP Example
- rados_shutdown($r);
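The shutdown order in Step 4 (I/O context first, then the cluster handle) is easy to get wrong when an operation fails partway through. The sketch below shows one way to guarantee it in Python with nested try/finally blocks. It is an illustration under stated assumptions: `roundtrip` is a hypothetical helper, and the binding is passed in as the `rados_module` parameter so that the function can be exercised without a running cluster. The calls on it (Rados, connect, open_ioctx, write_full, read, close, shutdown) follow the Python binding used in the examples above.

```python
def roundtrip(rados_module, pool, oid, payload, conffile="/etc/ceph/ceph.conf"):
    """Write payload to oid, read it back, and always release the I/O context and handle."""
    cluster = rados_module.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            ioctx.write_full(oid, payload)
            return ioctx.read(oid)
        finally:
            # Close the I/O context first, even if the write or read failed.
            ioctx.close()
    finally:
        # Then shut down the cluster handle.
        cluster.shutdown()
```

With the real binding installed, this could be called as `roundtrip(rados, "data", "hw", b"Hello World!")` after `import rados`.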