Aliyun Object Storage Service


This guide describes how to configure Aliyun OSS as Alluxio’s under storage system. Object Storage Service (OSS) is a massive, secure and highly reliable cloud storage service provided by Aliyun.

Prerequisites

To run an Alluxio cluster on a set of machines, you must deploy Alluxio binaries to each of these machines. You can either download the precompiled binaries directly with the correct Hadoop version (recommended), or compile the binaries from Alluxio source code (for advanced users).

In preparation for using OSS with Alluxio, follow the OSS quick start guide to sign up for OSS and create an OSS bucket.

Basic Setup

To configure Alluxio to use OSS as under storage, you will need to modify the configuration file conf/alluxio-site.properties. If the file does not exist, create the configuration file from the template.

  $ cp conf/alluxio-site.properties.template conf/alluxio-site.properties

Edit the conf/alluxio-site.properties file to set the under storage address to the OSS bucket and the OSS directory you want to mount to Alluxio. For example, the under storage address can be oss://alluxio-bucket/ if you want to mount the whole bucket to Alluxio, or oss://alluxio-bucket/alluxio/data if you want to mount only the directory /alluxio/data inside the OSS bucket alluxio-bucket.

  alluxio.master.mount.table.root.ufs=oss://<OSS_BUCKET>/<OSS_DIRECTORY>

Specify the Aliyun credentials for OSS access. In conf/alluxio-site.properties, add:

  fs.oss.accessKeyId=<OSS_ACCESS_KEY_ID>
  fs.oss.accessKeySecret=<OSS_ACCESS_KEY_SECRET>
  fs.oss.endpoint=<OSS_ENDPOINT>

fs.oss.accessKeyId and fs.oss.accessKeySecret are the AccessKey ID and AccessKey secret for OSS, which are created and managed in the Aliyun AccessKey management console.

fs.oss.endpoint is the internet endpoint of this bucket, which can be found in the bucket overview page with values like oss-us-west-1.aliyuncs.com and oss-cn-shanghai.aliyuncs.com. Available endpoints are listed in the OSS Internet Endpoints documentation.
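As an illustration of how the four settings above fit together, the following sketch prints them in alluxio-site.properties syntax. The function name render_oss_properties and the shell variables are hypothetical helpers introduced here, and the bucket, directory and endpoint values are only the example values already used in this guide; the key placeholders are left unfilled on purpose.

```shell
#!/bin/sh
# Example values taken from this guide; replace them with your own.
OSS_BUCKET="alluxio-bucket"
OSS_DIRECTORY="alluxio/data"
OSS_ENDPOINT="oss-cn-shanghai.aliyuncs.com"
# Placeholders: substitute the AccessKey pair from the Aliyun console.
OSS_ACCESS_KEY_ID="<OSS_ACCESS_KEY_ID>"
OSS_ACCESS_KEY_SECRET="<OSS_ACCESS_KEY_SECRET>"

# Hypothetical helper: print the four properties, one per line.
render_oss_properties() {
  printf 'alluxio.master.mount.table.root.ufs=oss://%s/%s\n' "$OSS_BUCKET" "$OSS_DIRECTORY"
  printf 'fs.oss.accessKeyId=%s\n' "$OSS_ACCESS_KEY_ID"
  printf 'fs.oss.accessKeySecret=%s\n' "$OSS_ACCESS_KEY_SECRET"
  printf 'fs.oss.endpoint=%s\n' "$OSS_ENDPOINT"
}

# Preview the result on stdout.
render_oss_properties
```

Once the preview looks right, the output could be appended to the configuration file with render_oss_properties >> conf/alluxio-site.properties. Printing first and appending second is a deliberate choice in this sketch, so that a typo in the bucket or endpoint is visible before the file is modified.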

Example: Running Alluxio Locally with OSS

Start the Alluxio servers:

  $ ./bin/alluxio format
  $ ./bin/alluxio-start.sh local

This will start an Alluxio master and an Alluxio worker. You can see the master UI at http://localhost:19999.

Run a simple example program:

  $ ./bin/alluxio runTests

Visit your OSS directory oss://<OSS_BUCKET>/<OSS_DIRECTORY> to verify that the files and directories created by Alluxio exist. For this test, you should see files with names like <OSS_BUCKET>/<OSS_DIRECTORY>/default_tests_files/BasicFile_CACHE_PROMOTE_MUST_CACHE.

Stop Alluxio by running:

  $ ./bin/alluxio-stop.sh local
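The local run in this section can also be wrapped in a single script. The sketch below is one possible arrangement, not part of Alluxio itself: the run and local_oss_smoke_test functions and the DRY_RUN variable are hypothetical names introduced here. It assumes it is started from the Alluxio installation directory, and it defaults to a dry run that only prints the four commands from this section.

```shell
#!/bin/sh
# Defaults to a dry run that only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Format, start, test, then stop, using the same commands as in this section.
local_oss_smoke_test() {
  run ./bin/alluxio format &&
    run ./bin/alluxio-start.sh local &&
    run ./bin/alluxio runTests
  status=$?
  # Always attempt to stop the servers, then report the status of the earlier steps.
  run ./bin/alluxio-stop.sh local
  return "$status"
}

local_oss_smoke_test
```

Running the script with DRY_RUN=0 executes the commands for real. Note that the format step wipes Alluxio's existing metadata, so a wrapper like this is only appropriate for a throwaway local deployment.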

Advanced Setup

Nested Mount

An OSS location can be mounted at a nested directory in the Alluxio namespace to have unified access to multiple under storage systems. Alluxio’s Mount Command can be used for this purpose. For example, the following command mounts a directory inside an OSS bucket into Alluxio directory /oss:

  $ ./bin/alluxio fs mount --option fs.oss.accessKeyId=<OSS_ACCESS_KEY_ID> \
      --option fs.oss.accessKeySecret=<OSS_ACCESS_KEY_SECRET> \
      --option fs.oss.endpoint=<OSS_ENDPOINT> \
      /oss oss://<OSS_BUCKET>/<OSS_DIRECTORY>/
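To make the effect of a nested mount concrete, the sketch below resolves an Alluxio path under the mount point to the OSS URI that backs it. The function nested_to_oss_uri is a hypothetical helper written for this guide, not an Alluxio command, and the bucket and file names are example values. It only mirrors the path mapping: the part of the path after the mount point is appended to the mounted OSS location.

```shell
#!/bin/sh
# Hypothetical helper: map an Alluxio path under a mount point to its OSS URI.
# Usage: nested_to_oss_uri <mount_point> <oss_uri> <alluxio_path>
nested_to_oss_uri() {
  mount_point="${1%/}"   # drop one trailing slash, if present
  ufs_uri="${2%/}"
  alluxio_path="$3"
  case "$alluxio_path" in
    "$mount_point")
      # The mount point itself maps to the mounted OSS location.
      printf '%s\n' "$ufs_uri" ;;
    "$mount_point"/*)
      # Append the part of the path below the mount point.
      printf '%s/%s\n' "$ufs_uri" "${alluxio_path#"$mount_point"/}" ;;
    *)
      # The path is not under this mount point.
      return 1 ;;
  esac
}

# Example values only: bucket alluxio-bucket, directory alluxio/data, mounted at /oss.
nested_to_oss_uri /oss oss://alluxio-bucket/alluxio/data/ /oss/logs/a.txt
```

On a running cluster, the mounted directory itself can be listed with ./bin/alluxio fs ls /oss to confirm that the objects under the OSS directory are visible through Alluxio.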