Under File Storage Extension API
This page is intended for developers of under storage extensions. Please look at managing extensions for a guide to using existing extensions.
Introduction
Under storage extensions provide a framework that enables additional storage systems to work with Alluxio and make it convenient to develop modules not already supported by Alluxio. Extensions are built as JARs and placed in a designated extensions location, where core Alluxio picks them up. This page describes the mechanics of how extensions in Alluxio work, and provides detailed instructions for developing an under storage extension.
If the modules included in core Alluxio do not use the interface supported by your desired storage system, you may choose to implement an under storage extension.
Implementing an Under Storage Extension
Building a new under storage connector involves:
- Implementing the required under storage interface
- Declaring the service implementation
- Bundling up the implementation and transitive dependencies in an uber JAR
A reference implementation can be found in the alluxio-extensions repository. In the rest of this section, we describe the steps involved in writing a new under storage extension. The sample project, called DummyUnderFileSystem, uses Maven as the build and dependency management tool, and forwards all operations to a local filesystem.
Implement the Under Storage Interface
The HDFS Submodule and S3 Submodule are good examples of how to enable a storage system to serve as Alluxio’s underlying storage.
Step 1: Implement the interface UnderFileSystem
The UnderFileSystem interface is defined in the module org.alluxio:alluxio-core-common. Choose to extend either ConsistentUnderFileSystem or ObjectUnderFileSystem to implement the UnderFileSystem interface.
- ConsistentUnderFileSystem: used for storage like HDFS which is not eventually consistent.
- ObjectUnderFileSystem: suitable for connecting to object storage; it abstracts away mapping file system operations to an object store.
public class DummyUnderFileSystem extends ConsistentUnderFileSystem {
  // Implement filesystem operations
  ...
}
or,
public class DummyUnderFileSystem extends ObjectUnderFileSystem {
  // Implement object store operations
  ...
}
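To illustrate what mapping file system operations to an object store can involve, here is a minimal sketch in plain Java. It is not Alluxio's implementation and does not use Alluxio classes: it assumes an object store that holds a flat set of keys, and emulates a directory listing by filtering on a key prefix and cutting each remaining key at the next "/". The names FlatKeyListing and listDirectory are hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Alluxio: emulates a directory listing
// on top of a flat object-store key namespace.
class FlatKeyListing {
  // Returns the immediate children of dir; sub-directories end with "/".
  static Set<String> listDirectory(List<String> keys, String dir) {
    // Normalize so "data" and "data/" name the same directory; "" is the root.
    String prefix = dir.isEmpty() || dir.endsWith("/") ? dir : dir + "/";
    Set<String> children = new TreeSet<>();
    for (String key : keys) {
      if (!key.startsWith(prefix) || key.length() == prefix.length()) {
        continue; // outside the directory, or the directory marker itself
      }
      String rest = key.substring(prefix.length());
      int slash = rest.indexOf('/');
      // Cut at the first delimiter so nested keys collapse into one child.
      children.add(slash < 0 ? rest : rest.substring(0, slash + 1));
    }
    return children;
  }

  public static void main(String[] args) {
    List<String> keys =
        List.of("data/a.txt", "data/logs/1.log", "data/logs/2.log", "other/b.txt");
    System.out.println(listDirectory(keys, "data")); // prints [a.txt, logs/]
  }
}
```

A base class for object stores can take care of this kind of translation once, so that an extension only has to supply the store-specific calls.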
Step 2: Implement the interface UnderFileSystemFactory
The under storage factory defines which paths the UnderFileSystem implementation supports and how to create the UnderFileSystem implementation.
public class DummyUnderFileSystemFactory implements UnderFileSystemFactory {
  ...
  @Override
  public UnderFileSystem create(String path, UnderFileSystemConfiguration conf) {
    // Create the under storage instance
  }

  @Override
  public boolean supportsPath(String path) {
    // Choose which schemes to support, e.g., dummy://
  }
}
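The scheme check in supportsPath can be as simple as a prefix test. The following self-contained sketch does not depend on Alluxio classes; the class name DummySchemeMatcher is hypothetical and the dummy:// scheme is taken from the comment above. A real factory would presumably place equivalent logic inside its supportsPath override.

```java
// Hypothetical helper showing the kind of check supportsPath might perform.
class DummySchemeMatcher {
  // Scheme handled by this illustrative extension.
  static final String SCHEME = "dummy://";

  static boolean supportsPath(String path) {
    // Treat null as unsupported instead of throwing.
    return path != null && path.startsWith(SCHEME);
  }

  public static void main(String[] args) {
    System.out.println(supportsPath("dummy://bucket/dir"));  // prints true
    System.out.println(supportsPath("hdfs://namenode/dir")); // prints false
  }
}
```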
Step 3: Define any properties required to configure the UnderFileSystem
public class DummyUnderFileSystemPropertyKey {
  public static final PropertyKey DUMMY_UFS_PROPERTY =
      new PropertyKey.Builder(Name.DUMMY_UFS_PROPERTY)
          .setDescription("...")
          .setDefaultValue("...")
          .build();

  public static final class Name {
    public static final String DUMMY_UFS_PROPERTY = "fs.dummy.property";
  }
}
Declare the Service
Create a file at src/main/resources/META-INF/services/alluxio.underfs.UnderFileSystemFactory advertising the implemented UnderFileSystemFactory to the ServiceLoader.
alluxio.underfs.dummy.DummyUnderFileSystemFactory
Build
Include all transitive dependencies of the extension project in the built JAR using either maven-shade-plugin or maven-assembly-plugin.
In addition, to avoid collisions, specify the scope of the alluxio-core-common dependency as provided. The Maven definition would look like:
<dependencies>
<!-- Core Alluxio dependencies -->
<dependency>
<groupId>org.alluxio</groupId>
<artifactId>alluxio-core-common</artifactId>
<scope>provided</scope>
</dependency>
...
</dependencies>
Build the JAR:
$ mvn package
Install to Alluxio
Install the JAR to Alluxio:
$ ./bin/alluxio extensions install <path>/<to>/<project>/target/alluxio-underfs-<ufsName>-<version>.jar
Test the Under Storage Extension
To ensure the new under storage module fulfills the minimum requirements to work with Alluxio, one can run contract tests to test different workflows with various combinations of operations against the under storage.
$ ./bin/alluxio runUfsTests --path <scheme>://<path>/ -D<key>=<value>
In addition, one can also mount the under storage and run other kinds of tests on it. Please refer to managing extensions.
How it Works
Service Discovery
Extension JARs are loaded dynamically at runtime by Alluxio servers, which enables Alluxio to talk to new under storage systems without requiring a restart. Alluxio servers use Java ServiceLoader to discover implementations of the under storage API. Providers include implementations of the alluxio.underfs.UnderFileSystemFactory interface. The implementation is advertised by including a text file in META-INF/services with a single line pointing to the class implementing that interface.
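The discovery mechanism can be tried out with the JDK alone. The sketch below is not Alluxio code: StorageFactory and DummyFactory are hypothetical stand-ins for UnderFileSystemFactory and an extension's implementation. It writes a provider-configuration file into a temporary META-INF/services directory, exposes that directory through a URLClassLoader, and lets java.util.ServiceLoader instantiate the advertised class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Self-contained ServiceLoader demonstration; all type names are hypothetical.
class ServiceDiscoverySketch {
  // Stand-in for a factory interface such as UnderFileSystemFactory.
  public interface StorageFactory {
    boolean supportsPath(String path);
  }

  // Stand-in for an extension's factory; ServiceLoader needs a public no-arg constructor.
  public static class DummyFactory implements StorageFactory {
    public DummyFactory() {}

    @Override
    public boolean supportsPath(String path) {
      return path != null && path.startsWith("dummy://");
    }
  }

  // Writes the provider-configuration file, then asks ServiceLoader for providers.
  static List<StorageFactory> discover() {
    try {
      Path root = Files.createTempDirectory("ext");
      Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
      // File name: binary name of the interface. Content: one implementation class per line.
      Files.write(services.resolve(StorageFactory.class.getName()),
          List.of(DummyFactory.class.getName()));
      try (URLClassLoader loader = new URLClassLoader(
          new URL[] {root.toUri().toURL()}, ServiceDiscoverySketch.class.getClassLoader())) {
        List<StorageFactory> found = new ArrayList<>();
        // Iteration is lazy, so collect the providers while the loader is still open.
        for (StorageFactory factory : ServiceLoader.load(StorageFactory.class, loader)) {
          found.add(factory);
        }
        return found;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    for (StorageFactory factory : discover()) {
      System.out.println(factory.getClass().getName()
          + " supports dummy://a: " + factory.supportsPath("dummy://a"));
    }
  }
}
```

In an actual extension the configuration file is packaged inside the JAR rather than written at runtime; the sketch generates it only to stay self-contained.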
Dependency Management
Implementors are required to include transitive dependencies in their extension JARs. Alluxio performs isolated classloading for each extension JAR to avoid dependency conflicts between Alluxio servers and extensions.
Contributing your Under Storage extension to Alluxio
Congratulations! You have developed a new under storage extension to Alluxio. Let the community know by submitting a pull request to the Alluxio repository to edit the list of extensions section on the documentation page.