HttpFS Gateway

Ozone HttpFS can be used to integrate Ozone with other tools via a REST API.

Introduction

Ozone HttpFS is forked from the HDFS HttpFS endpoint implementation (HDDS-5448). Ozone HttpFS is intended to be added optionally as a role in an Ozone cluster, similar to S3 Gateway.

HttpFS is a service that provides a REST HTTP gateway supporting File System operations (read and write). It is interoperable with the webhdfs REST HTTP API.

HttpFS can be used to access data on an Ozone cluster behind a firewall. In that setup, the HttpFS service acts as a gateway and is the only system allowed to cross the firewall into the cluster.

HttpFS can be used to access data in Ozone using HTTP utilities (such as curl and wget) and HTTP libraries from languages other than Java (such as Perl).
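As a rough illustration of access from a language other than Java, the sketch below uses only the Python standard library to build and send an OPEN request. The host name, port 14000, path, and user are taken from the curl examples later on this page. The helper names `open_url` and `read_key` are hypothetical and not part of any Ozone client.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def open_url(host, path, user, port=14000):
    """Build the WebHDFS-style OPEN URL for a key path such as /user/foo/README.txt."""
    # user.name is the pseudo-authentication parameter used in this page's curl examples.
    query = urlencode({"op": "OPEN", "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def read_key(host, path, user):
    """Fetch the key content as bytes. Needs a reachable HttpFS gateway."""
    with urlopen(open_url(host, path, user)) as response:
        return response.read()


if __name__ == "__main__":
    # Only prints the URL; call read_key(...) against a running cluster to fetch data.
    print(open_url("httpfs-host", "/user/foo/README.txt", "foo"))
```

Keeping URL construction separate from the network call makes it possible to check the request shape without a running cluster.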

The webhdfs client FileSystem implementation can be used to access HttpFS using the Ozone filesystem command line tool (ozone fs) as well as from Java applications using the Hadoop FileSystem Java API.

HttpFS has built-in security that supports Hadoop pseudo authentication, Kerberos SPNEGO, and other pluggable authentication mechanisms. It also supports Hadoop proxy users.
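The sketch below shows how the query string might look for pseudo authentication and for a proxy-user request. `user.name` is the parameter used in this page's curl examples. `doas` is the proxy-user parameter of the WebHDFS REST API, and whether a given Ozone HttpFS deployment accepts it depends on its proxy-user configuration, so treat that part as an assumption. Kerberos SPNEGO needs a negotiate-capable HTTP client and is not shown. The function name `auth_query` is hypothetical.

```python
from urllib.parse import urlencode


def auth_query(op, user, doas=None):
    """Build a query string for pseudo authentication, optionally impersonating `doas`."""
    params = {"op": op, "user.name": user}
    if doas is not None:
        # Assumption: proxy-user requests use the WebHDFS `doas` parameter.
        params["doas"] = doas
    return urlencode(params)


if __name__ == "__main__":
    print(auth_query("LISTSTATUS", "hdfs"))
    print(auth_query("LISTSTATUS", "hdfs", doas="alice"))
```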

Getting started

The HttpFS service is a Jetty-based web application that uses the Hadoop FileSystem API to talk to the cluster. It is a separate service that provides access to Ozone via a REST API, and it should be started in addition to the regular Ozone components.

To try it out, you can start a Docker Compose dev cluster that has an HttpFS gateway.

Extract the release tarball, go to the compose/ozone directory and start the cluster:

  1. docker-compose up -d --scale datanode=3

You should now find the HttpFS gateway in Docker under the name ozone_httpfs. HttpFS web-service API calls are HTTP REST calls, each of which maps to an Ozone file system operation, so they can be issued with the curl Unix command.

For example, in the Docker cluster you can run commands such as:

  • curl -i -X PUT "http://httpfs:14000/webhdfs/v1/vol1?op=MKDIRS&user.name=hdfs" creates a volume called vol1.

  • curl "http://httpfs-host:14000/webhdfs/v1/user/foo/README.txt?op=OPEN&user.name=foo" returns the content of the key /user/foo/README.txt.

Supported operations

The following tables list the WebHDFS REST API operations and their support status in Ozone.

File and Directory Operations

Operation                           Support
----------------------------------  ---------------------------
Create and Write to a File          supported
Append to a File                    not implemented in Ozone
Concat File(s)                      not implemented in Ozone
Open and Read a File                supported
Make a Directory                    supported
Create a Symbolic Link              not implemented in Ozone
Rename a File/Directory             supported (with limitations)
Delete a File/Directory             supported
Truncate a File                     not implemented in Ozone
Status of a File/Directory          supported
List a Directory                    supported
List a File                         supported
Iteratively List a Directory        supported
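A client can encode the table above as a lookup so that unsupported calls fail before any request is sent. The following is a minimal sketch. The operation names and HTTP methods are those of the WebHDFS REST API, the mapping to the rows above is an assumption made for illustration, and `http_method_for` is a hypothetical helper.

```python
# WebHDFS operation names and HTTP methods for the rows marked "supported".
SUPPORTED_OPS = {
    "CREATE": "PUT",            # Create and Write to a File
    "OPEN": "GET",              # Open and Read a File
    "MKDIRS": "PUT",            # Make a Directory
    "RENAME": "PUT",            # Rename a File/Directory (with limitations)
    "DELETE": "DELETE",         # Delete a File/Directory
    "GETFILESTATUS": "GET",     # Status of a File/Directory
    "LISTSTATUS": "GET",        # List a Directory / List a File
    "LISTSTATUS_BATCH": "GET",  # Iteratively List a Directory
}

# Rows marked "not implemented in Ozone".
UNSUPPORTED_OPS = {"APPEND", "CONCAT", "CREATESYMLINK", "TRUNCATE"}


def http_method_for(op):
    """Return the HTTP method for a supported operation, or raise for the rest."""
    name = op.upper()
    if name in UNSUPPORTED_OPS:
        raise NotImplementedError(f"{name} is not implemented in Ozone")
    if name not in SUPPORTED_OPS:
        raise ValueError(f"unknown operation: {op}")
    return SUPPORTED_OPS[name]
```

For example, `http_method_for("mkdirs")` yields `"PUT"`, matching the `-X PUT` in the MKDIRS curl command, while `http_method_for("APPEND")` raises `NotImplementedError`.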

Other File System Operations

Operation                           Support
----------------------------------  ---------------------------------------
Get Content Summary of a Directory  supported
Get Quota Usage of a Directory      supported
Set Quota                           not implemented in Ozone FileSystem API
Set Quota By Storage Type           not implemented in Ozone
Get File Checksum                   unsupported (to be fixed)
Get Home Directory                  unsupported (to be fixed)
Get Trash Root                      unsupported
Set Permission                      not implemented in Ozone FileSystem API
Set Owner                           not implemented in Ozone FileSystem API
Set Replication Factor              not implemented in Ozone FileSystem API
Set Access or Modification Time     not implemented in Ozone FileSystem API
Modify ACL Entries                  not implemented in Ozone FileSystem API
Remove ACL Entries                  not implemented in Ozone FileSystem API
Remove Default ACL                  not implemented in Ozone FileSystem API
Remove ACL                          not implemented in Ozone FileSystem API
Set ACL                             not implemented in Ozone FileSystem API
Get ACL Status                      not implemented in Ozone FileSystem API
Check Access                        not implemented in Ozone FileSystem API
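As a sketch of consuming one of the supported calls, the snippet below parses a content summary response. The sample body follows the `ContentSummary` JSON schema of the WebHDFS REST API with made-up values. The fields returned by an actual Ozone HttpFS gateway may differ, so this is an assumption. The function name `summarize` is hypothetical.

```python
import json

# Illustrative response body in the WebHDFS ContentSummary shape (values are made up).
SAMPLE_RESPONSE = """
{"ContentSummary": {"directoryCount": 2, "fileCount": 1, "length": 24930,
                    "quota": -1, "spaceConsumed": 24930, "spaceQuota": -1}}
"""


def summarize(body):
    """Extract directory count, file count, and total length from a response body."""
    summary = json.loads(body)["ContentSummary"]
    return {
        "directories": summary["directoryCount"],
        "files": summary["fileCount"],
        "bytes": summary["length"],
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_RESPONSE))
```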

Hadoop user and developer documentation about HttpFS