Deploy Alluxio on Kubernetes
Alluxio can run on Kubernetes. This guide demonstrates how to deploy Alluxio on Kubernetes using the specifications that ship in the Alluxio GitHub repository.
Basic Tutorial
This tutorial walks through a basic Alluxio setup on Kubernetes.
Prerequisites
- A Kubernetes cluster (version >= 1.8). Alluxio workers will use emptyDir volumes with a restricted size using the sizeLimit parameter. This is an alpha feature in Kubernetes 1.8. Please ensure the feature is enabled.
- An Alluxio Docker image. Refer to this page for instructions to build an image. The image must be available for a pull from all Kubernetes hosts running Alluxio processes. This can be achieved by pushing the image to an accessible Docker registry, or pushing the image individually to all hosts. If using a private Docker registry, refer to the Kubernetes documentation.
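The version requirement above can be checked with a small helper. This is a minimal sketch, assuming a `sort` that supports `-V` (version sort, as in GNU coreutils); the function name `version_at_least` and the sample version string are illustrative and not part of Alluxio or kubectl.

```shell
# Succeeds (exit 0) when version $1 is at least version $2.
# Assumes `sort -V` is available for version-aware ordering.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical server version; substitute the value reported by `kubectl version`.
SERVER_VERSION="1.10.0"
if version_at_least "$SERVER_VERSION" "1.8"; then
  echo "Kubernetes $SERVER_VERSION meets the 1.8 minimum"
else
  echo "Kubernetes $SERVER_VERSION is older than 1.8" >&2
fi
```

A plain string comparison would rank 1.10 below 1.8, which is why the sketch relies on version-aware sorting.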
Clone the Alluxio repo
git clone https://github.com/Alluxio/alluxio.git
cd alluxio/integration/kubernetes
The Kubernetes specifications required to deploy Alluxio can be found under integration/kubernetes.
Enable short-circuit operations
Short-circuit access enables clients to perform read and write operations directly against the worker memory instead of having to go through the worker process. Set up a domain socket on all hosts eligible to run the Alluxio worker process to enable this mode of operation.
From the host machine, create a directory for the shared domain socket.
mkdir /tmp/domain
chmod a+w /tmp/domain
This step can be skipped if short-circuit access is not desired or cannot be set up. To disable this feature, set the property alluxio.user.short.circuit.enabled=false according to the instructions in the configuration section below.
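When the directory setup is scripted across several hosts, an idempotent variant with a quick check can help. The following is a sketch only: the `DOMAIN_DIR` variable is introduced here for illustration and defaults to the /tmp/domain path used in this guide.

```shell
# Directory for the shared domain socket (defaults to the path used in this guide).
DOMAIN_DIR="${DOMAIN_DIR:-/tmp/domain}"

# -p makes the command safe to re-run if the directory already exists.
mkdir -p "$DOMAIN_DIR"
chmod a+w "$DOMAIN_DIR"

# Confirm the directory exists and is writable before deploying workers.
if [ -d "$DOMAIN_DIR" ] && [ -w "$DOMAIN_DIR" ]; then
  echo "domain socket directory ready: $DOMAIN_DIR"
else
  echo "domain socket directory is not usable: $DOMAIN_DIR" >&2
fi
```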
By default, short-circuit operations between the Alluxio client and worker are enabled if the client hostname matches the worker hostname. This may not be true if the client is running as part of a container with virtual networking. In such a scenario, set the following property to use filesystem inspection to enable short-circuit. Short-circuit writes are then enabled if the worker UUID is located on the client filesystem.
alluxio.worker.data.server.domain.socket.as.uuid=true
alluxio.worker.data.server.domain.socket.address=/tmp/domain
Provision a Persistent Volume
Alluxio master can be configured to use a persistent volume for storing the journal. The volume, once claimed, is persisted across restarts of the master process.
Create the persistent volume spec from the template. The access mode ReadWriteMany is used to allow multiple Alluxio master nodes to access the shared volume.
cp alluxio-journal-volume.yaml.template alluxio-journal-volume.yaml
Note: the spec provided uses a hostPath volume for demonstration on a single-node deployment. For a multi-node cluster, you may choose to use NFS, AWSElasticBlockStore, GCEPersistentDisk or other available persistent volume plugins.
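For a multi-node cluster, an NFS-backed volume might look like the sketch below. The file name, NFS server, export path and capacity are all hypothetical placeholders, and the provided template may carry labels or a storage class that the master's claim expects, so mirror those from alluxio-journal-volume.yaml.template rather than relying on this example as written.

```shell
# Write an illustrative NFS-backed PersistentVolume spec.
# Server, path, and capacity are placeholders, not values from the Alluxio repo.
cat > alluxio-journal-volume-nfs.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: alluxio-journal-volume
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfs.example.internal
    path: /exports/alluxio-journal
EOF
```

Unlike hostPath, an NFS export is reachable from every node, so the master can be rescheduled without losing access to the journal.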
Create the persistent volume.
kubectl create -f alluxio-journal-volume.yaml
Configure Alluxio properties
Alluxio containers in Kubernetes use environment variables to set Alluxio properties. Refer to Docker configuration for the corresponding environment variable name for Alluxio properties in conf/alluxio-site.properties.
Define all environment variables in a single file. Copy the properties template at integration/kubernetes/conf, and modify or add any configuration properties as required. Note that when running Alluxio with host networking, the ports assigned to Alluxio services must not be occupied beforehand.
cp conf/alluxio.properties.template conf/alluxio.properties
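When adding properties to the copied file, each Alluxio property needs its environment variable name. The helper below is a sketch that assumes the convention of uppercasing the property name and replacing periods with underscores; the Docker configuration page remains the authority on the exact names, and `prop_to_env` is a name made up for this example.

```shell
# Convert an Alluxio property name to an environment variable name.
# Assumed convention: uppercase letters, periods replaced by underscores.
prop_to_env() {
  printf '%s\n' "$1" | tr 'a-z.' 'A-Z_'
}

# Print an environment-style line for the short-circuit property from this guide.
printf '%s=%s\n' "$(prop_to_env alluxio.user.short.circuit.enabled)" "false"
```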
Create a ConfigMap.
kubectl create configmap alluxio-config --from-env-file=ALLUXIO_CONFIG=conf/alluxio.properties
Deploy
Prepare the Alluxio deployment specs from the templates. Modify any parameters required, such as location of the Docker image, and CPU and memory requirements for pods.
cp alluxio-master.yaml.template alluxio-master.yaml
cp alluxio-worker.yaml.template alluxio-worker.yaml
Once all the prerequisites and configuration have been set up, deploy Alluxio.
kubectl create -f alluxio-master.yaml
kubectl create -f alluxio-worker.yaml
Verify the status of the Alluxio deployment.
kubectl get pods
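For scripted deployments, the pod listing can be checked automatically. This is a rough sketch that assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, RESTARTS, AGE); `all_pods_running` is a hypothetical helper, not a kubectl feature.

```shell
# Reads `kubectl get pods` output on stdin.
# Succeeds only if at least one pod is listed and every STATUS column is "Running".
all_pods_running() {
  awk 'NR > 1 { n++; if ($3 != "Running") bad = 1 }
       END { exit ((bad || n == 0) ? 1 : 0) }'
}

# Usage (requires access to the cluster):
#   kubectl get pods | all_pods_running && echo "all pods running"
```

The check treats an empty listing as a failure, so a run before any pod has been created is not mistaken for success.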
If using persistent volumes for Alluxio master, the status of the volume should change to Bound once the master has claimed it.
kubectl get pv alluxio-journal-volume
Verify
Once ready, access the Alluxio CLI from the master pod and run basic I/O tests.
kubectl exec -ti alluxio-master-0 -- /bin/bash
From the master pod, execute the following:
cd /opt/alluxio
./bin/alluxio runTests
Uninstall
Uninstall Alluxio:
kubectl delete -f alluxio-worker.yaml
kubectl delete -f alluxio-master.yaml
kubectl delete configmaps alluxio-config
Execute the following to remove the persistent volume storing the Alluxio journal. Note: Alluxio metadata will be lost.
kubectl delete -f alluxio-journal-volume.yaml