Usage - Distributed Coach

Coach supports the horizontal scale-out of rollout workers in distributed mode. For more information on the design and implementation of distributed Coach, see Distributed Coach - Horizontal Scale-Out. The rest of this section describes how to get started with distributed Coach.

Interfaces and Implementations

Coach uses three interfaces to orchestrate, schedule and manage the resources of the workers it spawns in distributed mode. These interfaces are the orchestrator, the memory backend and the data store. Refer to Distributed Coach - Horizontal Scale-Out for more information. This section uses the following implementation of each interface: Kubernetes as the orchestrator, Redis Pub/Sub as the memory backend and S3 as the data store.

Prerequisites

Clone the Repository

  $ git clone git@github.com:NervanaSystems/coach.git
  $ cd coach

Build Container Image and Push

Create a directory named docker.

  $ mkdir docker

Create docker files in the docker directory.

A sample base docker file (Dockerfile.base) would look like this:

  FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04

  ################################
  # Install apt-get Requirements #
  ################################

  # General
  RUN apt-get update && \
      apt-get install -y python3-pip cmake zlib1g-dev python3-tk python-opencv \
      # Boost libraries
      libboost-all-dev \
      # Scipy requirements
      libblas-dev liblapack-dev libatlas-base-dev gfortran \
      # Pygame requirements
      libsdl-dev libsdl-image1.2-dev libsdl-mixer1.2-dev libsdl-ttf2.0-dev \
      libsmpeg-dev libportmidi-dev libavformat-dev libswscale-dev \
      # Dashboard
      dpkg-dev build-essential python3.5-dev libjpeg-dev libtiff-dev libsdl1.2-dev libnotify-dev \
      freeglut3 freeglut3-dev libsm-dev libgtk2.0-dev libgtk-3-dev libwebkitgtk-dev libgtk-3-dev \
      libwebkitgtk-3.0-dev libgstreamer-plugins-base1.0-dev \
      # Gym
      libav-tools libsdl2-dev swig cmake \
      # Mujoco_py
      curl libgl1-mesa-dev libgl1-mesa-glx libglew-dev libosmesa6-dev software-properties-common \
      # ViZDoom
      build-essential zlib1g-dev libsdl2-dev libjpeg-dev \
      nasm tar libbz2-dev libgtk2.0-dev cmake git libfluidsynth-dev libgme-dev \
      libopenal-dev timidity libwildmidi-dev unzip wget && \
      apt-get clean autoclean && \
      apt-get autoremove -y

  ############################
  # Install Pip Requirements #
  ############################
  RUN pip3 install --upgrade pip
  RUN pip3 install setuptools==39.1.0 && pip3 install pytest && pip3 install pytest-xdist

  RUN curl -o /usr/local/bin/patchelf https://s3-us-west-2.amazonaws.com/openai-sci-artifacts/manual-builds/patchelf_0.9_amd64.elf \
      && chmod +x /usr/local/bin/patchelf

A sample docker file for the gym environment would look like this:

  FROM coach-base:master as builder

  # prep gym and any of its related requirements.
  RUN pip3 install gym[atari,box2d,classic_control]==0.10.5

  # add coach source starting with files that could trigger
  # re-build if dependencies change.
  RUN mkdir /root/src
  COPY setup.py /root/src/.
  COPY requirements.txt /root/src/.
  RUN pip3 install -r /root/src/requirements.txt

  FROM coach-base:master
  WORKDIR /root/src
  COPY --from=builder /root/.cache /root/.cache
  COPY setup.py /root/src/.
  COPY requirements.txt /root/src/.
  COPY README.md /root/src/.
  RUN pip3 install gym[atari,box2d,classic_control]==0.10.5 && pip3 install -e .[all] && rm -rf /root/.cache
  COPY . /root/src

A sample docker file for the Mujoco environment would look like this:

  FROM coach-base:master as builder

  # prep mujoco and any of its related requirements.
  # Mujoco
  RUN mkdir -p ~/.mujoco \
      && wget https://www.roboti.us/download/mjpro150_linux.zip -O mujoco.zip \
      && unzip -n mujoco.zip -d ~/.mujoco \
      && rm mujoco.zip
  ARG MUJOCO_KEY
  ENV MUJOCO_KEY=$MUJOCO_KEY
  ENV LD_LIBRARY_PATH /root/.mujoco/mjpro150/bin:$LD_LIBRARY_PATH
  RUN echo $MUJOCO_KEY | base64 --decode > /root/.mujoco/mjkey.txt
  RUN pip3 install mujoco_py

  # add coach source starting with files that could trigger
  # re-build if dependencies change.
  RUN mkdir /root/src
  COPY setup.py /root/src/.
  COPY requirements.txt /root/src/.
  RUN pip3 install -r /root/src/requirements.txt

  FROM coach-base:master
  WORKDIR /root/src
  COPY --from=builder /root/.mujoco /root/.mujoco
  ENV LD_LIBRARY_PATH /root/.mujoco/mjpro150/bin:$LD_LIBRARY_PATH
  COPY --from=builder /root/.cache /root/.cache
  COPY setup.py /root/src/.
  COPY requirements.txt /root/src/.
  COPY README.md /root/src/.
  RUN pip3 install mujoco_py && pip3 install -e .[all] && rm -rf /root/.cache
  COPY . /root/src

A sample docker file for the ViZDoom environment would look like this:

  FROM coach-base:master as builder

  # prep vizdoom and any of its related requirements.
  RUN pip3 install vizdoom

  # add coach source starting with files that could trigger
  # re-build if dependencies change.
  RUN mkdir /root/src
  COPY setup.py /root/src/.
  COPY requirements.txt /root/src/.
  RUN pip3 install -r /root/src/requirements.txt

  FROM coach-base:master
  WORKDIR /root/src
  COPY --from=builder /root/.cache /root/.cache
  COPY setup.py /root/src/.
  COPY requirements.txt /root/src/.
  COPY README.md /root/src/.
  RUN pip3 install vizdoom && pip3 install -e .[all] && rm -rf /root/.cache
  COPY . /root/src

Build the base container. Make sure you are in the Coach root directory before building.

  $ docker build -t coach-base:master -f docker/Dockerfile.base .

If you would like to use the Mujoco environment, save your key as an environment variable. Replace <mujoco_key> with the contents of your mujoco key.

  $ export MUJOCO_KEY=<mujoco_key>
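
The sample Mujoco docker file above writes the key with echo $MUJOCO_KEY | base64 --decode, which suggests that the exported value should be the base64-encoded contents of the key file rather than the raw text. The following is a minimal Python sketch of that encoding. The function names are hypothetical and are not part of Coach.

```python
import base64
from pathlib import Path


def encode_key(raw: bytes) -> str:
    """Base64-encode raw key bytes into a single-line ASCII string."""
    return base64.b64encode(raw).decode("ascii")


def decode_key(encoded: str) -> bytes:
    """Reverse of encode_key; mirrors `base64 --decode` in the docker file."""
    return base64.b64decode(encoded)


def encode_key_file(path: str) -> str:
    """Read a key file and return a value suitable for MUJOCO_KEY.

    The path is supplied by the caller; no default location is assumed.
    """
    return encode_key(Path(path).read_bytes())
```

For example, printing encode_key_file("mjkey.txt") (an assumed file name) would give a single-line string that can be exported as MUJOCO_KEY and survives being passed through --build-arg.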

Build the container for your environment. Replace <env> with your choice of environment; the choices are gym, mujoco and doom. Replace <user-name>, <image-name> and <tag> with appropriate values.

  $ docker build --build-arg MUJOCO_KEY=${MUJOCO_KEY} -t <user-name>/<image-name>:<tag> -f docker/Dockerfile.<env> .

Push the container to a registry of your choice. Replace <user-name>, <image-name> and <tag> with appropriate values.

  $ docker push <user-name>/<image-name>:<tag>

Create a Config File

Create a config file with the following contents. Replace <user-name>, <image-name>, <tag>, <bucket-name> and <path-to-aws-credentials> with appropriate values.

  [coach]
  image = <user-name>/<image-name>:<tag>
  memory_backend = redispubsub
  data_store = s3
  s3_end_point = s3.amazonaws.com
  s3_bucket_name = <bucket-name>
  s3_creds_file = <path-to-aws-credentials>
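
Before launching, it can help to confirm that the config file parses and that no placeholder was left unreplaced. The following is a minimal sketch that uses Python's standard configparser module. Treating every key in the sample above as required is an assumption made for illustration, as is the function name; Coach itself may accept other keys or backends.

```python
import configparser

# Keys shown in the sample config above. Treating all of them as required
# is an assumption made for this sketch.
EXPECTED_KEYS = (
    "image", "memory_backend", "data_store",
    "s3_end_point", "s3_bucket_name", "s3_creds_file",
)


def check_coach_config(text):
    """Parse config text and return the [coach] section as a dict.

    Raises ValueError if the section or a key is missing, or if a value
    still contains an unreplaced <placeholder>.
    """
    # Disable interpolation so '%' characters in values are kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("coach"):
        raise ValueError("missing [coach] section")
    section = dict(parser["coach"])
    missing = [key for key in EXPECTED_KEYS if not section.get(key)]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    unfilled = [k for k, v in section.items() if "<" in v and ">" in v]
    if unfilled:
        raise ValueError("unreplaced placeholders in: " + ", ".join(unfilled))
    return section
```

A caller would read the file contents and pass them in, for example check_coach_config(open(path).read()), and fix the reported keys before running Coach.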

Run Distributed Coach

The following command runs distributed Coach with the CartPole_ClippedPPO preset, Redis Pub/Sub as the memory backend and S3 as the data store, on Kubernetes with three rollout workers.

  $ python3 rl_coach/coach.py -p CartPole_ClippedPPO \
      -dc \
      -e <experiment-name> \
      -n 3 \
      -dcp <path-to-config-file>
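
When the launch is scripted, the same command can be assembled as an argument list. The following sketch only rearranges the flags shown above (-p, -dc, -e, -n and -dcp). The helper name and its parameters are hypothetical, and the check on the worker count is an assumption of this sketch rather than documented Coach behaviour.

```python
def coach_command(preset, experiment, num_workers, config_path):
    """Return the distributed Coach command as a list of arguments."""
    if num_workers < 1:
        # Assumption: at least one rollout worker is needed.
        raise ValueError("num_workers must be at least 1")
    return [
        "python3", "rl_coach/coach.py",
        "-p", preset,            # preset to run
        "-dc",                   # run in distributed mode
        "-e", experiment,        # experiment name
        "-n", str(num_workers),  # number of rollout workers
        "-dcp", config_path,     # path to the config file
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Coach root directory, which avoids shell quoting issues with experiment names or paths.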