Marathon Configuration Reference
Understanding Marathon application definitions
This topic lists all available properties for Marathon application definitions and provides an example JSON application definition file that shows most of them in use.
- Marathon Properties
  - acceptedResourceRoles
  - args
  - backoffFactor
  - backoffSeconds
  - cmd
  - constraints
  - container
  - cpus
  - dependencies
  - disk
  - env
  - executor
  - fetch
  - gpus
  - healthChecks
  - id
  - instances
  - labels
  - maxLaunchDelaySeconds
  - mem
  - networks
  - portDefinitions
  - requirePorts
  - residency
  - resourceLimits
  - taskKillGracePeriodSeconds
  - unreachableStrategy
  - upgradeStrategy
- Example
Marathon Properties
acceptedResourceRoles
An array of resource roles. Marathon considers only resource offers with roles in this list for launching tasks of this app. For more information, see the Mesos documentation.
args
An array of strings that specifies the command to run. The args field may be used in place of cmd even when using the default command executor.
IMPORTANT: You must specify either cmd or args in all app definitions. It is invalid to supply both cmd and args in the same app.
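As a minimal sketch, an app definition that uses args instead of cmd might look like the following. The id, the command, and the resource values are illustrative assumptions, not values taken from this reference.

```json
{
  "id": "/example/args-app",
  "args": ["sleep", "300"],
  "cpus": 0.1,
  "mem": 32
}
```

The equivalent cmd form would be the single string "sleep 300". The definition must not contain both fields.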
backoffFactor
The multiplier applied to the launch delay after each consecutive failure. The default value is 1.15. The delay starts at backoffSeconds and is multiplied by backoffFactor after each consecutive failed launch until it reaches the maxLaunchDelaySeconds value. After it reaches that value, Marathon waits maxLaunchDelaySeconds between launch attempts. For example, if backoffSeconds: 3, backoffFactor: 2, and maxLaunchDelaySeconds: 300, the delays between attempts to launch a failing task are 3, 6, 12, 24, 48, 96, and 192 seconds, and 300 seconds for every attempt after that.
This prevents sandboxes associated with consecutively failing tasks from filling up the hard disk on Mesos agents. This also applies to tasks that are killed due to failing too many health checks.
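The three values used in the example above could appear in an app definition as in this sketch. The id, the deliberately failing cmd, and the resource values are hypothetical placeholders.

```json
{
  "id": "/example/flaky-app",
  "cmd": "exit 1",
  "cpus": 0.1,
  "mem": 32,
  "backoffSeconds": 3,
  "backoffFactor": 2,
  "maxLaunchDelaySeconds": 300
}
```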
backoffSeconds
The amount of time (in seconds) before Marathon retries launching a failed task. The default is 1. The delay starts at backoffSeconds and is multiplied by backoffFactor after each consecutive failed launch until it reaches the maxLaunchDelaySeconds value. After it reaches that value, Marathon waits maxLaunchDelaySeconds between launch attempts. For example, if backoffSeconds: 3, backoffFactor: 2, and maxLaunchDelaySeconds: 300, the delays between attempts to launch a failing task are 3, 6, 12, 24, 48, 96, and 192 seconds, and 300 seconds for every attempt after that.
This prevents sandboxes associated with consecutively failing tasks from filling up the hard disk on Mesos agents. This also applies to tasks that are killed due to failing too many health checks.
cmd
The command that is executed. This value is wrapped by Mesos via /bin/sh -c ${app.cmd}.
IMPORTANT: You must specify either cmd or args in all app definitions. It is invalid to supply both cmd and args in the same app.
constraints
Constraint operators that control where apps can run, allowing you to optimize for either fault tolerance or locality. For more information, see the Constraints documentation.
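For illustration, the following sketch uses the UNIQUE operator on the hostname field so that each instance is placed on a different host. The id, cmd, instance count, and resource values are placeholder assumptions.

```json
{
  "id": "/example/spread-app",
  "cmd": "sleep 300",
  "cpus": 0.1,
  "mem": 32,
  "instances": 3,
  "constraints": [
    ["hostname", "UNIQUE"]
  ]
}
```

Each constraint is itself an array: a field or attribute name, an operator, and, for operators that need one, a value.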
container
The container information.
type The containerizer runtime type, either MESOS or DOCKER. For more information, see Using Containerizers.
portMappings An array of port mappings between host and container. A port mapping is similar to passing -p into the Docker command line to specify a relationship between a port on the host machine and a port inside the container. If unspecified (null) at create time, defaults to { "portMappings": [ { "containerPort": 0, "name": "default" } ], ... }. Specify an empty array ([]) to indicate no ports are used by the app; no default is injected in this case.
A port mapping consists of:
- containerPort The container port (e.g., 8080).
- hostPort The host port (e.g., 0). The default value is 0. In networking mode container, the hostPort is not required, but if left unspecified Marathon will not randomly allocate a port. When using container/bridge mode, an unspecified (null) value for hostPort sets hostPort: 0.
- servicePort The service port (e.g., 9000).
- protocol The transport protocol, either tcp or udp.
Port mappings are used in conjunction with the container and container/bridge networking modes and are ignored when used in conjunction with host networking mode. When used in conjunction with multiple container networks, each mapping entry that specifies a hostPort must also declare a name that identifies the network for which the mapping applies (a single hostPort may be mapped to only one container network, and name defaults to all container networks for a pod or app).
requirePorts does not apply to portMappings.
Future versions of Marathon may fail to validate apps that declare container.portMappings with network modes other than container or container/bridge.
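A sketch of a single port mapping in container/bridge mode follows. The id, image, cmd, and the mapping name http are illustrative assumptions. A hostPort of 0 leaves the choice of host port to Marathon.

```json
{
  "id": "/example/bridge-app",
  "cmd": "python3 -m http.server 8080",
  "cpus": 0.1,
  "mem": 64,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "python:3"
    },
    "portMappings": [
      { "containerPort": 8080, "hostPort": 0, "protocol": "tcp", "name": "http" }
    ]
  },
  "networks": [
    { "mode": "container/bridge" }
  ]
}
```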
docker The Docker container information.
- forcePullImage Whether to pull the image, regardless of whether it is already available on the local system.
- image The path to the Docker image.
- privileged Whether to give extended privileges to this container. For more information, see the Docker run command.
  - "privileged": false Do not give extended privileges. This is the default value.
  - "privileged": true Give extended privileges.
- parameters Command-line options for the docker run command executed by the Mesos containerizer. Parameters passed in this manner are not guaranteed to be supported in the future, as Mesos may not always interact with Docker via the CLI.
- pullConfig A secret whose value is a stringified JSON object in a Secret Store. See Using a Private Docker Registry.
volumes The volumes accessible to the container.
- containerPath The path where your container will read and write data.
- external An external persistent volume. See External Persistent Volumes.
  - name Name that your volume driver uses to look up the external volume.
  - provider The storage provider.
  - options Which Docker volume driver to use for storage. The only Docker volume driver supported by DC/OS is REX-Ray.
  - size The size (in GiB) of the external persistent volume.
- hostPath The host path.
- mode The access mode of the volume, either read-write (RW) or read-only (RO).
- persistent A local persistent volume. See Local Persistent Volumes.
  - size The size (in MiB) of the local persistent volume.
cpus
The number of CPU shares per instance. A decimal fraction or integer.
dependencies
A list of services upon which this application depends. The order to start, stop, and upgrade the application is derived from the dependencies. For example, suppose application /a relies on service /b, which relies on /c. To start all three applications, /c is started first, then /b, and finally /a.
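The chain described above could be declared as in the following sketch of application /a; the definition of /b would list /c in the same way. The cmd and resource values are placeholders.

```json
{
  "id": "/a",
  "cmd": "sleep 300",
  "cpus": 0.1,
  "mem": 32,
  "dependencies": ["/b"]
}
```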
disk
The amount of disk space (in MB) needed for the application, as a decimal fraction or integer.
env
Environment variables.
executor
The executor used to launch the application. The default is //cmd, which takes the cmd value and executes it at the shell level.
fetch
An array of URIs to fetch. For more information, see the Mesos Fetcher documentation.
A URI consists of:
- uri URI to be fetched by Mesos fetcher module.
- executable Set fetched artifact as executable.
- extract Extract fetched artifact if supported by Mesos fetcher module.
- cache Cache fetched artifact if supported by Mesos fetcher module.
gpus
The number of GPU cores needed per instance.
This property is only applicable if you are using DC/OS Universal Container Runtime (UCR) containers. Support for GPU resources is not available for Docker containers.
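Because GPU resources are available only with the Universal Container Runtime, a GPU request is paired with the MESOS container type. The following is a minimal sketch under that assumption; the id, cmd, and resource values are hypothetical.

```json
{
  "id": "/example/gpu-app",
  "cmd": "sleep 300",
  "cpus": 1,
  "mem": 128,
  "gpus": 1,
  "container": {
    "type": "MESOS"
  }
}
```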
healthChecks
An array of checks that are run against an application’s tasks. Marathon health checks perform periodic checks on the containers distributed across a cluster to make sure they’re up and responding. For more information, see the Health Checks documentation.
A health check consists of:
- gracePeriodSeconds Specifies the amount of time (in seconds) to ignore health check failures immediately after a task is started, or until the task becomes healthy for the first time.
- intervalSeconds Specifies the amount of time (in seconds) to wait between health checks.
- maxConsecutiveFailures Specifies the number of consecutive health check failures that can occur before a task is killed.
- path If "protocol": "HTTP", this option specifies the path to the task health status endpoint. For example, "/path/to/health".
- portIndex Specifies the port index in the ports array that is used for health requests. Using a port index allows the app to request arbitrary ports, such as "[0, 0, 0]", and tasks can then be started with port environment variables such as $PORT1.
- protocol Specifies the protocol of the requests: HTTP, HTTPS, TCP, or COMMAND.
- timeoutSeconds Specifies the amount of time (in seconds) before a health check fails, regardless of the response.
id
(Required) Unique identifier for the app consisting of a series of names separated by slashes. Each name must be at least 1 character and may only contain digits (0-9), dashes (-), dots (.), and lowercase letters (a-z). The name may not begin or end with a dash.
The allowable format for each name is represented by the following regular expression:
^(([a-z0-9]|[a-z0-9][a-z0-9\-]*[a-z0-9])\.)*([a-z0-9]|[a-z0-9][a-z0-9\-]*[a-z0-9])$
instances
The number of instances of this application to start. You can change this number as needed to scale the application.
labels
Metadata to expose additional information to other services. For example, you could label apps "environment": "staging" to mark services by their position in the pipeline.
maxLaunchDelaySeconds
The default value of maxLaunchDelaySeconds is 300 starting with DC/OS 1.13.0.
The maximum amount of time (in seconds) to wait, after applying the backoffSeconds and backoffFactor values, before attempting to restart failed tasks. The delay starts at backoffSeconds and is multiplied by backoffFactor after each consecutive failed launch until it reaches the maxLaunchDelaySeconds value. After it reaches that value, Marathon waits maxLaunchDelaySeconds between launch attempts. For example, if backoffSeconds: 3, backoffFactor: 2, and maxLaunchDelaySeconds: 300, the delays between attempts to launch a failing task are 3, 6, 12, 24, 48, 96, and 192 seconds, and 300 seconds for every attempt after that.
This prevents sandboxes associated with consecutively failing tasks from filling up the hard disk on Mesos agents. This also applies to tasks that are killed due to failing too many health checks.
mem
The amount of memory (MB) required per instance.
networks
An array of network definitions. An application can specify more than one network only when using the Universal Container Runtime (MESOS) containerizer runtime. Although Docker supports multiple networks per container, the Docker Engine containerizer runtime does not support multiple networks.
A network definition consists of:
- mode Networking mode. Three modes of networking are supported: host, container, container/bridge. An application cannot mix networking modes: you must specify a single host network, a single container/bridge network, or one or more container networks.
- name Name of the network. Required when mode is container.
- labels See labels.
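As a sketch, an app that joins a single named container network could be defined as follows. The network name my-overlay is a hypothetical placeholder for a network that exists in your cluster, and the id, cmd, and resource values are likewise assumptions.

```json
{
  "id": "/example/overlay-app",
  "cmd": "sleep 300",
  "cpus": 0.1,
  "mem": 32,
  "container": {
    "type": "MESOS"
  },
  "networks": [
    { "mode": "container", "name": "my-overlay" }
  ]
}
```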
portDefinitions
An array of required port resources on the agent host. The number of items in the array determines how many dynamic ports are allocated for every task. For every port definition with port number zero, a globally unique (cluster-wide) service port is assigned and provided as part of the app definition to be used in load balancing definitions. For more information, see the Networking documentation.
A port definition consists of:
- port An integer in the range 0 to 65535.
- name Name of the service hosted on this port. If specified, it must be unique over all port definitions.
- labels Metadata to be interpreted by external applications such as firewalls.
- protocol The transport protocol, either tcp or udp.
Each port value is exposed to the instance via environment variables $PORT0, $PORT1, etc. Ports assigned to running instances are also available via the task resource.
Port definitions are used only with host networking mode. A port definition (specifically its port field) is interpreted through the lens of the requirePorts field. When requirePorts is false (default), a port definition's port is considered the service port and a host port is dynamically chosen by Marathon. When requirePorts is true, a port definition's port is considered both a host port and service port.
The special port value of 0 tells Marathon to select any host port from a Mesos resource offer and any service port from the configured service port range.
You configure port assignment for Docker containers in container.portMappings. If you use the Universal Container Runtime, pass zeros as port values to generate one or more arbitrary free ports for each application instance. For more information, see Containerizers.
requirePorts
Whether the host ports of your tasks are specified manually or assigned automatically.
- "requirePorts": false Ports are automatically assigned.
- "requirePorts": true Manually specify ports in advance. Marathon will only schedule the associated tasks on hosts that have the specified ports available.
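The following sketch combines portDefinitions and requirePorts in host networking mode. Both ports are set to 0 and requirePorts is false, so Marathon chooses the host ports and exposes them to the task as $PORT0 and $PORT1. The id, cmd, port names, and resource values are illustrative assumptions.

```json
{
  "id": "/example/host-ports-app",
  "cmd": "echo listening on $PORT0 and $PORT1 && sleep 300",
  "cpus": 0.1,
  "mem": 32,
  "networks": [
    { "mode": "host" }
  ],
  "portDefinitions": [
    { "port": 0, "protocol": "tcp", "name": "http" },
    { "port": 0, "protocol": "tcp", "name": "admin" }
  ],
  "requirePorts": false
}
```

To pin the task to fixed host ports instead, replace the zeros with the desired port numbers and set requirePorts to true.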
residency
Set up a stateful application. For more information, see local persistent volumes. Deprecated.
taskLostBehavior Indicates whether Marathon will launch the task on another node after receiving a TASK_LOST status update.
- WAIT_FOREVER Do not relaunch the task after receiving a TASK_LOST status update. This setting is required to create a persistent volume. This is the default value.
- RELAUNCH_AFTER_TIMEOUT Relaunch the task after receiving a TASK_LOST status update.
resourceLimits
Specify optional resource limits for a container, allowing the task to consume more CPU and memory resources than requested, if available. Each limit is specified as either a numerical value or the string "unlimited".
taskKillGracePeriodSeconds
The amount of time (in seconds) between the executor sending SIGTERM to a task and then sending SIGKILL.
unreachableStrategy
Define handling for unreachable instances. The value is a string or an object. The only valid string is "disabled", which disables handling for unreachable instances. As an object, it has the fields described below. For example, if inactiveAfterSeconds = 60 and expungeAfterSeconds = 120, a replacement instance is started after an instance has been unreachable for more than 60 seconds, and the instance is expunged after it has been unreachable for more than 120 seconds.
- inactiveAfterSeconds - If an instance is unreachable for longer than inactiveAfterSeconds, it is marked as inactive. This will trigger a new instance launch. Must be less than or equal to expungeAfterSeconds. The default value is 0 seconds.
- expungeAfterSeconds - If an instance is unreachable for longer than expungeAfterSeconds, it will be expunged. That means it will be killed if it ever comes back. Instances are usually marked as unreachable before they are expunged, but they do not have to be. This value is required to be greater than inactiveAfterSeconds unless both are zero. If the instance has any persistent volumes associated with it, then they will be destroyed and associated data will be deleted. The default value is 0 seconds.
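The 60/120 second example above corresponds to the object form shown in this sketch. The id, cmd, and resource values are hypothetical.

```json
{
  "id": "/example/resilient-app",
  "cmd": "sleep 300",
  "cpus": 0.1,
  "mem": 32,
  "unreachableStrategy": {
    "inactiveAfterSeconds": 60,
    "expungeAfterSeconds": 120
  }
}
```

To turn off this handling, the string form "unreachableStrategy": "disabled" would be used in place of the object.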
upgradeStrategy
The strategy that controls when Marathon stops old versions and launches new versions. During an upgrade all instances of an application are replaced by a new version.
- minimumHealthCapacity - The minimum percentage (expressed as a decimal fraction between 0.0 and 1.0) of instances that remain healthy during an upgrade. During an upgrade, Marathon ensures that at least this fraction of instances is healthy. The default is 1.0, which means no old instance can be stopped before another healthy new version is deployed. A value of 0.5 means that during an upgrade half of the old version instances are stopped first to make space for the new version. A value of 0 means take all instances down immediately and replace them with the new application.
- maximumOverCapacity - The maximum percentage (expressed as a decimal fraction between 0.0 and 1.0) of new instances that can be launched at any point during an upgrade. The default value is 1, which means that all old and new instances can exist during the upgrade process. A value of 0.1 means that during the upgrade process 10% more capacity than usual may be used for old and new instances. A value of 0.0 means that even during the upgrade process no more capacity may be used for the new instances than usual. A new instance can be deployed only when an old instance is stopped.
If "minimumHealthCapacity": 1 and "maximumOverCapacity": 0, at least one additional new instance is launched at the beginning of the upgrade process. When it is healthy, one of the old instances is stopped. After it is stopped, another new instance is started, and so on.
A combination of "minimumHealthCapacity": 0.9 and "maximumOverCapacity": 0 results in a rolling update, replacing 10% of the instances at a time and keeping at least 90% of the app online at any point during the upgrade.
A combination of "minimumHealthCapacity": 1 and "maximumOverCapacity": 0.1 results in a rolling update, replacing 10% of the instances at a time and keeping at least 100% of the app online at any point during the upgrade, with 10% of additional capacity.
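As an illustration of the 0.9 / 0 combination, consider this sketch of an app with ten instances, where 10% corresponds to one instance at a time and at least nine instances stay online. The id, cmd, instance count, and resource values are assumptions chosen to make the arithmetic easy to follow.

```json
{
  "id": "/example/rolling-app",
  "cmd": "sleep 300",
  "cpus": 0.1,
  "mem": 32,
  "instances": 10,
  "upgradeStrategy": {
    "minimumHealthCapacity": 0.9,
    "maximumOverCapacity": 0
  }
}
```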
Example
Here is an example JSON application definition that contains most of the fields described above.
{
  "id": "/product/service/my-app",
  "cmd": "env && sleep 300",
  "cpus": 1.5,
  "mem": 256.0,
  "resourceLimits": { "cpus": "unlimited", "mem": 2048.0 },
  "portDefinitions": [
    { "port": 8080, "protocol": "tcp", "name": "http", "labels": { "VIP_0": "10.0.0.1:80" } },
    { "port": 9000, "protocol": "tcp", "name": "admin" }
  ],
  "requirePorts": false,
  "instances": 3,
  "executor": "",
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "group/image",
      "privileged": false,
      "parameters": [
        { "key": "a-docker-option", "value": "xxx" },
        { "key": "b-docker-option", "value": "yyy" }
      ]
    },
    "portMappings": [
      {
        "containerPort": 8080,
        "hostPort": 0,
        "servicePort": 9000,
        "protocol": "tcp"
      },
      {
        "containerPort": 161,
        "hostPort": 0,
        "protocol": "udp"
      }
    ],
    "volumes": [
      {
        "containerPath": "data",
        "hostPath": "mydata",
        "mode": "RO",
        "persistent": {
          "size": 10
        }
      },
      {
        "containerPath": "test-rexray-volume",
        "external": {
          "size": 100,
          "name": "my-test-vol",
          "provider": "dvdi",
          "options": { "dvdi/driver": "rexray" }
        },
        "mode": "RW"
      }
    ]
  },
  "residency": {
    "taskLostBehavior": "WAIT_FOREVER"
  },
  "env": {
    "LD_LIBRARY_PATH": "/usr/local/lib/myLib"
  },
  "constraints": [
    ["attribute", "$OPERATOR", "value"]
  ],
  "acceptedResourceRoles": [
    "role1", "*"
  ],
  "labels": {
    "environment": "staging"
  },
  "fetch": [
    { "uri": "https://raw.github.com/mesosphere/marathon/master/README.md" },
    { "uri": "https://foo.com/archive.zip", "executable": false, "extract": true, "cache": true }
  ],
  "dependencies": ["/product/db/mongo", "/product/db", "../../db"],
  "healthChecks": [
    {
      "protocol": "HTTP",
      "path": "/health",
      "gracePeriodSeconds": 3,
      "intervalSeconds": 10,
      "portIndex": 0,
      "timeoutSeconds": 10,
      "maxConsecutiveFailures": 3
    },
    {
      "protocol": "HTTPS",
      "path": "/machinehealth",
      "gracePeriodSeconds": 3,
      "intervalSeconds": 10,
      "port": 3333,
      "timeoutSeconds": 10,
      "maxConsecutiveFailures": 3
    },
    {
      "protocol": "TCP",
      "gracePeriodSeconds": 3,
      "intervalSeconds": 5,
      "portIndex": 1,
      "timeoutSeconds": 5,
      "maxConsecutiveFailures": 3
    },
    {
      "protocol": "COMMAND",
      "command": { "value": "curl -f -X GET http://$HOST:$PORT0/health" },
      "maxConsecutiveFailures": 3
    }
  ],
  "backoffSeconds": 1,
  "backoffFactor": 1.15,
  "maxLaunchDelaySeconds": 300,
  "taskKillGracePeriodSeconds": 2,
  "upgradeStrategy": {
    "minimumHealthCapacity": 0.5,
    "maximumOverCapacity": 0.2
  },
  "networks": [
    { "mode": "container/bridge" }
  ]
}