Memory tuning guide
In addition to the main memory setup guide, this section explains how to set up the memory of task executors depending on the use case, and which options are important in which case.
Configure memory for standalone deployment
It is recommended to configure the total Flink memory (taskmanager.memory.flink.size) or its components for standalone deployments where you want to declare how much memory is given to Flink itself. Additionally, you can adjust the JVM metaspace if it causes problems.
The total process memory is not relevant here because the JVM overhead is not controlled by Flink or by the deployment environment; only the physical resources of the executing machine matter in this case.
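As a minimal sketch of the standalone case described above, a flink-conf.yaml fragment might look as follows. The sizes are illustrative assumptions, not recommendations, and taskmanager.memory.jvm-metaspace.size only needs to be set if the metaspace actually causes problems.

```yaml
# Standalone deployment: declare how much memory Flink itself may use.
# The value 4096m is an assumed example; size it to the machine.
taskmanager.memory.flink.size: 4096m

# Optional: adjust the JVM metaspace only if it causes problems.
# The value 256m is an assumed example.
taskmanager.memory.jvm-metaspace.size: 256m
```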
Configure memory for containers
It is recommended to configure the total process memory (taskmanager.memory.process.size) for containerized deployments (Kubernetes, Yarn or Mesos). It declares how much memory in total should be assigned to the Flink JVM process and corresponds to the size of the requested container.
Note: If you configure the total Flink memory instead, Flink will implicitly add the JVM memory components to derive the total process memory and request a container of that derived size. See also the detailed Memory Model.
Warning: If Flink or user code allocates unmanaged off-heap (native) memory beyond the container size, the job can fail because the deployment environment can kill the offending containers.
See also the description of the container memory exceeded failure.
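For the containerized case, a minimal sketch of the corresponding flink-conf.yaml entry is shown below. The size is an assumed example and should match the container size you intend to request.

```yaml
# Containerized deployment (Kubernetes, Yarn or Mesos):
# total memory of the Flink JVM process, which corresponds to the
# size of the requested container. The value 4096m is an assumed example.
taskmanager.memory.process.size: 4096m
```

With this option set, the JVM memory components are counted inside the configured size rather than added on top of it.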
Configure memory for state backends
When deploying a Flink streaming application, the type of state backend used will dictate the optimal memory configuration of your cluster.
Heap state backend
When running a stateless job or using a heap state backend (MemoryStateBackend or FsStateBackend), set managed memory to zero. This will ensure that the maximum amount of memory is allocated for user code on the JVM.
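One way to express this, assuming managed memory is sized through taskmanager.memory.managed.size in your Flink version, is sketched below.

```yaml
# Stateless job or heap state backend: no managed memory is needed,
# so leave the memory to the JVM heap for user code.
# Assumes managed memory is configured via an explicit size.
taskmanager.memory.managed.size: 0m
```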
RocksDB state backend
The RocksDBStateBackend uses native memory. By default, RocksDB is set up to limit native memory allocation to the size of the managed memory. Therefore, it is important to reserve enough managed memory for your state use case. If you disable the default RocksDB memory control, task executors can be killed in containerized deployments if RocksDB allocates memory above the limit of the requested container size (the total process memory). See also how to tune RocksDB memory and state.backend.rocksdb.memory.managed.
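A minimal sketch of a RocksDB setup that keeps the default memory control enabled could look like this. The managed memory size is an assumed example; how much to reserve depends on your state size and access pattern.

```yaml
# Use RocksDB as the state backend.
state.backend: rocksdb

# Keep RocksDB's native memory bounded by managed memory.
# This is the default; it is spelled out here only for clarity.
state.backend.rocksdb.memory.managed: true

# Reserve managed memory for RocksDB. The value 2048m is an assumed example.
taskmanager.memory.managed.size: 2048m
```

Setting state.backend.rocksdb.memory.managed to false removes this bound, which is the situation the paragraph above warns about for containerized deployments.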
Configure memory for batch jobs
Flink’s batch operators leverage managed memory to run more efficiently. In doing so, some operations can be performed directly on raw data without having to be deserialized into Java objects. This means that managed memory configurations have practical effects on the performance of your applications. Flink will attempt to allocate and use as much managed memory as configured for batch jobs, but will not go beyond its limits. This prevents OutOfMemoryError failures because Flink knows precisely how much memory it has to leverage. If the managed memory is not sufficient, Flink will gracefully spill to disk.