State Processor API

Apache Flink’s State Processor API provides powerful functionality for reading, writing, and modifying savepoints and checkpoints using Flink’s batch DataSet API. Due to the interoperability of the DataSet and Table APIs, you can even use relational Table API or SQL queries to analyze and process state data.

For example, you can take a savepoint of a running stream processing application and analyze it with a DataSet batch program to verify that the application behaves correctly. Or you can read a batch of data from any store, preprocess it, and write the result to a savepoint that you use to bootstrap the state of a streaming application. It is also possible to fix inconsistent state entries. Finally, the State Processor API opens up many ways to evolve a stateful application that was previously blocked by parameter and design choices that could not be changed without losing all the state of the application after it was started. For example, you can now arbitrarily modify the data types of states, adjust the maximum parallelism of operators, split or merge operator state, re-assign operator UIDs, and so on.

To get started with the State Processor API, include the following library in your application.

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-state-processor-api_2.11</artifactId>
        <version>1.13.0</version>
    </dependency>


Mapping Application State to DataSets

The State Processor API maps the state of a streaming application to one or more data sets that can be processed separately. In order to be able to use the API, you need to understand how this mapping works.

But let us first have a look at what a stateful Flink job looks like. A Flink job is composed of operators; typically one or more source operators, a few operators for the actual processing, and one or more sink operators. Each operator runs in parallel in one or more tasks and can work with different types of state. An operator can have zero, one, or more “operator states” which are organized as lists that are scoped to the operator’s tasks. If the operator is applied on a keyed stream, it can also have zero, one, or more “keyed states” which are scoped to a key that is extracted from each processed record. You can think of keyed state as a distributed key-value map.

The following figure shows the application “MyApp” which consists of three operators called “Src”, “Proc”, and “Snk”. Src has one operator state (os1), Proc has one operator state (os2) and two keyed states (ks1, ks2) and Snk is stateless.

[Figure: Application: MyApp]

A savepoint or checkpoint of MyApp consists of the data of all states, organized in a way that the states of each task can be restored. When processing the data of a savepoint (or checkpoint) with a batch job, we need a mental model that maps the data of the individual tasks’ states into data sets or tables. In fact, we can think of a savepoint as a database. Every operator (identified by its UID) represents a namespace. Each operator state of an operator is mapped to a dedicated table in the namespace with a single column that holds the state’s data of all tasks. All keyed states of an operator are mapped to a single table consisting of a column for the key, and one column for each keyed state. The following figure shows how a savepoint of MyApp is mapped to a database.

[Figure: Database: MyApp]

The figure shows how the values of Src’s operator state are mapped to a table with one column and five rows, one row for each of the list entries across all parallel tasks of Src. Operator state os2 of the operator “Proc” is similarly mapped to an individual table. The keyed states ks1 and ks2 are combined to a single table with three columns, one for the key, one for ks1 and one for ks2. The keyed table holds one row for each distinct key of both keyed states. Since the operator “Snk” does not have any state, its namespace is empty.
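As a rough illustration of this mapping, the following plain-Java sketch models how per-task operator state and several keyed states could be laid out as tables. It does not use Flink or the State Processor API: the class SavepointTableModel, its method names, and the sample values are hypothetical, and showing a key that is missing from one keyed state as null is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Conceptual model only: no Flink classes are involved, and all names are hypothetical.
final class SavepointTableModel {

    /** Flattens the per-task list entries of one operator state into a single-column table. */
    static <T> List<T> operatorStateTable(List<List<T>> perTaskEntries) {
        List<T> rows = new ArrayList<>();
        for (List<T> taskEntries : perTaskEntries) {
            rows.addAll(taskEntries);
        }
        return rows;
    }

    /**
     * Combines several keyed states into one table: one row per distinct key and
     * one column per state. A key that is absent from a state gets null in that
     * column (an assumption of this sketch).
     */
    static <K extends Comparable<K>> Map<K, Map<String, Object>> keyedStateTable(
            Map<String, Map<K, ?>> keyedStates) {
        // Collect the distinct keys across all keyed states.
        TreeSet<K> keys = new TreeSet<>();
        for (Map<K, ?> state : keyedStates.values()) {
            keys.addAll(state.keySet());
        }
        // Build one row per key, with one column per state name.
        Map<K, Map<String, Object>> table = new LinkedHashMap<>();
        for (K key : keys) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (Map.Entry<String, Map<K, ?>> state : keyedStates.entrySet()) {
                row.put(state.getKey(), state.getValue().get(key));
            }
            table.put(key, row);
        }
        return table;
    }

    public static void main(String[] args) {
        // Operator state os1 of Src: three parallel tasks holding five list entries in total.
        List<List<Integer>> os1PerTask = List.of(List.of(1, 2), List.of(3), List.of(4, 5));
        System.out.println(operatorStateTable(os1PerTask));

        // Keyed states ks1 and ks2 of Proc, combined into one table keyed by the record key.
        Map<String, Map<Integer, ?>> procKeyedStates = new LinkedHashMap<>();
        procKeyedStates.put("ks1", Map.of(1, "a", 2, "b"));
        procKeyedStates.put("ks2", Map.of(2, 20L, 3, 30L));
        System.out.println(keyedStateTable(procKeyedStates));
    }
}
```

In this model the operator-state table has five rows, matching the figure, and the keyed table has one row for each of the three distinct keys.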

Reading State

Reading state begins by specifying the path to a valid savepoint or checkpoint along with the StateBackend that should be used to restore the data. The compatibility guarantees for restoring state are identical to those when restoring a DataStream application.

    ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();
    ExistingSavepoint savepoint = Savepoint.load(bEnv, "hdfs://path/", new MemoryStateBackend());

Operator State

Operator state is any non-keyed state in Flink. This includes, but is not limited to, any use of CheckpointedFunction or BroadcastState within an application. When reading operator state, users specify the operator uid, the state name, and the type information.

Operator List State

Operator state stored in a CheckpointedFunction using getListState can be read using ExistingSavepoint#readListState. The state name and type information should match those used to define the ListStateDescriptor that declared this state in the DataStream application.

    DataSet<Integer> listState = savepoint.readListState(
        "my-uid",
        "list-state",
        Types.INT);

Operator Union List State

Operator state stored in a CheckpointedFunction using getUnionListState can be read using ExistingSavepoint#readUnionState. The state name and type information should match those used to define the ListStateDescriptor that declared this state in the DataStream application. The framework will return a single copy of the state, equivalent to restoring a DataStream with parallelism 1.

    DataSet<Integer> listState = savepoint.readUnionState(
        "my-uid",
        "union-state",
        Types.INT);
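To make the "single copy" behaviour more concrete, here is a small plain-Java sketch of union redistribution. It is a conceptual model, not Flink code: the class UnionStateModel and its methods are hypothetical. In a running job every parallel task receives the complete list on restore, so a read that behaves like a restore with parallelism 1 sees each entry exactly once.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual model only: no Flink classes are involved, and all names are hypothetical.
final class UnionStateModel {

    /** Concatenates the entries that every task contributed to the snapshot. */
    static <T> List<T> union(List<List<T>> perTaskSnapshots) {
        List<T> all = new ArrayList<>();
        for (List<T> entries : perTaskSnapshots) {
            all.addAll(entries);
        }
        return all;
    }

    /** Union redistribution: on restore, every task receives the complete list. */
    static <T> List<List<T>> restore(List<List<T>> perTaskSnapshots, int parallelism) {
        List<List<T>> perTask = new ArrayList<>();
        for (int task = 0; task < parallelism; task++) {
            perTask.add(union(perTaskSnapshots));
        }
        return perTask;
    }

    /** What a batch read sees in this model: the one copy held by a parallelism-1 restore. */
    static <T> List<T> readOnce(List<List<T>> perTaskSnapshots) {
        return restore(perTaskSnapshots, 1).get(0);
    }

    public static void main(String[] args) {
        List<List<Integer>> snapshot = List.of(List.of(1, 2), List.of(3));
        System.out.println(restore(snapshot, 3)); // three tasks, each holding the full list
        System.out.println(readOnce(snapshot));   // a single copy of the full list
    }
}
```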

Broadcast State

BroadcastState can be read using ExistingSavepoint#readBroadcastState. The state name and type information should match those used to define the MapStateDescriptor that declared this state in the DataStream application. The framework will return a single copy of the state, equivalent to restoring a DataStream with parallelism 1.

    DataSet<Tuple2<Integer, Integer>> broadcastState = savepoint.readBroadcastState(
        "my-uid",
        "broadcast-state",
        Types.INT,
        Types.INT);

Using Custom Serializers

Each of the operator state readers supports using a custom TypeSerializer if one was used to define the StateDescriptor that wrote out the state.

    DataSet<Integer> listState = savepoint.readListState(
        "uid",
        "list-state",
        Types.INT,
        new MyCustomIntSerializer());

Keyed State

Keyed state, or partitioned state, is any state that is partitioned relative to a key. When reading a keyed state, users specify the operator id and a KeyedStateReaderFunction<KeyType, OutputType>.

The KeyedStateReaderFunction allows users to read arbitrary columns and complex state types such as ListState, MapState, and AggregatingState. This means if an operator contains a stateful process function such as:

    public class StatefulFunctionWithTime extends KeyedProcessFunction<Integer, Integer, Void> {

        ValueState<Integer> state;
        ListState<Long> updateTimes;

        @Override
        public void open(Configuration parameters) {
            ValueStateDescriptor<Integer> stateDescriptor = new ValueStateDescriptor<>("state", Types.INT);
            state = getRuntimeContext().getState(stateDescriptor);

            ListStateDescriptor<Long> updateDescriptor = new ListStateDescriptor<>("times", Types.LONG);
            updateTimes = getRuntimeContext().getListState(updateDescriptor);
        }

        @Override
        public void processElement(Integer value, Context ctx, Collector<Void> out) throws Exception {
            state.update(value + 1);
            updateTimes.add(System.currentTimeMillis());
        }
    }

Then it can be read by defining an output type and a corresponding KeyedStateReaderFunction.

    DataSet<KeyedState> keyedState = savepoint.readKeyedState("my-uid", new ReaderFunction());

    public class KeyedState {
        public int key;
        public int value;
        public List<Long> times;
    }

    public class ReaderFunction extends KeyedStateReaderFunction<Integer, KeyedState> {

        ValueState<Integer> state;
        ListState<Long> updateTimes;

        @Override
        public void open(Configuration parameters) {
            ValueStateDescriptor<Integer> stateDescriptor = new ValueStateDescriptor<>("state", Types.INT);
            state = getRuntimeContext().getState(stateDescriptor);

            ListStateDescriptor<Long> updateDescriptor = new ListStateDescriptor<>("times", Types.LONG);
            updateTimes = getRuntimeContext().getListState(updateDescriptor);
        }

        @Override
        public void readKey(
                Integer key,
                Context ctx,
                Collector<KeyedState> out) throws Exception {
            KeyedState data = new KeyedState();
            data.key = key;
            data.value = state.value();
            data.times = StreamSupport
                .stream(updateTimes.get().spliterator(), false)
                .collect(Collectors.toList());
            out.collect(data);
        }
    }

Along with reading registered state values, each key has access to a Context with metadata such as registered event time and processing time timers.

Note: When using a KeyedStateReaderFunction, all state descriptors must be registered eagerly inside of open. Any attempt to call a RuntimeContext#get*State method outside of open, for example inside readKey, will result in a RuntimeException.

Window State

The State Processor API supports reading state from a window operator. When reading window state, users specify the operator id, window assigner, and aggregation type.

Additionally, a WindowReaderFunction can be specified to enrich each read with additional information similar to a WindowFunction or ProcessWindowFunction.

Suppose a DataStream application counts the number of clicks per user per minute.

    class Click {
        public String userId;
        public LocalDateTime time;
    }

    class ClickCounter implements AggregateFunction<Click, Integer, Integer> {

        @Override
        public Integer createAccumulator() {
            return 0;
        }

        @Override
        public Integer add(Click value, Integer accumulator) {
            return 1 + accumulator;
        }

        @Override
        public Integer getResult(Integer accumulator) {
            return accumulator;
        }

        @Override
        public Integer merge(Integer a, Integer b) {
            return a + b;
        }
    }

    DataStream<Click> clicks = . . .

    clicks
        .keyBy(click -> click.userId)
        .window(TumblingEventTimeWindows.of(Time.minutes(1)))
        .aggregate(new ClickCounter())
        .uid("click-window")
        .addSink(new Sink());

This state can be read using the code below.

    class ClickState {
        public String userId;
        public int count;
        public TimeWindow window;
        public Set<Long> triggerTimers;
    }

    class ClickReader extends WindowReaderFunction<Integer, ClickState, String, TimeWindow> {

        @Override
        public void readWindow(String key, Context<TimeWindow> context, Iterable<Integer> elements, Collector<ClickState> out) {
            ClickState state = new ClickState();
            state.userId = key;
            // The aggregated window holds a single element: the click count.
            state.count = elements.iterator().next();
            state.window = context.window();
            state.triggerTimers = context.registeredEventTimeTimers();
            out.collect(state);
        }
    }

    ExecutionEnvironment batchEnv = ExecutionEnvironment.getExecutionEnvironment();
    ExistingSavepoint savepoint = Savepoint.load(batchEnv, "hdfs://checkpoint-dir", new MemoryStateBackend());

    savepoint
        .window(TumblingEventTimeWindows.of(Time.minutes(1)))
        .aggregate("click-window", new ClickCounter(), new ClickReader(), Types.STRING, Types.INT, Types.INT)
        .print();

Additionally, trigger state - from CountTriggers or custom triggers - can be read using the method Context#triggerState inside the WindowReaderFunction.
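To get a feel for what reading this window state yields, the plain-Java sketch below models the state that a one-minute tumbling count keeps: one counter per user and window. It is only an illustration and uses no Flink classes. The class ClickWindowModel, the "userId@windowStart" cell naming, and the use of epoch milliseconds instead of LocalDateTime are assumptions made for the sketch.

```java
import java.util.Map;
import java.util.TreeMap;

// Conceptual model only: no Flink classes are involved, and all names are hypothetical.
final class ClickWindowModel {

    static final long WINDOW_SIZE_MS = 60_000L; // one minute, as in the example job

    /** Start of the zero-offset tumbling window that contains a non-negative timestamp. */
    static long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % WINDOW_SIZE_MS);
    }

    /** Counts clicks per "userId@windowStart" cell, that is, one entry per key and window. */
    static Map<String, Integer> countPerUserAndWindow(String[] userIds, long[] timestampsMs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < userIds.length; i++) {
            String cell = userIds[i] + "@" + windowStart(timestampsMs[i]);
            counts.merge(cell, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] users = {"alice", "alice", "bob", "alice"};
        long[] times = {1_000L, 59_000L, 2_000L, 61_000L};
        // Each map entry corresponds to one (key, window) pair with its aggregated count.
        System.out.println(countPerUserAndWindow(users, times));
    }
}
```

Each entry of the resulting map plays the role of one ClickState record in the example above: a key, a window, and the aggregated count for that pair.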

Writing New Savepoints

Savepoints may also be written, which allows such use cases as bootstrapping state based on historical data. Each savepoint is made up of one or more BootstrapTransformations (explained below), each of which defines the state for an individual operator.

Note: The State Processor API does not currently provide a Scala API. As a result, it will always auto-derive serializers using the Java type stack. To bootstrap a savepoint for the Scala DataStream API, please manually pass in all type information.

    int maxParallelism = 128;

    Savepoint
        .create(new MemoryStateBackend(), maxParallelism)
        .withOperator("uid1", transformation1)
        .withOperator("uid2", transformation2)
        .write(savepointPath);

The UIDs associated with each operator must match one to one with the UIDs assigned to the operators in your DataStream application; this is how Flink knows which state maps to which operator.

Operator State

Simple operator state, using CheckpointedFunction, can be created using the StateBootstrapFunction.

    public class SimpleBootstrapFunction extends StateBootstrapFunction<Integer> {

        private ListState<Integer> state;

        @Override
        public void processElement(Integer value, Context ctx) throws Exception {
            state.add(value);
        }

        @Override
        public void snapshotState(FunctionSnapshotContext context) throws Exception {
        }

        @Override
        public void initializeState(FunctionInitializationContext context) throws Exception {
            state = context.getOperatorStateStore().getListState(new ListStateDescriptor<>("state", Types.INT));
        }
    }

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Integer> data = env.fromElements(1, 2, 3);

    BootstrapTransformation<Integer> transformation = OperatorTransformation
        .bootstrapWith(data)
        .transform(new SimpleBootstrapFunction());

Broadcast State

BroadcastState can be written using a BroadcastStateBootstrapFunction. Similar to broadcast state in the DataStream API, the full state must fit in memory.

    public class CurrencyRate {
        public String currency;
        public Double rate;

        // No-argument constructor so that Flink can treat the class as a POJO.
        public CurrencyRate() {}

        public CurrencyRate(String currency, Double rate) {
            this.currency = currency;
            this.rate = rate;
        }
    }

    public class CurrencyBootstrapFunction extends BroadcastStateBootstrapFunction<CurrencyRate> {

        public static final MapStateDescriptor<String, Double> descriptor =
            new MapStateDescriptor<>("currency-rates", Types.STRING, Types.DOUBLE);

        @Override
        public void processElement(CurrencyRate value, Context ctx) throws Exception {
            ctx.getBroadcastState(descriptor).put(value.currency, value.rate);
        }
    }

    DataSet<CurrencyRate> currencyDataSet = bEnv.fromCollection(
        Arrays.asList(new CurrencyRate("USD", 1.0), new CurrencyRate("EUR", 1.3)));

    BootstrapTransformation<CurrencyRate> broadcastTransformation = OperatorTransformation
        .bootstrapWith(currencyDataSet)
        .transform(new CurrencyBootstrapFunction());

Keyed State

Keyed state for ProcessFunctions and other RichFunction types can be written using a KeyedStateBootstrapFunction.

    public class Account {
        public int id;
        public double amount;
        public long timestamp;
    }

    public class AccountBootstrapper extends KeyedStateBootstrapFunction<Integer, Account> {

        ValueState<Double> state;

        @Override
        public void open(Configuration parameters) {
            ValueStateDescriptor<Double> descriptor = new ValueStateDescriptor<>("total", Types.DOUBLE);
            state = getRuntimeContext().getState(descriptor);
        }

        @Override
        public void processElement(Account value, Context ctx) throws Exception {
            state.update(value.amount);
        }
    }

    ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Account> accountDataSet = bEnv.fromCollection(accounts);

    BootstrapTransformation<Account> transformation = OperatorTransformation
        .bootstrapWith(accountDataSet)
        .keyBy(acc -> acc.id)
        .transform(new AccountBootstrapper());

The KeyedStateBootstrapFunction supports setting event time and processing time timers. The timers will not fire inside the bootstrap function and only become active once restored within a DataStream application. If a processing time timer is set but the state is not restored until after that time has passed, the timer will fire immediately upon start.

Attention: If your bootstrap function creates timers, the state can only be restored using one of the process type functions.
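The rule for processing time timers described above can be pictured with a small plain-Java sketch. It is a conceptual model rather than Flink code: the class RestoredTimerModel and its methods are hypothetical, and treating a timer whose time equals the restore time as already due is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Conceptual model only: no Flink classes are involved, and all names are hypothetical.
final class RestoredTimerModel {

    /** Bootstrapped processing time timers that are already due when the job is restored. */
    static List<Long> dueOnRestore(List<Long> timersMs, long restoreTimeMs) {
        List<Long> due = new ArrayList<>();
        for (long timer : timersMs) {
            if (timer <= restoreTimeMs) { // assumption: "equal" counts as due
                due.add(timer);
            }
        }
        Collections.sort(due);
        return due;
    }

    /** Timers that stay registered and fire later, once their time is reached. */
    static List<Long> pendingAfterRestore(List<Long> timersMs, long restoreTimeMs) {
        List<Long> pending = new ArrayList<>();
        for (long timer : timersMs) {
            if (timer > restoreTimeMs) {
                pending.add(timer);
            }
        }
        Collections.sort(pending);
        return pending;
    }

    public static void main(String[] args) {
        List<Long> timers = List.of(3_000L, 1_000L, 9_000L);
        long restoreTime = 5_000L;
        System.out.println("fire immediately: " + dueOnRestore(timers, restoreTime));
        System.out.println("fire later: " + pendingAfterRestore(timers, restoreTime));
    }
}
```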

Window State

The State Processor API supports writing state for the window operator. When writing window state, users specify the operator id, window assigner, evictor, optional trigger, and aggregation type. It is important that the configuration of the bootstrap transformation matches the configuration of the DataStream window.

    public class Account {
        public int id;
        public double amount;
        public long timestamp;
    }

    ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Account> accountDataSet = bEnv.fromCollection(accounts);

    BootstrapTransformation<Account> transformation = OperatorTransformation
        .bootstrapWith(accountDataSet)
        // When using event time windows, it is important
        // to assign timestamps to each record.
        .assignTimestamps(account -> account.timestamp)
        .keyBy(acc -> acc.id)
        .window(TumblingEventTimeWindows.of(Time.minutes(5)))
        // Merge two accounts of the same key by summing their amounts. This is one
        // possible merge; it should match the reduce logic of the DataStream window.
        .reduce((left, right) -> {
            Account sum = new Account();
            sum.id = left.id;
            sum.amount = left.amount + right.amount;
            sum.timestamp = Math.max(left.timestamp, right.timestamp);
            return sum;
        });

Modifying Savepoints

Besides creating a savepoint from scratch, you can base one on an existing savepoint, for example when bootstrapping a single new operator for an existing job.

    Savepoint
        .load(bEnv, oldPath, new MemoryStateBackend())
        .withOperator("uid", transformation)
        .write(newPath);