Apache NiFi Connector

This connector provides a Source and Sink that can read from and write to Apache NiFi. To use this connector, add the following dependency to your project:

  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-nifi_2.11</artifactId>
    <version>1.14.4</version>
  </dependency>


Note that the streaming connectors are currently not part of the binary distribution. See here for information about how to package the program with the libraries for cluster execution.

Installing Apache NiFi

Instructions for setting up an Apache NiFi cluster can be found here.

Apache NiFi Source

The connector provides a Source for reading data from Apache NiFi to Apache Flink.

The class NiFiSource(…) provides two constructors for reading data from NiFi.

  • NiFiSource(SiteToSiteClientConfig config) - Constructs a NiFiSource(…) given the client’s SiteToSiteClientConfig and a default wait time of 1000 ms.

  • NiFiSource(SiteToSiteClientConfig config, long waitTimeMs) - Constructs a NiFiSource(…) given the client’s SiteToSiteClientConfig and the specified wait time (in milliseconds).
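The text above does not say what the wait time controls. Assuming it is the pause between polling attempts when NiFi has no data available, the behavior can be sketched roughly as follows. This is a stand-alone illustration using only the Java standard library, not the connector's implementation; the class `WaitTimeSketch`, the method `pollWithWait`, and the in-memory queue standing in for NiFi are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class WaitTimeSketch {

    /**
     * Polls the queue up to maxPolls times. When a poll finds nothing,
     * the loop sleeps for waitTimeMs before trying again.
     * Returns the elements collected, in arrival order.
     */
    public static List<String> pollWithWait(Queue<String> pending, long waitTimeMs, int maxPolls) {
        List<String> received = new ArrayList<>();
        for (int i = 0; i < maxPolls; i++) {
            String next = pending.poll();
            if (next == null) {
                try {
                    Thread.sleep(waitTimeMs); // nothing available: back off before the next poll
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                    break;
                }
                continue;
            }
            received.add(next);
        }
        return received;
    }

    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>(); // hypothetical stand-in for data queued in NiFi
        pending.add("a");
        pending.add("b");
        System.out.println(pollWithWait(pending, 10L, 4));
    }
}
```

Under this assumption, a shorter wait time reduces latency when data arrives sporadically, at the cost of more frequent empty polls.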

Example:

Java

  StreamExecutionEnvironment streamExecEnv = StreamExecutionEnvironment.getExecutionEnvironment();
  SiteToSiteClientConfig clientConfig = new SiteToSiteClient.Builder()
      .url("http://localhost:8080/nifi")
      .portName("Data for Flink")
      .requestBatchCount(5)
      .buildConfig();
  SourceFunction<NiFiDataPacket> nifiSource = new NiFiSource(clientConfig);

Scala

  val streamExecEnv = StreamExecutionEnvironment.getExecutionEnvironment()
  val clientConfig: SiteToSiteClientConfig = new SiteToSiteClient.Builder()
      .url("http://localhost:8080/nifi")
      .portName("Data for Flink")
      .requestBatchCount(5)
      .buildConfig()
  val nifiSource = new NiFiSource(clientConfig)

Here, data is read from the Apache NiFi Output Port named “Data for Flink”, which is part of the Apache NiFi Site-to-Site protocol configuration.

Apache NiFi Sink

The connector provides a Sink for writing data from Apache Flink to Apache NiFi.

The class NiFiSink(…) provides a constructor for instantiating a NiFiSink.

  • NiFiSink(SiteToSiteClientConfig, NiFiDataPacketBuilder<T>) - Constructs a NiFiSink(…) given the client’s SiteToSiteClientConfig and a NiFiDataPacketBuilder that converts Flink data into NiFiDataPacket instances to be ingested by NiFi.
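To make the builder's role concrete, here is a minimal, self-contained sketch of the conversion step. The `Packet` and `PacketBuilder` types are hypothetical stand-ins declared locally so the example compiles without Flink or NiFi on the classpath. They only mirror the idea of a content byte array plus string attributes and are not the connector's actual `NiFiDataPacket` and `NiFiDataPacketBuilder` interfaces. The attribute name `source` is likewise an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class PacketBuilderSketch {

    /** Hypothetical stand-in for a NiFi data packet: raw content plus string attributes. */
    public static final class Packet {
        private final byte[] content;
        private final Map<String, String> attributes;

        public Packet(byte[] content, Map<String, String> attributes) {
            this.content = content;
            this.attributes = attributes;
        }

        public byte[] getContent() { return content; }
        public Map<String, String> getAttributes() { return attributes; }
    }

    /** Hypothetical stand-in for the builder: turns one stream element into a packet. */
    public interface PacketBuilder<T> {
        Packet createPacket(T element);
    }

    /** Builds packets from String elements: UTF-8 bytes as content, one illustrative attribute. */
    public static PacketBuilder<String> stringBuilder() {
        return element -> {
            Map<String, String> attributes = new HashMap<>();
            attributes.put("source", "flink"); // illustrative attribute, not required by NiFi
            return new Packet(element.getBytes(StandardCharsets.UTF_8), attributes);
        };
    }

    public static void main(String[] args) {
        Packet packet = stringBuilder().createPacket("hello");
        System.out.println(new String(packet.getContent(), StandardCharsets.UTF_8));
        System.out.println(packet.getAttributes());
    }
}
```

In a real job, the equivalent logic would live in the NiFiDataPacketBuilder passed to the NiFiSink constructor, with one packet produced per stream element.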

Example:

Java

  StreamExecutionEnvironment streamExecEnv = StreamExecutionEnvironment.getExecutionEnvironment();
  SiteToSiteClientConfig clientConfig = new SiteToSiteClient.Builder()
      .url("http://localhost:8080/nifi")
      .portName("Data from Flink")
      .requestBatchCount(5)
      .buildConfig();
  // The builder body is elided; it converts each stream element into a NiFiDataPacket.
  SinkFunction<NiFiDataPacket> nifiSink = new NiFiSink<>(clientConfig, new NiFiDataPacketBuilder<NiFiDataPacket>() {...});
  // addSink is defined on DataStream, not on the environment;
  // `stream` stands for an existing DataStream<NiFiDataPacket> in the job.
  stream.addSink(nifiSink);

Scala

  val streamExecEnv = StreamExecutionEnvironment.getExecutionEnvironment()
  val clientConfig: SiteToSiteClientConfig = new SiteToSiteClient.Builder()
      .url("http://localhost:8080/nifi")
      .portName("Data from Flink")
      .requestBatchCount(5)
      .buildConfig()
  // The builder body is elided; it converts each stream element into a NiFiDataPacket.
  val nifiSink: NiFiSink[NiFiDataPacket] = new NiFiSink[NiFiDataPacket](clientConfig, new NiFiDataPacketBuilder[NiFiDataPacket]() {...})
  // addSink is defined on DataStream, not on the environment;
  // `stream` stands for an existing DataStream[NiFiDataPacket] in the job.
  stream.addSink(nifiSink)

More information about the Apache NiFi Site-to-Site protocol can be found here.