Table API & SQL
Apache Flink features two relational APIs, the Table API and SQL, for unified stream and batch processing. The Table API is a language-integrated query API for Java, Scala, and Python that lets you compose queries from relational operators such as selection, filter, and join. Flink’s SQL support is based on Apache Calcite, which implements the SQL standard. Queries specified in either interface have the same semantics and produce the same result regardless of whether the input is continuous (streaming) or bounded (batch).
The Table API and SQL interfaces integrate seamlessly with each other and with Flink’s DataStream API. You can easily switch between all APIs and the libraries that build upon them. For instance, you can detect patterns from a table using the MATCH_RECOGNIZE clause and later use the DataStream API to build alerting based on the matched patterns.
Table Planners
Table planners are responsible for translating relational operators into an executable, optimized Flink job. Flink supports two planner implementations: the modern Blink planner and the legacy planner. For production use cases, we recommend the Blink planner, which has been the default planner since 1.11. See the common API page for more information on how to switch between the two planners.
Table Program Dependencies
Depending on the target programming language, add the Java or Scala API to your project, or install the Python package, in order to define pipelines with the Table API & SQL.
<!-- Java -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-api-java-bridge_2.11</artifactId>
  <version>1.12.0</version>
  <scope>provided</scope>
</dependency>
<!-- Scala -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-api-scala-bridge_2.11</artifactId>
  <version>1.12.0</version>
  <scope>provided</scope>
</dependency>
$ python -m pip install apache-flink==1.12.0
Additionally, if you want to run Table API & SQL programs locally within your IDE, you must add the following set of modules, depending on which planner you want to use.
<!-- Option 1: Blink planner (default since 1.11) -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-planner-blink_2.11</artifactId>
  <version>1.12.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-scala_2.11</artifactId>
  <version>1.12.0</version>
  <scope>provided</scope>
</dependency>
<!-- Option 2: legacy planner -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-planner_2.11</artifactId>
  <version>1.12.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-scala_2.11</artifactId>
  <version>1.12.0</version>
  <scope>provided</scope>
</dependency>
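As an illustration of how the API and planner modules fit together, the fragment below sketches the relevant part of a pom.xml for a Java program that uses the Blink planner and is run from the IDE. It only combines artifacts already listed on this page. The property names flink.version and scala.binary.version are assumptions chosen for this sketch, not names required by Flink or Maven; they simply keep the version and the Scala suffix in one place.

```xml
<!-- Fragment to place inside the <project> element of a pom.xml. -->
<properties>
  <!-- Assumed property names; any names work if referenced consistently. -->
  <flink.version>1.12.0</flink.version>
  <scala.binary.version>2.11</scala.binary.version>
</properties>

<dependencies>
  <!-- Table API for Java, bridged to the DataStream API -->
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java-bridge_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
  </dependency>
  <!-- Blink planner, needed to run table programs locally -->
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner-blink_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
  </dependency>
  <!-- Listed alongside the planner in the module set above -->
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-scala_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```

For a Scala program, the same sketch would use flink-table-api-scala-bridge in place of the Java bridge. Because the dependencies use the provided scope, they are not bundled into the packaged JAR; whether they end up on the classpath when launching from the IDE depends on the IDE's run configuration.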
Extension Dependencies
If you want to implement a custom format or connector for (de)serializing rows or a set of user-defined functions, the following dependency is sufficient and can be used for JAR files for the SQL Client:
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table-common</artifactId>
  <version>1.12.0</version>
  <scope>provided</scope>
</dependency>
Where to go next?
- Concepts & Common API: Shared concepts and APIs of the Table API and SQL.
- Data Types: Lists pre-defined data types and their properties.
- Streaming Concepts: Streaming-specific documentation for the Table API or SQL, such as configuration of time attributes and handling of updating results.
- Connect to External Systems: Available connectors and formats for reading data from and writing data to external systems.
- Table API: Supported operations and API for the Table API.
- SQL: Supported operations and syntax for SQL.
- Built-in Functions: Supported functions in Table API and SQL.
- SQL Client: Play around with Flink SQL and submit a table program to a cluster without programming knowledge.