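Flink has a monitoring API that can be used to query the status and statistics of running jobs as well as recently completed jobs. This monitoring API is used by Flink's own dashboard, and it can also be used by custom monitoring tools.
The monitoring API is a RESTful API that accepts HTTP requests and responds with JSON data.
Overview
The monitoring API is backed by a web server that runs as part of the JobManager. By default, this server listens on port 8081; the port can be configured through rest.port in flink-conf.yaml. Note that the monitoring API web server and the dashboard web server are currently the same and therefore run together on the same port. They respond to different HTTP URLs, though.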
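As a rough illustration of the addressing described above, the following Python sketch builds a GET request against the monitoring API. The host name `localhost` and the helper names `monitoring_request` and `fetch_json` are assumptions made for this example; only the default port 8081 and the `/overview` path come from this page.

```python
import json
from urllib.request import Request, urlopen

DEFAULT_REST_PORT = 8081  # default named in the text; rest.port overrides it


def monitoring_request(host: str, path: str, port: int = DEFAULT_REST_PORT) -> Request:
    """Build (but do not send) a GET request for one monitoring API path."""
    url = f"http://{host}:{port}{path}"
    return Request(url, method="GET", headers={"Accept": "application/json"})


def fetch_json(request: Request, timeout_s: float = 5.0):
    """Send the request and decode the JSON body; needs a reachable cluster."""
    with urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# "localhost" is a placeholder host, not something the page prescribes.
req = monitoring_request("localhost", "/overview")
print(req.get_method(), req.full_url)  # GET http://localhost:8081/overview
```

Calling `fetch_json(req)` would perform the actual HTTP call; it is kept separate so the URL construction can be checked without a running cluster.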
Returns the status for the delete operation of a cluster data set.
Path parameters
triggerid - 32-character hexadecimal string that identifies an asynchronous operation trigger ID. The ID was returned when the operation was triggered.
/datasets/:datasetid
Verb: DELETE
Response code: 202 Accepted
Triggers the deletion of a cluster data set. This asynchronous operation returns a ‘triggerid’ that can be used to query the status of the operation.
Path parameters
datasetid - 32-character hexadecimal string value that identifies a cluster data set.
/jars
Verb: GET
Response code: 200 OK
Returns a list of all jars previously uploaded via ‘/jars/upload’.
/jars/upload
Verb: POST
Response code: 200 OK
Uploads a jar to the cluster. The jar must be sent as multi-part data. Make sure that the "Content-Type" header is set to "application/x-java-archive", as some HTTP libraries do not add the header by default. Using curl, you can upload a jar via: curl -X POST -H "Expect:" -F "jarfile=@path/to/flink-job.jar" http://hostname:port/jars/upload
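For clients that cannot shell out to curl, a multi-part body can be assembled by hand. The sketch below is one possible way to do it with the Python standard library. The form field name `jarfile` is taken from the curl example above; putting `application/x-java-archive` on the file part is this example's interpretation of the header requirement, and the function name `build_jar_upload` is hypothetical.

```python
import uuid


def build_jar_upload(jar_name: str, jar_bytes: bytes, boundary: str = ""):
    """Return (body, headers) for a multipart/form-data jar upload."""
    boundary = boundary or uuid.uuid4().hex
    # One file part named "jarfile", as in the curl example.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="jarfile"; filename="{jar_name}"\r\n'
        "Content-Type: application/x-java-archive\r\n"
        "\r\n"
    ).encode("utf-8")
    # Closing delimiter that ends the multipart body.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return head + jar_bytes + tail, headers


# Placeholder bytes stand in for the real jar content.
body, headers = build_jar_upload("flink-job.jar", b"PK", boundary="demo")
```

The returned body and headers could then be passed to `urllib.request.Request(url, data=body, headers=headers, method="POST")` with the `/jars/upload` URL of the cluster.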
/jars/:jarid
Verb: DELETE
Response code: 200 OK
Deletes a jar previously uploaded via ‘/jars/upload’.
Path parameters
jarid - String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars (/jars).
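The jarid rule above (the filename of the returned path is the ID) can be sketched as follows. The sample path is invented for illustration and does not reflect an actual Flink upload directory; the helper names are also hypothetical.

```python
import posixpath
from urllib.parse import quote
from urllib.request import Request


def jar_id_from_upload_path(path: str) -> str:
    """Take the filename component of the path returned by the upload call."""
    # Normalise Windows-style separators so basename works for both forms.
    return posixpath.basename(path.replace("\\", "/"))


def delete_jar_request(base_url: str, jar_id: str) -> Request:
    """Build (but do not send) the DELETE request for an uploaded jar."""
    return Request(f"{base_url}/jars/{quote(jar_id, safe='')}", method="DELETE")


# Hypothetical path, used only to show the filename extraction.
jar_id = jar_id_from_upload_path("/tmp/uploads/1234_flink-job.jar")
delete_req = delete_jar_request("http://localhost:8081", jar_id)
print(jar_id)  # 1234_flink-job.jar
```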
/jars/:jarid/plan
Verb: POST
Response code: 200 OK
Returns the dataflow plan of a job contained in a jar previously uploaded via ‘/jars/upload’. Program arguments can be passed either via the JSON request (recommended) or via query parameters.
Path parameters
jarid - String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars (/jars).
Query parameters
program-args (optional): Deprecated; use ‘programArg’ instead. String value that specifies the arguments for the program or plan.
programArg (optional): Comma-separated list of program arguments.
entry-class (optional): String value that specifies the fully qualified name of the entry point class. Overrides the class defined in the jar file manifest.
parallelism (optional): Positive integer value that specifies the desired parallelism for the job.
/jars/:jarid/run
Verb: POST
Response code: 200 OK
Submits a job by running a jar previously uploaded via ‘/jars/upload’. Program arguments can be passed either via the JSON request (recommended) or via query parameters.
Path parameters
jarid - String value that identifies a jar. When uploading the jar a path is returned, where the filename is the ID. This value is equivalent to the id field in the list of uploaded jars (/jars).
Query parameters
allowNonRestoredState (optional): Boolean value that specifies whether the job submission should be rejected if the savepoint contains state that cannot be mapped back to the job.
savepointPath (optional): String value that specifies the path of the savepoint to restore the job from.
program-args (optional): Deprecated; use ‘programArg’ instead. String value that specifies the arguments for the program or plan.
programArg (optional): Comma-separated list of program arguments.
entry-class (optional): String value that specifies the fully qualified name of the entry point class. Overrides the class defined in the jar file manifest.
parallelism (optional): Positive integer value that specifies the desired parallelism for the job.
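To show how the query parameters listed above combine, here is a minimal sketch that assembles a `/jars/:jarid/run` URL. It covers only the query-parameter form, since the JSON request fields are not listed on this page. The function name, the jar ID, and the entry class `com.example.Main` are placeholders.

```python
from urllib.parse import quote, urlencode


def run_jar_url(base_url, jar_id, *, entry_class=None, parallelism=None,
                program_args=None, savepoint_path=None,
                allow_non_restored_state=None):
    """Assemble the run URL from the documented optional query parameters."""
    params = {}
    if entry_class is not None:
        params["entry-class"] = entry_class
    if parallelism is not None:
        if parallelism < 1:
            raise ValueError("parallelism must be a positive integer")
        params["parallelism"] = str(parallelism)
    if program_args:
        # programArg is documented as a comma-separated list.
        params["programArg"] = ",".join(program_args)
    if savepoint_path is not None:
        params["savepointPath"] = savepoint_path
    if allow_non_restored_state is not None:
        params["allowNonRestoredState"] = "true" if allow_non_restored_state else "false"
    url = f"{base_url}/jars/{quote(jar_id, safe='')}/run"
    return f"{url}?{urlencode(params)}" if params else url


run_url = run_jar_url(
    "http://localhost:8081", "abc_job.jar",
    entry_class="com.example.Main", parallelism=4,
    program_args=["--input", "in.txt"],
)
```

The request itself would be sent as a POST. Arguments that contain commas cannot be expressed in the comma-separated form, which may be one reason the JSON request is the recommended route.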
/jobmanager/config
Verb: GET
Response code: 200 OK
Returns the cluster configuration.
/jobmanager/environment
Verb: GET
Response code: 200 OK
Returns the jobmanager environment.
/jobmanager/logs
Verb: GET
Response code: 200 OK
Returns the list of log files on the JobManager.
/jobmanager/metrics
Verb: GET
Response code: 200 OK
Provides access to job manager metrics.
Query parameters
get (optional): Comma-separated list of string values to select specific metrics.
/jobmanager/thread-dump
Verb: GET
Response code: 200 OK
Returns the thread dump of the JobManager.
/jobs
Verb: GET
Response code: 200 OK
Returns an overview of all jobs and their current state.
/jobs
Verb: POST
Response code: 202 Accepted
Submits a job. This call is primarily intended to be used by the Flink client. This call expects a multipart/form-data request that consists of file uploads for the serialized JobGraph, jars and distributed cache artifacts and an attribute named “request” for the JSON payload.
/jobs/metrics
Verb: GET
Response code: 200 OK
Provides access to aggregated job metrics.
Query parameters
get (optional): Comma-separated list of string values to select specific metrics.
agg (optional): Comma-separated list of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”.
jobs (optional): Comma-separated list of 32-character hexadecimal strings to select specific jobs.
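The three comma-separated parameters above can be combined as in the following sketch. The metric name `numRestarts` is used only as an illustrative value, and the function name is hypothetical; the aggregation modes are the four listed above.

```python
from urllib.parse import urlencode

ALLOWED_AGGREGATIONS = {"min", "max", "sum", "avg"}


def jobs_metrics_url(base_url, metrics=(), aggs=(), job_ids=()):
    """Build the aggregated job metrics URL from optional comma-separated lists."""
    unknown = set(aggs) - ALLOWED_AGGREGATIONS
    if unknown:
        raise ValueError(f"unsupported aggregation(s): {sorted(unknown)}")
    params = {}
    if metrics:
        params["get"] = ",".join(metrics)
    if aggs:
        params["agg"] = ",".join(aggs)
    if job_ids:
        params["jobs"] = ",".join(job_ids)
    url = f"{base_url}/jobs/metrics"
    # Keep commas literal so the lists stay readable in the URL.
    return f"{url}?{urlencode(params, safe=',')}" if params else url


metrics_url = jobs_metrics_url(
    "http://localhost:8081", metrics=["numRestarts"], aggs=["min", "max"]
)
print(metrics_url)
# http://localhost:8081/jobs/metrics?get=numRestarts&agg=min,max
```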
/jobs/overview
Verb: GET
Response code: 200 OK
Returns an overview of all jobs.
/jobs/:jobid
Verb: GET
Response code: 200 OK
Returns details of a job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
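Because many path parameters on this page are described as 32-character hexadecimal strings, a client may want to check an ID before building a URL. The following is a small sketch of such a check; the function name `is_hex32` is an assumption, and accepting upper-case digits is a choice of this example rather than something the page states.

```python
import re

# Exactly 32 hexadecimal digits, matching the description of jobid above.
_HEX32 = re.compile(r"[0-9a-fA-F]{32}")


def is_hex32(value: str) -> bool:
    """Return True if value looks like a 32-character hexadecimal ID."""
    return _HEX32.fullmatch(value) is not None


print(is_hex32("0123456789abcdef0123456789abcdef"))  # True
print(is_hex32("not-a-job-id"))                      # False
```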
/jobs/:jobid
Verb: PATCH
Response code: 202 Accepted
Terminates a job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
Query parameters
mode (optional): String value that specifies the termination mode. The only supported value is: “cancel”.
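A cancel call as described above amounts to a PATCH request with `mode=cancel`. The sketch below only constructs the request; the base URL and the function name are placeholders for this example.

```python
from urllib.parse import quote
from urllib.request import Request


def cancel_job_request(base_url: str, job_id: str) -> Request:
    """Build (but do not send) the PATCH request that terminates a job."""
    # "cancel" is the only termination mode listed on this page.
    url = f"{base_url}/jobs/{quote(job_id, safe='')}?mode=cancel"
    return Request(url, method="PATCH")


cancel_req = cancel_job_request("http://localhost:8081", "a" * 32)
print(cancel_req.get_method())  # PATCH
```

Passing `cancel_req` to `urllib.request.urlopen` would send it; per the entry above, the expected response code is 202 Accepted.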
/jobs/:jobid/accumulators
Verb: GET
Response code: 200 OK
Returns the accumulators for all tasks of a job, aggregated across the respective subtasks.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
Query parameters
includeSerializedValue (optional): Boolean value that specifies whether serialized user task accumulators should be included in the response.
/jobs/:jobid/checkpoints
Verb: GET
Response code: 200 OK
Returns checkpointing statistics for a job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/checkpoints
Verb: POST
Response code: 202 Accepted
Triggers a checkpoint. This asynchronous operation returns a ‘triggerid’ that can be used to query the status of the operation.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/checkpoints/config
Verb: GET
Response code: 200 OK
Returns the checkpointing configuration.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/checkpoints/details/:checkpointid
Verb: GET
Response code: 200 OK
Returns details for a checkpoint.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
checkpointid - Long value that identifies a checkpoint.
Returns checkpoint statistics for a task and its subtasks.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
checkpointid - Long value that identifies a checkpoint.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
/jobs/:jobid/checkpoints/:triggerid
Verb: GET
Response code: 200 OK
Returns the status of a checkpoint trigger operation.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
triggerid - 32-character hexadecimal string that identifies an asynchronous operation trigger ID. The ID was returned when the operation was triggered.
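The trigger-then-query pattern used by the asynchronous endpoints can be sketched as a polling loop. This page does not show the response schema, so the loop below assumes the status body carries a `status.id` field that eventually reads `COMPLETED`; treat that shape as an assumption to verify against a real cluster. The `fetch_status` callable and the function name are hypothetical.

```python
import time
from typing import Callable


def wait_for_trigger(fetch_status: Callable[[], dict], attempts: int = 10,
                     delay_s: float = 1.0, sleep=time.sleep) -> dict:
    """Poll a status endpoint until the operation reports completion."""
    for _ in range(attempts):
        body = fetch_status()
        # Assumed response shape: {"status": {"id": "COMPLETED"}, ...}
        if body.get("status", {}).get("id") == "COMPLETED":
            return body
        sleep(delay_s)
    raise TimeoutError("operation did not complete within the allowed attempts")


# Canned responses stand in for GET /jobs/:jobid/checkpoints/:triggerid.
canned = iter([
    {"status": {"id": "IN_PROGRESS"}},
    {"status": {"id": "COMPLETED"}, "operation": {}},
])
result = wait_for_trigger(lambda: next(canned), sleep=lambda s: None)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any HTTP library and makes it easy to exercise without a cluster.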
/jobs/:jobid/clientHeartbeat
Verb: PATCH
Response code: 202 Accepted
Reports that the job client is still alive.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/config
Verb: GET
Response code: 200 OK
Returns the configuration of a job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/exceptions
Verb: GET
Response code: 200 OK
Returns the most recent exceptions that have been handled by Flink for this job. The ‘exceptionHistory.truncated’ flag defines whether exceptions were filtered out through the GET parameter. The backend collects only a specific amount of most recent exceptions per job. This can be configured through web.exception-history-size in the Flink configuration. The following first-level members are deprecated: ‘root-exception’, ‘timestamp’, ‘all-exceptions’, and ‘truncated’. Use the data provided through ‘exceptionHistory’, instead.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
Query parameters
maxExceptions (optional): Comma-separated list of integer values that specifies the upper limit of exceptions to return.
failureLabelFilter (optional): Collection of string values working as a filter in the form of key:value pairs allowing only exceptions with ALL of the specified failure labels to be returned.
/jobs/:jobid/execution-result
Verb: GET
Response code: 200 OK
Returns the result of a job execution. Gives access to the execution time of the job and to all accumulators created by this job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/jobmanager/config
Verb: GET
Response code: 200 OK
Returns the jobmanager’s configuration of a specific job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/jobmanager/environment
Verb: GET
Response code: 200 OK
Returns the jobmanager’s environment of a specific job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/jobmanager/log-url
Verb: GET
Response code: 200 OK
Returns the log URL of the JobManager for a specific job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/metrics
Verb: GET
Response code: 200 OK
Provides access to job metrics.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
Query parameters
get (optional): Comma-separated list of string values to select specific metrics.
/jobs/:jobid/plan
Verb: GET
Response code: 200 OK
Returns the dataflow plan of a job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/rescaling
Verb: PATCH
Response code: 200 OK
Triggers the rescaling of a job. This asynchronous operation returns a ‘triggerid’ that can be used to query the status of the operation.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
Query parameters
parallelism (mandatory): Positive integer value that specifies the desired parallelism.
/jobs/:jobid/rescaling/:triggerid
Verb: GET
Response code: 200 OK
Returns the status of a rescaling operation.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
triggerid - 32-character hexadecimal string that identifies an asynchronous operation trigger ID. The ID was returned when the operation was triggered.
/jobs/:jobid/resource-requirements
Verb: GET
Response code: 200 OK
Returns details on the job’s resource requirements.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/resource-requirements
Verb: PUT
Response code: 200 OK
Updates the job’s resource requirements.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/savepoints
Verb: POST
Response code: 202 Accepted
Triggers a savepoint, and optionally cancels the job afterwards. This asynchronous operation returns a ‘triggerid’ that can be used to query the status of the operation.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/savepoints/:triggerid
Verb: GET
Response code: 200 OK
Returns the status of a savepoint operation.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
triggerid - 32-character hexadecimal string that identifies an asynchronous operation trigger ID. The ID was returned when the operation was triggered.
/jobs/:jobid/status
Verb: GET
Response code: 200 OK
Returns the current status of a job execution.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/stop
Verb: POST
Response code: 202 Accepted
Stops a job with a savepoint. Optionally, it can also emit a MAX_WATERMARK before taking the savepoint to flush out any state waiting for timers to fire. This asynchronous operation returns a ‘triggerid’ that can be used to query the status of the operation.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
/jobs/:jobid/taskmanagers/:taskmanagerid/log-url
Verb: GET
Response code: 200 OK
Returns the log URL of the TaskManager for a specific job.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
taskmanagerid - 32-character hexadecimal string that identifies a task manager.
/jobs/:jobid/vertices/:vertexid
Verb: GET
Response code: 200 OK
Returns details for a task, with a summary for each of its subtasks.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
/jobs/:jobid/vertices/:vertexid/accumulators
Verb: GET
Response code: 200 OK
Returns user-defined accumulators of a task, aggregated across all subtasks.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
/jobs/:jobid/vertices/:vertexid/backpressure
Verb: GET
Response code: 200 OK
Returns back-pressure information for a job, and may initiate back-pressure sampling if necessary.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
/jobs/:jobid/vertices/:vertexid/flamegraph
Verb: GET
Response code: 200 OK
Returns flame graph information for a vertex, and may initiate flame graph sampling if necessary.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
Query parameters
type (optional): String value that specifies the Flame Graph type. Supported options are: “[FULL, ON_CPU, OFF_CPU]”.
subtaskindex (optional): Positive integer value that identifies a subtask.
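The flame graph query parameters can be validated on the client side before a request is made. The sketch below checks `type` against the three options listed above and passes `subtaskindex` through unchanged; the function name and the sample IDs are placeholders.

```python
from urllib.parse import urlencode

FLAME_GRAPH_TYPES = ("FULL", "ON_CPU", "OFF_CPU")


def flamegraph_url(base_url, job_id, vertex_id, graph_type=None, subtask_index=None):
    """Build the flame graph URL for one job vertex."""
    params = {}
    if graph_type is not None:
        if graph_type not in FLAME_GRAPH_TYPES:
            raise ValueError(f"type must be one of {FLAME_GRAPH_TYPES}")
        params["type"] = graph_type
    if subtask_index is not None:
        params["subtaskindex"] = str(subtask_index)
    url = f"{base_url}/jobs/{job_id}/vertices/{vertex_id}/flamegraph"
    return f"{url}?{urlencode(params)}" if params else url


# Placeholder IDs: 32 hexadecimal characters each.
fg_url = flamegraph_url("http://localhost:8081", "a" * 32, "b" * 32,
                        graph_type="ON_CPU", subtask_index=1)
```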
/jobs/:jobid/vertices/:vertexid/metrics
Verb: GET
Response code: 200 OK
Provides access to task metrics.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
Query parameters
get (optional): Comma-separated list of string values to select specific metrics.
jobid - 32-character hexadecimal string value that identifies a job.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
subtaskindex - Positive integer value that identifies a subtask.
Query parameters
get (optional): Comma-separated list of string values to select specific metrics.
/jobs/:jobid/vertices/:vertexid/subtasktimes
Verb: GET
Response code: 200 OK
Returns time-related information for all subtasks of a task.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
/jobs/:jobid/vertices/:vertexid/taskmanagers
Verb: GET
Response code: 200 OK
Returns task information aggregated by task manager.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
/jobs/:jobid/vertices/:vertexid/watermarks
Verb: GET
Response code: 200 OK
Returns the watermarks for all subtasks of a task.
Path parameters
jobid - 32-character hexadecimal string value that identifies a job.
vertexid - 32-character hexadecimal string value that identifies a job vertex.
/overview
Verb: GET
Response code: 200 OK
Returns an overview of the Flink cluster.
/savepoint-disposal
Verb: POST
Response code: 200 OK
Triggers the disposal of a savepoint. This asynchronous operation returns a ‘triggerid’ that can be used to query the status of the operation.
/savepoint-disposal/:triggerid
Verb: GET
Response code: 200 OK
Returns the status of a savepoint disposal operation.
Path parameters
triggerid - 32-character hexadecimal string that identifies an asynchronous operation trigger ID. The ID was returned when the operation was triggered.
/taskmanagers
Verb: GET
Response code: 200 OK
Returns an overview of all task managers.
/taskmanagers/metrics
Verb: GET
Response code: 200 OK
Provides access to aggregated task manager metrics.
Query parameters
get (optional): Comma-separated list of string values to select specific metrics.
agg (optional): Comma-separated list of aggregation modes which should be calculated. Available aggregations are: “min, max, sum, avg”.
taskmanagers (optional): Comma-separated list of 32-character hexadecimal strings to select specific task managers.
/taskmanagers/:taskmanagerid
Verb: GET
Response code: 200 OK
Returns details for a task manager. “metrics.memorySegmentsAvailable” and “metrics.memorySegmentsTotal” are deprecated. Please use “metrics.nettyShuffleMemorySegmentsAvailable” and “metrics.nettyShuffleMemorySegmentsTotal” instead.
Path parameters
taskmanagerid - 32-character hexadecimal string that identifies a task manager.
/taskmanagers/:taskmanagerid/logs
Verb: GET
Response code: 200 OK
Returns the list of log files on a TaskManager.
Path parameters
taskmanagerid - 32-character hexadecimal string that identifies a task manager.
/taskmanagers/:taskmanagerid/metrics
Verb: GET
Response code: 200 OK
Provides access to task manager metrics.
Path parameters
taskmanagerid - 32-character hexadecimal string that identifies a task manager.
Query parameters
get (optional): Comma-separated list of string values to select specific metrics.
/taskmanagers/:taskmanagerid/thread-dump
Verb: GET
Response code: 200 OK
Returns the thread dump of the requested TaskManager.
Path parameters
taskmanagerid - 32-character hexadecimal string that identifies a task manager.