CAT segment replication

Introduced 2.7

The CAT segment replication operation returns information about active and last completed segment replication events on each replica shard, including related shard-level metrics. These metrics show how far each replica shard lags behind the primary shard.

Call the CAT Segment Replication API only on indexes with segment replication enabled.

Path and HTTP methods

  GET /_cat/segment_replication
  GET /_cat/segment_replication/<index>

Path parameters

Parameter | Type | Description
--- | --- | ---
index | String | The name of the index, or a comma-separated list or wildcard expression of index names used to filter results. If this parameter is not provided, the response contains information about all indexes in the cluster.

Query parameters

Parameter | Data type | Description
--- | --- | ---
active_only | Boolean | If true, the response only includes active segment replications. Defaults to false.
detailed | Boolean | If true, the response includes additional metrics for each stage of a segment replication event. Defaults to false.
shards | String | A comma-separated list of shards to display.
bytes | Byte units | Units used to display byte size values.
format | String | A short version of the HTTP accept header. Valid values include JSON and YAML.
h | String | A comma-separated list of column names to display.
help | Boolean | If true, the response includes help information. Defaults to false.
time | Time units | Units used to display time values.
v | Boolean | If true, the response includes column headings. Defaults to false.
s | String | Specifies how to sort the results. For example, s=shardId:desc sorts by shardId in descending order.
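
As a rough illustration of how the path and query parameters combine into a request, the following Python sketch builds a request URL. The helper name `build_segment_replication_url` and the cluster address `http://localhost:9200` are assumptions made for this example; they are not part of the API.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def build_segment_replication_url(
    base_url: str,
    indexes: Optional[Sequence[str]] = None,
    **params: object,
) -> str:
    """Build a CAT segment replication URL (hypothetical helper)."""
    path = "/_cat/segment_replication"
    if indexes:
        # Multiple index names are passed as a comma-separated list.
        path += "/" + ",".join(indexes)
    # Render Python booleans as lowercase true/false and skip unset values.
    query = {
        key: str(value).lower() if isinstance(value, bool) else str(value)
        for key, value in params.items()
        if value is not None
    }
    url = base_url.rstrip("/") + path
    if query:
        # Keep ":" and "," readable in values such as s=shardId:desc.
        url += "?" + urlencode(query, safe=":,")
    return url


# Assumed local cluster address, for illustration only.
print(build_segment_replication_url(
    "http://localhost:9200", ["index-1", "index-2"], v=True, shards=0
))
```

The sketch only assembles a string; sending the request, authentication, and TLS settings are left to whichever HTTP client you use.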

Example requests

The following examples illustrate various segment replication responses.

No active segment replication events

The following query requests segment replication metrics with column headings for all indexes:

  GET /_cat/segment_replication?v=true


The response contains the metrics for the preceding request:

  shardId      target_node target_host checkpoints_behind bytes_behind current_lag last_completed_lag rejected_requests
  [index-1][0] runTask-1   127.0.0.1   0                  0b           0s          7ms                0
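
If you need to process the plain-text response in a script, one possible approach is to split it on white space and pair each value with its column heading. The following Python sketch assumes that the request used `v=true` so that the first line contains the headings and that no cell contains spaces. The function name `parse_cat_table` is hypothetical.

```python
from typing import Dict, List


def parse_cat_table(text: str) -> List[Dict[str, str]]:
    """Parse whitespace-separated CAT output that includes a heading row."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    headings = lines[0].split()
    # Assumes no cell contains spaces, which holds for the sample below.
    return [dict(zip(headings, line.split())) for line in lines[1:]]


# Sample text copied from the response shown above.
sample = """\
shardId target_node target_host checkpoints_behind bytes_behind current_lag last_completed_lag rejected_requests
[index-1][0] runTask-1 127.0.0.1 0 0b 0s 7ms 0
"""

rows = parse_cat_table(sample)
print(rows[0]["shardId"], rows[0]["last_completed_lag"])
```

Requesting structured output through the `format` query parameter is likely to be more robust than splitting text, so treat this as a quick sketch for ad hoc inspection.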

Shard ID specified

The following query requests segment replication metrics with column headings for shards with the ID 0 from indexes index-1 and index-2:

  GET /_cat/segment_replication/index-1,index-2?v=true&shards=0


The response contains the metrics for the preceding request. The column headings correspond to the metric names:

  shardId      target_node target_host checkpoints_behind bytes_behind current_lag last_completed_lag rejected_requests
  [index-1][0] runTask-1   127.0.0.1   0                  0b           0s          3ms                0
  [index-2][0] runTask-1   127.0.0.1   0                  0b           0s          5ms                0

Detailed response

The following query requests detailed segment replication metrics with column headings for all indexes:

  GET /_cat/segment_replication?v=true&detailed=true


The response contains additional metrics about the files and stages of a segment replication event:

  shardId target_node target_host checkpoints_behind bytes_behind current_lag last_completed_lag rejected_requests stage time files_fetched files_percent bytes_fetched bytes_percent start_time stop_time files files_total bytes bytes_total replicating_stage_time_taken get_checkpoint_info_stage_time_taken file_diff_stage_time_taken get_files_stage_time_taken finalize_replication_stage_time_taken
  [index-1][0] runTask-1 127.0.0.1 0 0b 0s 3ms 0 done 10ms 6 100.0% 4753 100.0% 2023-03-16T13:46:16.802Z 2023-03-16T13:46:16.812Z 6 6 4.6kb 4.6kb 0s 2ms 0s 3ms 3ms
  [index-2][0] runTask-1 127.0.0.1 0 0b 0s 5ms 0 done 7ms 3 100.0% 3664 100.0% 2023-03-16T13:53:33.466Z 2023-03-16T13:53:33.474Z 3 3 3.5kb 3.5kb 0s 1ms 0s 2ms 2ms

Sorting the results

The following query requests segment replication metrics with column headings for all indexes, sorted by shard ID in descending order:

  GET /_cat/segment_replication?v&s=shardId:desc


The response contains the sorted results:

  shardId    target_node target_host checkpoints_behind bytes_behind current_lag last_completed_lag rejected_requests
  [test6][1] runTask-2   127.0.0.1   0                  0b           0s          5ms                0
  [test6][0] runTask-2   127.0.0.1   0                  0b           0s          4ms                0

Using a metric alias

In a request, you can use either a metric’s full name or one of its aliases. The following query is equivalent to the preceding query but uses the alias s instead of shardId for sorting:

  GET /_cat/segment_replication?v&s=s:desc


Example response metrics

The following table lists the response metrics that are returned for all requests. When referring to a metric in a query parameter, you can provide either the metric’s full name or any of its aliases, as shown in the previous example.

Metric | Alias | Description
--- | --- | ---
shardId | s | The ID of a specific shard.
target_host | thost | The target host IP address.
target_node | tnode | The target node name.
checkpoints_behind | cpb | The number of checkpoints by which the replica shard is behind the primary shard.
bytes_behind | bb | The number of bytes by which the replica shard is behind the primary shard.
current_lag | clag | The time elapsed while waiting for a replica shard to catch up to the primary shard.
last_completed_lag | lcl | The time taken for a replica shard to catch up to the latest primary shard refresh.
rejected_requests | rr | The number of rejected requests for the replication group.
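
The API accepts aliases directly, so you never need to translate them yourself. Still, expanding aliases on the client side can make saved queries easier to read. The following Python sketch maps the aliases listed above back to full metric names in a comma-separated column list (as used by `h`) or a sort specification with an optional `:order` suffix (as used by `s`). The names `ALIASES` and `expand_aliases` are hypothetical.

```python
# Aliases copied from the metrics table above.
ALIASES = {
    "s": "shardId",
    "thost": "target_host",
    "tnode": "target_node",
    "cpb": "checkpoints_behind",
    "bb": "bytes_behind",
    "clag": "current_lag",
    "lcl": "last_completed_lag",
    "rr": "rejected_requests",
}


def expand_aliases(spec: str) -> str:
    """Replace metric aliases in a spec such as "s:desc" with full names."""
    expanded = []
    for part in spec.split(","):
        # Split off an optional ":asc" or ":desc" suffix and keep it as is.
        name, sep, order = part.partition(":")
        expanded.append(ALIASES.get(name, name) + sep + order)
    return ",".join(expanded)


print(expand_aliases("s:desc"))
print(expand_aliases("s,cpb,clag"))
```

Names that are not in the mapping, including full metric names, pass through unchanged.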

Additional detailed response metrics

The following table lists the additional response fields returned if detailed is set to true.

Metric | Alias | Description
--- | --- | ---
stage | st | The current stage of a segment replication event.
time | t, ti | The amount of time a segment replication event took to complete, in milliseconds.
files_fetched | ff | The number of files fetched so far for a segment replication event.
files_percent | fp | The percentage of files fetched so far for a segment replication event.
bytes_fetched | bf | The number of bytes fetched so far for a segment replication event.
bytes_percent | bp | The percentage of bytes fetched so far for a segment replication event.
start_time | start | The segment replication start time.
stop_time | stop | The segment replication stop time.
files | f | The number of files that need to be fetched for a segment replication event.
files_total | tf | The total number of files that are part of this recovery, including both reused and recovered files.
bytes | b | The number of bytes that need to be fetched for a segment replication event.
bytes_total | tb | The total number of bytes in the shard.
replicating_stage_time_taken | rstt | The amount of time the replicating stage of a segment replication event took to complete.
get_checkpoint_info_stage_time_taken | gcistt | The amount of time the get checkpoint info stage of a segment replication event took to complete.
file_diff_stage_time_taken | fdstt | The amount of time the file diff stage of a segment replication event took to complete.
get_files_stage_time_taken | gfstt | The amount of time the get files stage of a segment replication event took to complete.
finalize_replication_stage_time_taken | frstt | The amount of time the finalize replication stage of a segment replication event took to complete.
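
To compare the per-stage timings with the overall `time` value, you can convert the duration strings to a common unit and add them up. The following Python sketch does this for the first row of the detailed example response. The unit suffixes it recognizes (`nanos`, `micros`, `ms`, `s`, `m`, `h`, `d`) are an assumption about how durations are rendered, and the names `to_millis` and `total_stage_millis` are hypothetical.

```python
import re
from typing import Mapping

# Multipliers to milliseconds for the unit suffixes this sketch recognizes.
_UNIT_TO_MS = {
    "nanos": 1e-6,
    "micros": 1e-3,
    "ms": 1.0,
    "s": 1_000.0,
    "m": 60_000.0,
    "h": 3_600_000.0,
    "d": 86_400_000.0,
}
_DURATION = re.compile(r"^(\d+(?:\.\d+)?)(nanos|micros|ms|s|m|h|d)$")

STAGE_COLUMNS = (
    "replicating_stage_time_taken",
    "get_checkpoint_info_stage_time_taken",
    "file_diff_stage_time_taken",
    "get_files_stage_time_taken",
    "finalize_replication_stage_time_taken",
)


def to_millis(value: str) -> float:
    """Convert a duration string such as "3ms" or "0s" to milliseconds."""
    match = _DURATION.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNIT_TO_MS[unit]


def total_stage_millis(row: Mapping[str, str]) -> float:
    """Sum the per-stage timings of one parsed response row."""
    return sum(to_millis(row[column]) for column in STAGE_COLUMNS)


# Stage timings copied from the [index-1][0] row of the detailed example.
row = {
    "replicating_stage_time_taken": "0s",
    "get_checkpoint_info_stage_time_taken": "2ms",
    "file_diff_stage_time_taken": "0s",
    "get_files_stage_time_taken": "3ms",
    "finalize_replication_stage_time_taken": "3ms",
}
print(total_stage_millis(row))  # 8.0
```

The sum of the stage timings need not match `time` exactly. For this row, the stages add up to 8 ms, while the reported `time` is 10ms.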