Reindex document

Introduced 1.0

The reindex document API operation lets you copy all or a subset of your data from a source index into a destination index.

Example

  POST /_reindex
  {
    "source": {
      "index": "my-source-index"
    },
    "dest": {
      "index": "my-destination-index"
    }
  }
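
The same request can be issued from any HTTP client. The following is a minimal Python sketch that uses only the standard library. The helper names `build_reindex_request` and `send`, and the base URL `http://localhost:9200`, are assumptions for illustration; a secured cluster would also need TLS and credentials, which are not shown.

```python
import json
import urllib.request


def build_reindex_request(base_url, source_index, dest_index):
    """Build (but do not send) a POST /_reindex request."""
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Assumed local, unsecured cluster; adjust the URL for your environment.
request = build_reindex_request(
    "http://localhost:9200", "my-source-index", "my-destination-index"
)
# result = send(request)  # uncomment to run against a live cluster
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything is copied.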

Path and HTTP methods

  POST /_reindex

URL parameters

All URL parameters are optional.

Parameter | Type | Description
--- | --- | ---
refresh | Boolean | If true, OpenSearch refreshes the affected shards so that the results of the reindex operation are visible to search. Valid options are true, false, and wait_for, which tells OpenSearch to wait for a refresh before executing the operation. Default is false.
timeout | Time | How long to wait for a response from the cluster. Default is 30s.
wait_for_active_shards | String | The number of active shards that must be available before OpenSearch processes the reindex request. Default is 1 (only the primary shard). Set to all or a positive integer. Values greater than 1 require replicas. For example, if you specify a value of 3, the index must have two replicas distributed across two additional nodes for the operation to succeed.
wait_for_completion | Boolean | If true, the request blocks until the reindex operation completes. Default is true.
requests_per_second | Integer | Throttles the operation to the specified number of sub-requests per second. Default is -1, which means no throttling.
require_alias | Boolean | Whether the destination index must be an index alias. Default is false.
scroll | Time | How long to keep the search context open. Default is 5m.
slices | Integer | The number of sub-tasks OpenSearch should divide this task into. Default is 1, which means OpenSearch does not divide the task. Set this parameter to auto to let OpenSearch decide how many slices to split the task into.
max_docs | Integer | The maximum number of documents the reindex operation should process. Default is all documents.
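
The parameters above are appended to the path as a query string. As a rough sketch, a small helper can assemble that path; the function name `reindex_path` is hypothetical, and the parameter values are arbitrary examples rather than recommendations.

```python
from urllib.parse import urlencode


def reindex_path(**params):
    """Return the /_reindex path with optional URL parameters appended."""
    query = {}
    for name, value in params.items():
        if value is None:
            continue  # omitted parameters fall back to the server default
        if isinstance(value, bool):
            value = "true" if value else "false"  # lowercase, JSON-style booleans
        query[name] = value
    return "/_reindex" + ("?" + urlencode(query) if query else "")


path = reindex_path(wait_for_completion=False, slices="auto", requests_per_second=500)
print(path)
# prints: /_reindex?wait_for_completion=false&slices=auto&requests_per_second=500
```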

Request body

Your request body must contain the names of the source index and destination index. All other fields are optional.

Nested fields are listed in dot notation. For example, source.index is the index field inside the source object.

Field | Description
--- | ---
conflicts | What OpenSearch should do if the reindex operation runs into a version conflict. Valid options are abort and proceed. Default is abort.
source | Information about the source index to include. Valid fields are index, max_docs, query, remote, size, slice, and _source.
source.index | The name of the source index to copy data from.
source.max_docs | The maximum number of documents to reindex.
source.query | The search query to use for the reindex operation.
source.remote | Information about a remote OpenSearch cluster to copy data from. Valid fields are host, username, password, socket_timeout, and connect_timeout.
source.remote.host | The host URL of the OpenSearch cluster to copy data from.
source.remote.username | The username to authenticate with the remote cluster.
source.remote.password | The password to authenticate with the remote cluster.
source.remote.socket_timeout | The wait time for socket reads. Default is 30s.
source.remote.connect_timeout | The wait time for remote connection timeouts. Default is 30s.
source.size | The number of documents to reindex.
source.slice | Whether to manually or automatically slice the reindex operation so it executes in parallel. Setting this field to auto allows OpenSearch to control the number of slices to use, which is one slice per shard, up to a maximum of 20. If there are multiple sources, the number of slices is based on the index or backing index with the smallest number of shards.
source.slice.id | The ID to associate with manual slicing.
source.slice.max | The maximum number of slices.
source._source | Whether to reindex source fields. Specify a list of fields to reindex or true to reindex all fields. Default is true.
dest | Information about the destination index. Valid fields are index, version_type, op_type, and pipeline.
dest.index | The name of the destination index.
dest.version_type | The indexing operation's version type. Valid values are internal, external, external_gt (retrieve the document if the specified version number is greater than the document's current version), and external_gte (retrieve the document if the specified version number is greater than or equal to the document's current version).
dest.op_type | Whether to copy over documents that are missing in the destination index. Valid values are create (ignore documents with the same ID from the source index) and index (copy everything from the source index).
dest.pipeline | The ingest pipeline to use during the reindex operation.
script | A script that OpenSearch uses to apply transformations to the data during the reindex operation.
script.source | The actual script that OpenSearch runs.
script.lang | The scripting language. Valid options are painless, expression, mustache, and java.
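
To illustrate how the optional fields combine, the following sketch assembles a request body that filters the source with a query, copies only documents missing from the destination, proceeds past version conflicts, and applies a script. The helper name `build_reindex_body`, the `status` and `reindexed` field names, and the script itself are hypothetical and stand in for your own data.

```python
import json


def build_reindex_body(source_index, dest_index, query=None, fields=None,
                       op_type=None, pipeline=None, conflicts=None,
                       script_source=None):
    """Assemble a reindex request body, including only the options that are set."""
    source = {"index": source_index}
    if query is not None:
        source["query"] = query
    if fields is not None:
        source["_source"] = fields  # list of fields to copy

    dest = {"index": dest_index}
    if op_type is not None:
        dest["op_type"] = op_type
    if pipeline is not None:
        dest["pipeline"] = pipeline

    body = {"source": source, "dest": dest}
    if conflicts is not None:
        body["conflicts"] = conflicts
    if script_source is not None:
        body["script"] = {"lang": "painless", "source": script_source}
    return body


body = build_reindex_body(
    "my-source-index",
    "my-destination-index",
    query={"term": {"status": "published"}},       # hypothetical field and value
    op_type="create",                              # skip IDs already in the destination
    conflicts="proceed",                           # do not abort on version conflicts
    script_source="ctx._source.reindexed = true",  # hypothetical transformation
)
print(json.dumps(body, indent=2))
```

Leaving unset options out of the body, rather than sending explicit defaults, keeps the request close to the minimal form shown in the example at the top of this page.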

Response

  {
    "took": 28829,
    "timed_out": false,
    "total": 111396,
    "updated": 0,
    "created": 111396,
    "deleted": 0,
    "batches": 112,
    "version_conflicts": 0,
    "noops": 0,
    "retries": {
      "bulk": 0,
      "search": 0
    },
    "throttled_millis": 0,
    "requests_per_second": -1.0,
    "throttled_until_millis": 0,
    "failures": []
  }

Response body fields

Field | Description
--- | ---
took | How long the operation took in milliseconds.
timed_out | Whether the operation timed out.
total | The total number of documents processed.
updated | The number of documents updated in the destination index.
created | The number of documents created in the destination index.
deleted | The number of documents deleted.
batches | The number of scroll responses.
version_conflicts | The number of version conflicts.
noops | How many documents OpenSearch ignored during the operation.
retries | The number of bulk and search retry requests.
throttled_millis | The number of throttled milliseconds during the request.
requests_per_second | The number of requests executed per second during the operation.
throttled_until_millis | The amount of time until OpenSearch executes the next throttled request.
failures | Any failures that occurred during the operation.
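
A client will typically check a few of these fields after the operation finishes. The following is a minimal sketch of such a check, applied to the example response shown earlier; the function name `summarize_reindex` and the choice of which fields to report are assumptions, not part of the API.

```python
def summarize_reindex(response):
    """Reduce a reindex response to a few fields worth checking or logging."""
    took_seconds = response["took"] / 1000  # took is reported in milliseconds
    written = response["created"] + response["updated"]
    retries = response["retries"]
    return {
        # Treat a timeout or any reported failure as an unsuccessful run.
        "ok": not response["timed_out"] and not response["failures"],
        "docs_written": written,
        "docs_per_second": written / took_seconds if took_seconds > 0 else None,
        "version_conflicts": response["version_conflicts"],
        "retries": retries["bulk"] + retries["search"],
    }


example_response = {
    "took": 28829, "timed_out": False, "total": 111396, "updated": 0,
    "created": 111396, "deleted": 0, "batches": 112, "version_conflicts": 0,
    "noops": 0, "retries": {"bulk": 0, "search": 0}, "throttled_millis": 0,
    "requests_per_second": -1.0, "throttled_until_millis": 0, "failures": [],
}
summary = summarize_reindex(example_response)
print(summary["ok"], summary["docs_written"])
```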