Cross-cluster search

You can use cross-cluster search (CCS) in OpenSearch to search and analyze data across multiple clusters, enabling you to gain insights from distributed data sources. Cross-cluster search is available by default with the Security plugin, but you need to configure each cluster to allow remote connections from other clusters. This involves setting up remote cluster connections and configuring access permissions.



Authentication flow

The following sequence describes the authentication flow when using cross-cluster search to access a remote cluster from a coordinating cluster. You can have different authentication and authorization configurations on the remote and coordinating clusters, but we recommend using the same settings on both.

  1. The Security plugin authenticates the user on the coordinating cluster.
  2. The Security plugin fetches the user’s backend roles on the coordinating cluster.
  3. The call, including the authenticated user, is forwarded to the remote cluster.
  4. The user’s permissions are evaluated on the remote cluster.

Setting permissions

To query indexes on remote clusters, users must have READ or SEARCH permissions. Furthermore, when the search request includes the query parameter ccs_minimize_roundtrips=false—which tells OpenSearch not to minimize outgoing and incoming requests to remote clusters—users need to have the following additional index permission:

  1. indices:admin/shards/search_shards

For more information about the ccs_minimize_roundtrips parameter, see the list of parameters for the Search API.

Example roles.yml configuration

  1. humanresources:
  2. cluster:
  3. - CLUSTER_COMPOSITE_OPS_RO
  4. indices:
  5. 'humanresources':
  6. '*':
  7. - READ
  8. - indices:admin/shards/search_shards # needed when the search request includes parameter setting 'ccs_minimize_roundtrips=false'.

Example role in OpenSearch Dashboards

OpenSearch Dashboards UI for creating a cross-cluster search role

Sample Docker setup

To define Docker permissions, save the following sample file as docker-compose.yml and run docker-compose up to start two single-node clusters on the same network:

  1. version: '3'
  2. services:
  3. opensearch-ccs-node1:
  4. image: opensearchproject/opensearch:2.18.0
  5. container_name: opensearch-ccs-node1
  6. environment:
  7. - cluster.name=opensearch-ccs-cluster1
  8. - discovery.type=single-node
  9. - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
  10. - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
  11. - "OPENSEARCH_INITIAL_ADMIN_PASSWORD=<custom-admin-password>" # The initial admin password used by the demo configuration
  12. ulimits:
  13. memlock:
  14. soft: -1
  15. hard: -1
  16. volumes:
  17. - opensearch-data1:/usr/share/opensearch/data
  18. ports:
  19. - 9200:9200
  20. - 9600:9600 # required for Performance Analyzer
  21. networks:
  22. - opensearch-net
  23. opensearch-ccs-node2:
  24. image: opensearchproject/opensearch:2.18.0
  25. container_name: opensearch-ccs-node2
  26. environment:
  27. - cluster.name=opensearch-ccs-cluster2
  28. - discovery.type=single-node
  29. - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
  30. - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
  31. - "OPENSEARCH_INITIAL_ADMIN_PASSWORD=<custom-admin-password>" # The initial admin password used by the demo configuration
  32. ulimits:
  33. memlock:
  34. soft: -1
  35. hard: -1
  36. volumes:
  37. - opensearch-data2:/usr/share/opensearch/data
  38. ports:
  39. - 9250:9200
  40. - 9700:9600 # required for Performance Analyzer
  41. networks:
  42. - opensearch-net
  43. volumes:
  44. opensearch-data1:
  45. opensearch-data2:
  46. networks:
  47. opensearch-net:

After the clusters start, verify the names of each cluster using the following commands:

  1. curl -XGET -u 'admin:<custom-admin-password>' -k 'https://localhost:9200'
  2. {
  3. "cluster_name" : "opensearch-ccs-cluster1",
  4. ...
  5. }
  6. curl -XGET -u 'admin:<custom-admin-password>' -k 'https://localhost:9250'
  7. {
  8. "cluster_name" : "opensearch-ccs-cluster2",
  9. ...
  10. }

Both clusters run on localhost, so the important identifier is the port number. In this case, use port 9200 (opensearch-ccs-node1) as the remote cluster, and port 9250 (opensearch-ccs-node2) as the coordinating cluster.

To get the IP address for the remote cluster, first identify its container ID:

  1. docker ps
  2. CONTAINER ID IMAGE PORTS NAMES
  3. 6fe89ebc5a8e opensearchproject/opensearch:2.18.0 0.0.0.0:9200->9200/tcp, 0.0.0.0:9600->9600/tcp, 9300/tcp opensearch-ccs-node1
  4. 2da08b6c54d8 opensearchproject/opensearch:2.18.0 9300/tcp, 0.0.0.0:9250->9200/tcp, 0.0.0.0:9700->9600/tcp opensearch-ccs-node2

Then get that container’s IP address:

  1. docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' 6fe89ebc5a8e
  2. 172.31.0.3

On the coordinating cluster, add the remote cluster name and the IP address (with port 9300) for each “seed node.” In this case, you only have one seed node:

  1. curl -k -XPUT -H 'Content-Type: application/json' -u 'admin:<custom-admin-password>' 'https://localhost:9250/_cluster/settings' -d '
  2. {
  3. "persistent": {
  4. "cluster.remote": {
  5. "opensearch-ccs-cluster1": {
  6. "seeds": ["172.31.0.3:9300"]
  7. }
  8. }
  9. }
  10. }'

All of the cURL requests can also be sent using OpenSearch Dashboards Dev Tools.

The following image shows an example of a cURL request using Dev Tools. OpenSearch Dashboards UI for configuring remote cluster for Cross-cluster search

On the remote cluster, index a document:

  1. curl -XPUT -k -H 'Content-Type: application/json' -u 'admin:<custom-admin-password>' 'https://localhost:9200/books/_doc/1' -d '{"Dracula": "Bram Stoker"}'

At this point, cross-cluster search works. You can test it using the admin user:

  1. curl -XGET -k -u 'admin:<custom-admin-password>' 'https://localhost:9250/opensearch-ccs-cluster1:books/_search?pretty'
  2. {
  3. ...
  4. "hits": [{
  5. "_index": "opensearch-ccs-cluster1:books",
  6. "_id": "1",
  7. "_score": 1.0,
  8. "_source": {
  9. "Dracula": "Bram Stoker"
  10. }
  11. }]
  12. }

To continue testing, create a new user on both clusters:

  1. curl -XPUT -k -u 'admin:<custom-admin-password>' 'https://localhost:9200/_plugins/_security/api/internalusers/booksuser' -H 'Content-Type: application/json' -d '{"password":"password"}'
  2. curl -XPUT -k -u 'admin:<custom-admin-password>' 'https://localhost:9250/_plugins/_security/api/internalusers/booksuser' -H 'Content-Type: application/json' -d '{"password":"password"}'

Then run the same search as before with booksuser:

  1. curl -XGET -k -u booksuser:password 'https://localhost:9250/opensearch-ccs-cluster1:books/_search?pretty'
  2. {
  3. "error" : {
  4. "root_cause" : [
  5. {
  6. "type" : "security_exception",
  7. "reason" : "no permissions for [indices:admin/shards/search_shards, indices:data/read/search] and User [name=booksuser, roles=[], requestedTenant=null]"
  8. }
  9. ],
  10. "type" : "security_exception",
  11. "reason" : "no permissions for [indices:admin/shards/search_shards, indices:data/read/search] and User [name=booksuser, roles=[], requestedTenant=null]"
  12. },
  13. "status" : 403
  14. }

Note the permissions error. On the remote cluster, create a role with the appropriate permissions, and map booksuser to that role:

  1. curl -XPUT -k -u 'admin:<custom-admin-password>' -H 'Content-Type: application/json' 'https://localhost:9200/_plugins/_security/api/roles/booksrole' -d '{"index_permissions":[{"index_patterns":["books"],"allowed_actions":["indices:admin/shards/search_shards","indices:data/read/search"]}]}'
  2. curl -XPUT -k -u 'admin:<custom-admin-password>' -H 'Content-Type: application/json' 'https://localhost:9200/_plugins/_security/api/rolesmapping/booksrole' -d '{"users" : ["booksuser"]}'

Both clusters must have the user role, but only the remote cluster needs both the role and mapping. In this case, the coordinating cluster handles authentication (that is, “Does this request include valid user credentials?”), and the remote cluster handles authorization (that is, “Can this user access this data?”).

Finally, repeat the search:

  1. curl -XGET -k -u booksuser:password 'https://localhost:9250/opensearch-ccs-cluster1:books/_search?pretty'
  2. {
  3. ...
  4. "hits": [{
  5. "_index": "opensearch-ccs-cluster1:books",
  6. "_id": "1",
  7. "_score": 1.0,
  8. "_source": {
  9. "Dracula": "Bram Stoker"
  10. }
  11. }]
  12. }

Sample bare metal/virtual machine setup

If you are running OpenSearch on a bare metal server or using a virtual machine, you can run the same commands, specifying the IP (or domain) of the OpenSearch cluster. For example, in order to configure a remote cluster for cross-cluster search, find the IP of the remote node or domain of the remote cluster and run the following command:

  1. curl -k -XPUT -H 'Content-Type: application/json' -u 'admin:<custom-admin-password>' 'https://opensearch-domain-1:9200/_cluster/settings' -d '
  2. {
  3. "persistent": {
  4. "cluster.remote": {
  5. "opensearch-ccs-cluster2": {
  6. "seeds": ["opensearch-domain-2:9300"]
  7. }
  8. }
  9. }
  10. }'

It is sufficient to point to only one of the node IPs on the remote cluster because all nodes in the cluster will be queried as part of the node discovery process.

You can now run queries across both clusters:

  1. curl -XGET -k -u 'admin:<custom-admin-password>' 'https://opensearch-domain-1:9200/opensearch-ccs-cluster2:books/_search?pretty'
  2. {
  3. ...
  4. "hits": [{
  5. "_index": "opensearch-ccs-cluster2:books",
  6. "_id": "1",
  7. "_score": 1.0,
  8. "_source": {
  9. "Dracula": "Bram Stoker"
  10. }
  11. }]
  12. }

Sample Kubernetes/Helm setup

If you are using Kubernetes clusters to deploy OpenSearch, you need to configure the remote cluster using either the LoadBalancer or Ingress. The Kubernetes services created using the following Helm example are of the ClusterIP type and are only accessible from within the cluster; therefore, you must use an externally accessible endpoint:

  1. curl -k -XPUT -H 'Content-Type: application/json' -u 'admin:<custom-admin-password>' 'https://opensearch-domain-1:9200/_cluster/settings' -d '
  2. {
  3. "persistent": {
  4. "cluster.remote": {
  5. "opensearch-ccs-cluster2": {
  6. "seeds": ["ingress:9300"]
  7. }
  8. }
  9. }
  10. }'

Proxy settings

You can configure cross-cluster search on a cluster running behind a proxy. There are many ways to configure a reverse proxy and various proxies to choose from. The following example demonstrates the basic NGINX reverse proxy configuration without TLS termination, though there are many proxies and reverse proxies to choose from. For this example to work, OpenSearch must have both transport and HTTP TLS encryption enabled. For more information about configuring TLS encryption, see Configuring TLS certificates.

Prerequisites

To use proxy mode, fulfill the following prerequisites:

  • Make sure that the source cluster’s nodes are able to connect to the configured proxy_address.
  • Make sure that the proxy can route connections to the remote cluster nodes.

Proxy configuration

The following is the basic NGINX configuration for HTTP and transport communication:

  1. stream {
  2. upstream opensearch-transport {
  3. server <opensearch>:9300;
  4. }
  5. upstream opensearch-http {
  6. server <opensearch>:9200;
  7. }
  8. server {
  9. listen 8300;
  10. ssl_certificate /.../2.18.0/config/esnode.pem;
  11. ssl_certificate_key /.../2.18.0/config/esnode-key.pem;
  12. ssl_trusted_certificate /.../2.18.0/config/root-ca.pem;
  13. proxy_pass opensearch-transport;
  14. ssl_preread on;
  15. }
  16. server {
  17. listen 443;
  18. listen [::]:443;
  19. ssl_certificate /.../2.18.0/config/esnode.pem;
  20. ssl_certificate_key /.../2.18.0/config/esnode-key.pem;
  21. ssl_trusted_certificate /.../2.18.0/config/root-ca.pem;
  22. proxy_pass opensearch-http;
  23. ssl_preread on;
  24. }
  25. }

The listening ports for HTTP and transport communication are set to 443 and 8300, respectively.

OpenSearch configuration

The remote cluster can be configured to point to the proxy by using the following command:

  1. curl -k -XPUT -H 'Content-Type: application/json' -u 'admin:<custom-admin-password>' 'https://opensearch:9200/_cluster/settings' -d '
  2. {
  3. "persistent": {
  4. "cluster.remote": {
  5. "opensearch-remote-cluster": {
  6. "mode": "proxy",
  7. "proxy_address": "<remote-cluster-proxy>:8300"
  8. }
  9. }
  10. }
  11. }'

Note the previously configured port 8300 in the Proxy configuration section.