Set up a data stream

To set up a data stream, follow these steps:

  1. Create an index lifecycle policy
  2. Create component templates
  3. Create an index template
  4. Create the data stream
  5. Secure the data stream

You can also convert an index alias to a data stream.

If you use Fleet, Elastic Agent, or Logstash, skip this tutorial. They all set up data streams for you.

For Fleet and Elastic Agent, see their data streams documentation. For Logstash, see the data stream settings for the elasticsearch output plugin.

Create an index lifecycle policy

Although optional, index lifecycle management (ILM) is the recommended way to automate the management of your data stream’s backing indices. ILM requires an index lifecycle policy.

To create an index lifecycle policy in Kibana, open the main menu and go to Stack Management > Index Lifecycle Policies. Click Create policy.

You can also use the create lifecycle policy API.

    resp = client.ilm.put_lifecycle(
        name="my-lifecycle-policy",
        policy={
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_primary_shard_size": "50gb"
                        }
                    }
                },
                "warm": {
                    "min_age": "30d",
                    "actions": {
                        "shrink": {
                            "number_of_shards": 1
                        },
                        "forcemerge": {
                            "max_num_segments": 1
                        }
                    }
                },
                "cold": {
                    "min_age": "60d",
                    "actions": {
                        "searchable_snapshot": {
                            "snapshot_repository": "found-snapshots"
                        }
                    }
                },
                "frozen": {
                    "min_age": "90d",
                    "actions": {
                        "searchable_snapshot": {
                            "snapshot_repository": "found-snapshots"
                        }
                    }
                },
                "delete": {
                    "min_age": "735d",
                    "actions": {
                        "delete": {}
                    }
                }
            }
        },
    )
    print(resp)

    const response = await client.ilm.putLifecycle({
      name: "my-lifecycle-policy",
      policy: {
        phases: {
          hot: {
            actions: {
              rollover: {
                max_primary_shard_size: "50gb",
              },
            },
          },
          warm: {
            min_age: "30d",
            actions: {
              shrink: {
                number_of_shards: 1,
              },
              forcemerge: {
                max_num_segments: 1,
              },
            },
          },
          cold: {
            min_age: "60d",
            actions: {
              searchable_snapshot: {
                snapshot_repository: "found-snapshots",
              },
            },
          },
          frozen: {
            min_age: "90d",
            actions: {
              searchable_snapshot: {
                snapshot_repository: "found-snapshots",
              },
            },
          },
          delete: {
            min_age: "735d",
            actions: {
              delete: {},
            },
          },
        },
      },
    });
    console.log(response);

    PUT _ilm/policy/my-lifecycle-policy
    {
      "policy": {
        "phases": {
          "hot": {
            "actions": {
              "rollover": {
                "max_primary_shard_size": "50gb"
              }
            }
          },
          "warm": {
            "min_age": "30d",
            "actions": {
              "shrink": {
                "number_of_shards": 1
              },
              "forcemerge": {
                "max_num_segments": 1
              }
            }
          },
          "cold": {
            "min_age": "60d",
            "actions": {
              "searchable_snapshot": {
                "snapshot_repository": "found-snapshots"
              }
            }
          },
          "frozen": {
            "min_age": "90d",
            "actions": {
              "searchable_snapshot": {
                "snapshot_repository": "found-snapshots"
              }
            }
          },
          "delete": {
            "min_age": "735d",
            "actions": {
              "delete": {}
            }
          }
        }
      }
    }
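
The policy above moves data through its phases in order of increasing min_age. Before sending such a policy, it can help to check locally that no later phase has a lower min_age than an earlier one. The following is a minimal sketch in plain Python, not how Elasticsearch itself validates policies. The helper names min_age_seconds and check_phase_ages are hypothetical, only the d, h, m, and s units are handled, and phases without a min_age are skipped.

```python
import re

# Order in which ILM runs phases; phases absent from a policy are skipped.
PHASE_ORDER = ["hot", "warm", "cold", "frozen", "delete"]

# Seconds per unit for the subset of time units this sketch understands.
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def min_age_seconds(value):
    """Convert a min_age string such as "30d" to seconds."""
    match = re.fullmatch(r"(\d+)([dhms])", value)
    if match is None:
        raise ValueError(f"unsupported min_age value: {value!r}")
    amount, unit = match.groups()
    return int(amount) * UNIT_SECONDS[unit]


def check_phase_ages(policy):
    """Return one message per phase that starts before an earlier phase."""
    problems = []
    previous_name, previous_age = None, 0
    for name in PHASE_ORDER:
        phase = policy["phases"].get(name)
        if phase is None or "min_age" not in phase:
            continue  # assumption of this sketch: ignore phases without min_age
        age = min_age_seconds(phase["min_age"])
        if age < previous_age:
            problems.append(
                f"{name} ({phase['min_age']}) starts before {previous_name}"
            )
        previous_name, previous_age = name, age
    return problems


# Same phase timings as the example policy; actions are omitted for brevity.
policy = {
    "phases": {
        "hot": {},
        "warm": {"min_age": "30d"},
        "cold": {"min_age": "60d"},
        "frozen": {"min_age": "90d"},
        "delete": {"min_age": "735d"},
    }
}
print(check_phase_ages(policy))  # prints []
```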

Create component templates

A data stream requires a matching index template. In most cases, you compose this index template using one or more component templates. You typically use separate component templates for mappings and index settings. This lets you reuse the component templates in multiple index templates.

When creating your component templates, include:

  • A date or date_nanos mapping for the @timestamp field. If you don’t specify a mapping, Elasticsearch maps @timestamp as a date field with default options.
  • Your lifecycle policy in the index.lifecycle.name index setting.
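
As an illustration of the two items above, the following sketch inspects a set of component template request bodies and reports which of the two is missing. It is plain Python that only looks at dictionaries; the function name missing_requirements is hypothetical and nothing here calls Elasticsearch.

```python
def missing_requirements(component_templates):
    """Return the recommended items that no component template provides.

    component_templates maps a template name to its request body.
    """
    has_timestamp = False
    has_lifecycle = False
    for body in component_templates.values():
        template = body.get("template", {})
        properties = template.get("mappings", {}).get("properties", {})
        if properties.get("@timestamp", {}).get("type") in ("date", "date_nanos"):
            has_timestamp = True
        settings = template.get("settings", {})
        # Settings may be written flat ("index.lifecycle.name") or nested.
        nested = settings.get("index", {}).get("lifecycle", {}).get("name")
        if settings.get("index.lifecycle.name") or nested:
            has_lifecycle = True
    missing = []
    if not has_timestamp:
        missing.append("@timestamp mapping of type date or date_nanos")
    if not has_lifecycle:
        missing.append("index.lifecycle.name setting")
    return missing


templates = {
    "my-mappings": {
        "template": {"mappings": {"properties": {"@timestamp": {"type": "date"}}}}
    },
    "my-settings": {
        "template": {"settings": {"index.lifecycle.name": "my-lifecycle-policy"}}
    },
}
print(missing_requirements(templates))  # prints []
```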

Use the Elastic Common Schema (ECS) when mapping your fields. ECS fields integrate with several Elastic Stack features by default.

If you’re unsure how to map your fields, use runtime fields to extract fields from unstructured content at search time. For example, you can index a log message to a wildcard field and later extract IP addresses and other data from this field during a search.
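
To make the runtime field approach concrete, the sketch below builds a search request body that defines a runtime field at search time and filters on it. It assumes the log line is stored in a wildcard field named message in Apache common log format. The runtime field name client_ip and the helper client_ip_search are hypothetical, and the Painless script is an untested illustration based on the built-in COMMONAPACHELOG grok pattern.

```python
import json

# Painless source for the runtime field: apply the COMMONAPACHELOG grok
# pattern to the "message" field and emit the client IP when it matches.
EXTRACT_CLIENT_IP = (
    "String clientip = grok('%{COMMONAPACHELOG}')"
    ".extract(doc['message'].value)?.clientip; "
    "if (clientip != null) emit(clientip);"
)


def client_ip_search(ip):
    """Build a search body that matches documents whose extracted IP equals ip."""
    return {
        "runtime_mappings": {
            "client_ip": {"type": "ip", "script": {"source": EXTRACT_CLIENT_IP}}
        },
        "query": {"term": {"client_ip": ip}},
        "fields": ["client_ip"],
    }


print(json.dumps(client_ip_search("192.0.2.42"), indent=2))
```

The resulting dictionary could be sent as the body of a search request against the data stream. Because the field is computed at search time, the mapping of message does not need to change.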

To create a component template in Kibana, open the main menu and go to Stack Management > Index Management. In the Index Templates view, click Create component template.

You can also use the create component template API.

    resp = client.cluster.put_component_template(
        name="my-mappings",
        template={
            "mappings": {
                "properties": {
                    "@timestamp": {
                        "type": "date",
                        "format": "date_optional_time||epoch_millis"
                    },
                    "message": {
                        "type": "wildcard"
                    }
                }
            }
        },
        meta={
            "description": "Mappings for @timestamp and message fields",
            "my-custom-meta-field": "More arbitrary metadata"
        },
    )
    print(resp)

    resp1 = client.cluster.put_component_template(
        name="my-settings",
        template={
            "settings": {
                "index.lifecycle.name": "my-lifecycle-policy"
            }
        },
        meta={
            "description": "Settings for ILM",
            "my-custom-meta-field": "More arbitrary metadata"
        },
    )
    print(resp1)

    response = client.cluster.put_component_template(
      name: 'my-mappings',
      body: {
        template: {
          mappings: {
            properties: {
              "@timestamp": {
                type: 'date',
                format: 'date_optional_time||epoch_millis'
              },
              message: {
                type: 'wildcard'
              }
            }
          }
        },
        _meta: {
          description: 'Mappings for @timestamp and message fields',
          "my-custom-meta-field": 'More arbitrary metadata'
        }
      }
    )
    puts response

    response = client.cluster.put_component_template(
      name: 'my-settings',
      body: {
        template: {
          settings: {
            'index.lifecycle.name' => 'my-lifecycle-policy'
          }
        },
        _meta: {
          description: 'Settings for ILM',
          "my-custom-meta-field": 'More arbitrary metadata'
        }
      }
    )
    puts response

    const response = await client.cluster.putComponentTemplate({
      name: "my-mappings",
      template: {
        mappings: {
          properties: {
            "@timestamp": {
              type: "date",
              format: "date_optional_time||epoch_millis",
            },
            message: {
              type: "wildcard",
            },
          },
        },
      },
      _meta: {
        description: "Mappings for @timestamp and message fields",
        "my-custom-meta-field": "More arbitrary metadata",
      },
    });
    console.log(response);

    const response1 = await client.cluster.putComponentTemplate({
      name: "my-settings",
      template: {
        settings: {
          "index.lifecycle.name": "my-lifecycle-policy",
        },
      },
      _meta: {
        description: "Settings for ILM",
        "my-custom-meta-field": "More arbitrary metadata",
      },
    });
    console.log(response1);

    # Creates a component template for mappings
    PUT _component_template/my-mappings
    {
      "template": {
        "mappings": {
          "properties": {
            "@timestamp": {
              "type": "date",
              "format": "date_optional_time||epoch_millis"
            },
            "message": {
              "type": "wildcard"
            }
          }
        }
      },
      "_meta": {
        "description": "Mappings for @timestamp and message fields",
        "my-custom-meta-field": "More arbitrary metadata"
      }
    }

    # Creates a component template for index settings
    PUT _component_template/my-settings
    {
      "template": {
        "settings": {
          "index.lifecycle.name": "my-lifecycle-policy"
        }
      },
      "_meta": {
        "description": "Settings for ILM",
        "my-custom-meta-field": "More arbitrary metadata"
      }
    }

Create an index template

Use your component templates to create an index template. Specify:

  • One or more index patterns that match the data stream’s name. We recommend using our data stream naming scheme.
  • That the template is data stream enabled.
  • Any component templates that contain your mappings and index settings.
  • A priority higher than 200 to avoid collisions with built-in templates. See Avoid index pattern collisions.
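
The requirements listed above can be checked locally before the template is created. The following is a minimal sketch in plain Python under the assumption that index patterns use only the * wildcard. The helper names matches_pattern and template_problems are hypothetical.

```python
import re


def matches_pattern(name, pattern):
    """Return True if name matches an index pattern that uses only '*' wildcards."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


def template_problems(template, stream_name):
    """Return a list of reasons the template may not work for stream_name."""
    problems = []
    if "data_stream" not in template:
        problems.append("template is not data stream enabled")
    patterns = template.get("index_patterns", [])
    if not any(matches_pattern(stream_name, p) for p in patterns):
        problems.append(f"no index pattern matches {stream_name!r}")
    if template.get("priority", 0) <= 200:
        problems.append("priority should be higher than 200")
    return problems


template = {
    "index_patterns": ["my-data-stream*"],
    "data_stream": {},
    "composed_of": ["my-mappings", "my-settings"],
    "priority": 500,
}
print(template_problems(template, "my-data-stream"))  # prints []
```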

To create an index template in Kibana, open the main menu and go to Stack Management > Index Management. In the Index Templates view, click Create template.

You can also use the create index template API. Include the data_stream object to enable data streams.

    resp = client.indices.put_index_template(
        name="my-index-template",
        index_patterns=[
            "my-data-stream*"
        ],
        data_stream={},
        composed_of=[
            "my-mappings",
            "my-settings"
        ],
        priority=500,
        meta={
            "description": "Template for my time series data",
            "my-custom-meta-field": "More arbitrary metadata"
        },
    )
    print(resp)

    response = client.indices.put_index_template(
      name: 'my-index-template',
      body: {
        index_patterns: [
          'my-data-stream*'
        ],
        data_stream: {},
        composed_of: [
          'my-mappings',
          'my-settings'
        ],
        priority: 500,
        _meta: {
          description: 'Template for my time series data',
          "my-custom-meta-field": 'More arbitrary metadata'
        }
      }
    )
    puts response

    const response = await client.indices.putIndexTemplate({
      name: "my-index-template",
      index_patterns: ["my-data-stream*"],
      data_stream: {},
      composed_of: ["my-mappings", "my-settings"],
      priority: 500,
      _meta: {
        description: "Template for my time series data",
        "my-custom-meta-field": "More arbitrary metadata",
      },
    });
    console.log(response);

    PUT _index_template/my-index-template
    {
      "index_patterns": ["my-data-stream*"],
      "data_stream": { },
      "composed_of": [ "my-mappings", "my-settings" ],
      "priority": 500,
      "_meta": {
        "description": "Template for my time series data",
        "my-custom-meta-field": "More arbitrary metadata"
      }
    }

Create the data stream

Indexing requests add documents to a data stream. These requests must use an op_type of create. Documents must include a @timestamp field.

To automatically create your data stream, submit an indexing request that targets the stream’s name. This name must match one of your index template’s index patterns.

    resp = client.bulk(
        index="my-data-stream",
        operations=[
            {
                "create": {}
            },
            {
                "@timestamp": "2099-05-06T16:21:15.000Z",
                "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736"
            },
            {
                "create": {}
            },
            {
                "@timestamp": "2099-05-06T16:25:42.000Z",
                "message": "192.0.2.255 - - [06/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638"
            }
        ],
    )
    print(resp)

    resp1 = client.index(
        index="my-data-stream",
        document={
            "@timestamp": "2099-05-06T16:21:15.000Z",
            "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736"
        },
    )
    print(resp1)

    response = client.bulk(
      index: 'my-data-stream',
      body: [
        {
          create: {}
        },
        {
          "@timestamp": '2099-05-06T16:21:15.000Z',
          message: '192.0.2.42 - - [06/May/2099:16:21:15 +0000] "GET /images/bg.jpg HTTP/1.0" 200 24736'
        },
        {
          create: {}
        },
        {
          "@timestamp": '2099-05-06T16:25:42.000Z',
          message: '192.0.2.255 - - [06/May/2099:16:25:42 +0000] "GET /favicon.ico HTTP/1.0" 200 3638'
        }
      ]
    )
    puts response

    response = client.index(
      index: 'my-data-stream',
      body: {
        "@timestamp": '2099-05-06T16:21:15.000Z',
        message: '192.0.2.42 - - [06/May/2099:16:21:15 +0000] "GET /images/bg.jpg HTTP/1.0" 200 24736'
      }
    )
    puts response

    const response = await client.bulk({
      index: "my-data-stream",
      operations: [
        {
          create: {},
        },
        {
          "@timestamp": "2099-05-06T16:21:15.000Z",
          message:
            '192.0.2.42 - - [06/May/2099:16:21:15 +0000] "GET /images/bg.jpg HTTP/1.0" 200 24736',
        },
        {
          create: {},
        },
        {
          "@timestamp": "2099-05-06T16:25:42.000Z",
          message:
            '192.0.2.255 - - [06/May/2099:16:25:42 +0000] "GET /favicon.ico HTTP/1.0" 200 3638',
        },
      ],
    });
    console.log(response);

    const response1 = await client.index({
      index: "my-data-stream",
      document: {
        "@timestamp": "2099-05-06T16:21:15.000Z",
        message:
          '192.0.2.42 - - [06/May/2099:16:21:15 +0000] "GET /images/bg.jpg HTTP/1.0" 200 24736',
      },
    });
    console.log(response1);

    PUT my-data-stream/_bulk
    { "create":{ } }
    { "@timestamp": "2099-05-06T16:21:15.000Z", "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736" }
    { "create":{ } }
    { "@timestamp": "2099-05-06T16:25:42.000Z", "message": "192.0.2.255 - - [06/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638" }

    POST my-data-stream/_doc
    {
      "@timestamp": "2099-05-06T16:21:15.000Z",
      "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736"
    }
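
When documents come from application code, the two rules stated earlier (use the create action, include @timestamp) can be enforced while the bulk body is built. The sketch below produces the newline-delimited JSON that a bulk request expects. The helper name bulk_create_body is hypothetical and the message values are shortened placeholders.

```python
import json


def bulk_create_body(documents):
    """Serialize documents as a bulk body that uses the create action for each."""
    lines = []
    for doc in documents:
        if "@timestamp" not in doc:
            raise ValueError("every document needs an @timestamp field")
        lines.append(json.dumps({"create": {}}))  # action line
        lines.append(json.dumps(doc))             # source line
    # A bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


body = bulk_create_body(
    [
        {"@timestamp": "2099-05-06T16:21:15.000Z", "message": "first event"},
        {"@timestamp": "2099-05-06T16:25:42.000Z", "message": "second event"},
    ]
)
print(body, end="")
```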

You can also manually create the stream using the create data stream API. The stream’s name must still match one of your template’s index patterns.

    resp = client.indices.create_data_stream(
        name="my-data-stream",
    )
    print(resp)

    response = client.indices.create_data_stream(
      name: 'my-data-stream'
    )
    puts response

    const response = await client.indices.createDataStream({
      name: "my-data-stream",
    });
    console.log(response);

    PUT _data_stream/my-data-stream

Secure the data stream

Use index privileges to control access to a data stream. Granting privileges on a data stream grants the same privileges on its backing indices.

For an example, see Data stream privileges.
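
As a rough illustration, a role that grants read access to a data stream might have a body like the one built below. This is a sketch, not a recommended security configuration: the helper name read_only_role is hypothetical, and the privileges you grant should follow your own access requirements.

```python
import json


def read_only_role(stream_names):
    """Build a role body that grants the read index privilege on the given streams."""
    return {
        "indices": [
            {
                "names": list(stream_names),
                # Privileges granted on a data stream also apply to its backing indices.
                "privileges": ["read"],
            }
        ]
    }


print(json.dumps(read_only_role(["my-data-stream"]), indent=2))
```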

Convert an index alias to a data stream

Prior to Elasticsearch 7.9, you’d typically use an index alias with a write index to manage time series data. Data streams replace this functionality, require less maintenance, and automatically integrate with data tiers.

To convert an index alias with a write index to a data stream with the same name, use the migrate to data stream API. During conversion, the alias’s indices become hidden backing indices for the stream. The alias’s write index becomes the stream’s write index. The stream still requires a matching index template with data stream enabled.

    resp = client.indices.migrate_to_data_stream(
        name="my-time-series-data",
    )
    print(resp)

    const response = await client.indices.migrateToDataStream({
      name: "my-time-series-data",
    });
    console.log(response);

    POST _data_stream/_migrate/my-time-series-data
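
Because the conversion applies to an alias with a write index, it can be useful to confirm that one exists before migrating. The sketch below inspects a dictionary shaped like a get alias API response. The helper name find_write_index and the index names in the sample are hypothetical.

```python
def find_write_index(alias_response, alias):
    """Return the name of the alias's write index, or None if it has none."""
    for index_name, details in alias_response.items():
        alias_info = details.get("aliases", {}).get(alias)
        if alias_info is not None and alias_info.get("is_write_index"):
            return index_name
    return None


# Sample data with made-up index names, shaped like a get alias response.
sample = {
    "my-time-series-data-000001": {"aliases": {"my-time-series-data": {}}},
    "my-time-series-data-000002": {
        "aliases": {"my-time-series-data": {"is_write_index": True}}
    },
}
print(find_write_index(sample, "my-time-series-data"))
# prints my-time-series-data-000002
```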

Get information about a data stream

To get information about a data stream in Kibana, open the main menu and go to Stack Management > Index Management. In the Data Streams view, click the data stream’s name.

You can also use the get data stream API.

    resp = client.indices.get_data_stream(
        name="my-data-stream",
    )
    print(resp)

    response = client.indices.get_data_stream(
      name: 'my-data-stream'
    )
    puts response

    const response = await client.indices.getDataStream({
      name: "my-data-stream",
    });
    console.log(response);

    GET _data_stream/my-data-stream
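
The response lists each matching stream together with its backing indices. The sketch below reduces such a response to a short summary, treating the last entry in indices as the current write index. The helper name summarize_streams is hypothetical, and the sample response is made-up data trimmed to the fields the sketch reads.

```python
def summarize_streams(response):
    """Return name, generation, backing indices, and write index for each stream."""
    summaries = []
    for stream in response.get("data_streams", []):
        backing = [entry["index_name"] for entry in stream["indices"]]
        summaries.append(
            {
                "name": stream["name"],
                "generation": stream["generation"],
                "backing_indices": backing,
                # The last backing index in the list is the write index.
                "write_index": backing[-1] if backing else None,
            }
        )
    return summaries


# Made-up sample, trimmed to the fields used above.
sample_response = {
    "data_streams": [
        {
            "name": "my-data-stream",
            "generation": 2,
            "indices": [
                {"index_name": ".ds-my-data-stream-2099.05.06-000001"},
                {"index_name": ".ds-my-data-stream-2099.05.07-000002"},
            ],
        }
    ]
}
for summary in summarize_streams(sample_response):
    print(summary["name"], "writes to", summary["write_index"])
```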

Delete a data stream

To delete a data stream and its backing indices in Kibana, open the main menu and go to Stack Management > Index Management. In the Data Streams view, click the trash icon. The icon only displays if you have the delete_index security privilege for the data stream.

You can also use the delete data stream API.

    resp = client.indices.delete_data_stream(
        name="my-data-stream",
    )
    print(resp)

    response = client.indices.delete_data_stream(
      name: 'my-data-stream'
    )
    puts response

    const response = await client.indices.deleteDataStream({
      name: "my-data-stream",
    });
    console.log(response);

    DELETE _data_stream/my-data-stream