Release notes - Elasticsearch version 7.9.0 - 《Elasticsearch v7.9 Reference》

Elasticsearch version 7.9.0

Also see Breaking changes in 7.9.

Security updates

A field disclosure flaw was found in Elasticsearch when running a scrolling search with field level security. If a user runs the same query another more privileged user recently ran, the scrolling search can leak fields that should be hidden. This could result in an attacker gaining additional permissions against a restricted index. All versions of Elasticsearch before 7.9.0 and 6.8.12 are affected by this flaw. You must upgrade to Elasticsearch version 7.9.0 or 6.8.12 to obtain the fix. CVE-2020-7019

Known issues

Upgrading to 7.9.0 from an earlier version will result in incorrect mappings on the machine learning annotations index, and possibly also on the machine learning config index. This will lead to some pages in the machine learning UI not displaying correctly, and may prevent machine learning jobs from being created or updated. If you have not yet upgraded, the best way to avoid this problem is to manually update the mappings on these indices in your old Elasticsearch version before upgrading to 7.9.0. If you have already upgraded, reindexing is required to recover. Full details of the mitigations are in Upgrade to 7.9.0 causes incorrect mappings.
Lucene 8.6.0, on which Elasticsearch 7.9.0 is based, contains a memory leak. This memory leak manifests in Elasticsearch when a single document is updated repeatedly with a forced refresh. The cluster state storage layer in Elasticsearch is based on Lucene and does use single-document updates with forced refreshes, meaning that this memory leak manifests in Elasticsearch under normal conditions. It also manifests when user-controlled workloads update a single document in an index repeatedly with a forced refresh. In both cases, the memory leak is around 500 bytes per update, so it does take some time for the leak to show any meaningful impact on the system. Symptoms of this memory leak are the size of the used heap slowly rising over time, requests eventually being rejected by the real memory circuit breaker, and potentially out-of-memory errors. A workaround is to restart any nodes exhibiting these symptoms. We are actively working with the Lucene community to release a fix in Lucene 8.6.2 to deliver in Elasticsearch 7.9.1 that will address this memory leak.
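
To put the "around 500 bytes per update" figure in perspective, the following back-of-the-envelope sketch estimates how many forced-refresh updates it would take to leak a given amount of heap. The 1 GiB budget and the rate of 10 updates per second are illustrative assumptions, not measurements from a real cluster.

```python
# Approximate leak per single-document update with a forced refresh,
# as quoted in the known issue above.
LEAK_BYTES_PER_UPDATE = 500


def updates_until_leaked(leak_budget_bytes: int) -> int:
    """Number of updates needed to leak roughly the given number of bytes."""
    return leak_budget_bytes // LEAK_BYTES_PER_UPDATE


def hours_until_leaked(leak_budget_bytes: int, updates_per_second: float) -> float:
    """Hours needed to leak the budget at a steady (assumed) update rate."""
    return updates_until_leaked(leak_budget_bytes) / updates_per_second / 3600


one_gib = 1024 ** 3
print(updates_until_leaked(one_gib))              # 2147483
print(round(hours_until_leaked(one_gib, 10), 1))  # 59.7
```

Under these assumptions, roughly two million updates are needed before 1 GiB is lost, which is why the symptoms appear slowly.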

Breaking changes

Script Cache

Script cache size and rate limiting are per-context #55753 (issue: #50152)

Field capabilities API

Constant_keyword fields are now described by their family type keyword instead of constant_keyword #58483 (issue: #53175)

Snapshot restore throttling

Restoring from a snapshot (which is a particular form of recovery) now properly takes recovery throttling into account (that is, the indices.recovery.max_bytes_per_sec setting). The max_restore_bytes_per_sec setting now defaults to unlimited, whereas previously it defaulted to 40mb, which is also the default for indices.recovery.max_bytes_per_sec. This means that clusters where the recovery and restore settings have not been changed from their defaults will observe no behavioral change. #58658
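
The interaction of the two limits can be sketched with a small model. It assumes that, from 7.9.0 on, the effective restore rate is the tighter of the recovery limit and the restore limit. The function names and that rule are simplifications for illustration, not Elasticsearch code.

```python
from typing import Optional

# None stands for "unlimited"; rates are in MB per second.
Rate = Optional[float]


def restore_rate_before_790(restore_limit: Rate) -> Rate:
    # Modeled pre-7.9.0 behavior: only max_restore_bytes_per_sec applied.
    return restore_limit


def restore_rate_from_790(recovery_limit: Rate, restore_limit: Rate) -> Rate:
    # Modeled 7.9.0 behavior (assumption): the tighter of the two limits wins.
    limits = [r for r in (recovery_limit, restore_limit) if r is not None]
    return min(limits) if limits else None


# Defaults: the old restore default (40mb) equals the recovery default (40mb),
# so the effective rate is the same before and after the change.
print(restore_rate_before_790(40.0))      # 40.0
print(restore_rate_from_790(40.0, None))  # 40.0
```

In this model a difference only shows up once one of the two settings has been changed, for example a recovery limit raised to 100mb while the restore limit is left at its default.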

Thread pool write queue size

The WRITE thread pool default queue size (thread_pool.write.queue_size) has been increased from 200 to 10000. A small queue size (200) caused issues when users wanted to send small indexing requests with a high client count. Additional memory-oriented back pressure has been introduced with the indexing_pressure.memory.limit setting. This setting configures a limit on the number of bytes that outstanding indexing requests may consume. #59263
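
The idea of byte-based back pressure can be illustrated with a toy model: requests are admitted while the total size of outstanding requests stays within a limit, and rejected otherwise. The class and method names below are hypothetical and do not reflect how Elasticsearch implements indexing_pressure.memory.limit.

```python
class IndexingPressureModel:
    """Toy model of a byte budget for outstanding indexing requests."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.outstanding_bytes = 0

    def try_acquire(self, request_bytes: int) -> bool:
        """Admit the request if it fits in the remaining budget."""
        if self.outstanding_bytes + request_bytes > self.limit_bytes:
            return False  # over budget: the request would be rejected
        self.outstanding_bytes += request_bytes
        return True

    def release(self, request_bytes: int) -> None:
        """Return the bytes of a completed request to the budget."""
        self.outstanding_bytes -= request_bytes


pressure = IndexingPressureModel(limit_bytes=1000)
print(pressure.try_acquire(600))  # True
print(pressure.try_acquire(600))  # False
pressure.release(600)
print(pressure.try_acquire(600))  # True
```

Unlike a queue-size limit, which counts requests, a byte budget of this kind lets many small requests through while still bounding memory use.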

Dangling indices

Automatically importing dangling indices is now deprecated, disabled by default, and will be removed in Elasticsearch 8.0. See the migration notes. #58176 #58898 (issue: #48366)

Breaking Java changes

Aggregations

Improve cardinality measure used to build aggs #56533 (issue: #56487)

Features/Ingest

Add optional description parameter to ingest processors. #57906 (issue: #56000)

New features

Aggregations

Add moving percentiles pipeline aggregation #55441 (issue: #49452)
Add normalize pipeline aggregation #56399 (issue: #51005)
Add variable width histogram aggregation #42035 (issues: #9572, #50863)
Add pipeline inference aggregation #58193
Speed up time interval rounding around daylight savings time (DST) #56371 (issue: #55559)

Geo

Override doc_value parameter in Spatial XPack module #53286 (issue: #37206)

Machine Learning

Add update data frame analytics jobs API #58302 (issue: #45720)
Introduce model_plot_config.annotations_enabled setting for anomaly detection jobs #57539 (issue: #55781)
Report significant changes to anomaly detection models in annotations of the results #1247, #56342, #56417, #57144, #57278, #57539

Mapping

Merge mappings for composable index templates #58521 (issue: #53101)
Wildcard field optimised for wildcard queries #49993 (issue: #48852)

Search

Allow index filtering in field capabilities API #57276 (issue: #56195)

Enhancements

Aggregations

Add support for numeric range keys #56452 (issue: #56402)
Added standard deviation / variance sampling to extended stats #49782 (issue: #49554)
Give significance lookups their own home #57903
Increase search.max_buckets to 65,535 #57042 (issue: #51731)
Optimize date_histograms across daylight savings time #55559
Return clear error message if aggregation type is invalid #58255 (issue: #58146)
Save memory on numeric significant terms when not top #56789 (issue: #55873)
Save memory when auto_date_histogram is not on top #57304 (issue: #56487)
Save memory when date_histogram is not on top #56921 (issues: #55873, #56487)
Save memory when histogram agg is not on top #57277
Save memory when numeric terms agg is not top #55873
Save memory when parent and child are not on top #57892 (issue: #55873)
Save memory when rare_terms is not on top #57948 (issue: #55873)
Save memory when significant_text is not on top #58145 (issue: #55873)
Save memory when string terms are not on top #57758
Speed up reducing auto_date_histo with a time zone #57933 (issue: #56124)
Speed up rounding in auto_date_histogram #56384 (issue: #55559)

Allocation

Account for remaining recovery in disk allocator #58029

Analysis

Add max_token_length setting to the CharGroupTokenizer #56860 (issue: #56676)
Expose discard_compound_token option to kuromoji_tokenizer #57421
Support multiple tokens on LHS in stemmer_override rules (#56113) #56484 (issue: #56113)

Authentication

Add http proxy support for OIDC realm #57039 (issue: #53379)
Improve threadpool usage and error handling for API key validation #58090 (issue: #58088)
Support handling LogoutResponse from SAML idP #56316 (issues: #40901, #43264)

Authorization

Add cache for application privileges #55836 (issue: #54317)
Add monitor and view_index_metadata privileges to built-in kibana_system role #57755
Improve role cache efficiency for API key roles #58156 (issue: #53939)

CCR

Allow follower indices to override leader settings #58103

CRUD

Retry failed replication due to transient errors #55633

Engine

Don’t log on RetentionLeaseSync error handler after an index has been deleted #58098 (issue: #57864)

Features/Data streams

Add support for snapshot and restore to data streams #57675 (issues: #53100, #57127)
Data stream creation validation allows for prefixed indices #57750 (issue: #53100)
Disallow deletion of composable template if in use by data stream #57957 (issue: #57004)
Validate alias operations don’t target data streams #58327 (issue: #53100)

Features/ILM+SLM

Add data stream support to searchable snapshot action #57873 (issue: #53100)
Add data stream support to the shrink action #57616 (issue: #53100)
Add support for rolling over data streams #57295 (issues: #53100, #53488)
Check the managed index is not a data stream write index #58239 (issue: #53100)

Features/Indices APIs

Add default composable templates for new indexing strategy #57629 (issue: #56709)
Add index block api #58094
Add new flag to check whether alias exists on remove #58100
Add prefer_v2_templates parameter to reindex #56253 (issue: #53101)
Add template simulation API for simulating template composition #56842 (issues: #53101, #55686, #56255, #56390)

Features/Ingest

Add ignore_empty_value parameter in set ingest processor #57030 (issue: #54783)
Support if_seq_no and if_primary_term for ingest #55430 (issue: #41255)

Features/Java High Level REST Client

Add support for data streams #58106 (issue: #53100)
Enable decompression of response within LowLevelRestClient #55413 (issues: #24349, #53555)

Features/Java Low Level REST Client

Add isRunning method to RestClient #57973 (issue: #42133)
Add RequestConfig support to RequestOptions #57972

Infra/Circuit Breakers

Enhance real memory circuit breaker with G1 GC #58674 (issue: #57202)

Infra/Core

Introduce node.roles setting #54998

Infra/Packaging

Remove DEBUG-level logging from actions in Docker #57389 (issues: #51198, #51459)

Infra/Plugins

Improved ExtensiblePlugin #58234

Infra/Resiliency

Adds resiliency to read-only filesystems #45286 #52680 (issue: #45286)

Machine Learning

Accounting for model size when models are not cached. #58670
Adds new for_export flag to GET _ml/inference API #57351
Adds WKT geometry detection in find_file_structure #57014 (issue: #56967)
Calculate cache misses for inference and return in stats #58252
Delete auto-generated annotations when job is deleted. #58169 (issue: #57976)
Delete auto-generated annotations when model snapshot is reverted #58240 (issue: #57982)
Delete expired data by job #57337
Introduce Annotation.event field #57144 (issue: #55781)
Add support for larger forecasts in memory via max_model_memory setting #1238, #57254
Don’t lose precision when saving model state #1274
Parallelize the feature importance calculation for classification and regression over trees #1277
Add an option to do categorization independently for each partition #1293, #1318, #1356, #57683
Memory usage is reported during job initialization #1294
More realistic memory estimation for classification and regression means that these analyses will require lower memory limits than before #1298
Checkpoint state to allow efficient failover during coarse parameter search for classification and regression #1300
Improve data access patterns to speed up classification and regression #1312
Performance improvements for classification and regression, particularly running multithreaded #1317
Improve runtime and memory usage training deep trees for classification and regression #1340
Improvement in handling large inference model definitions #1349
Add a peak_model_bytes field to model_size_stats #1389

Mapping

Add regex query support to wildcard field #55548 (issue: #54725)
Make keyword a family of field types #58315 (issue: #53175)
Store parsed mapping settings in IndexSettings #57492 (issue: #57395)
Wildcard field - add support for custom null values #57047

Network

Make the number of transport threads equal to the number of available CPUs #56488

Recovery

Implement dangling indices API #50920 (issue: #48366)
Reestablish peer recovery after network errors #55274
Sending operations concurrently in peer recovery #58018 (issue: #58011)

Reindex

Throw an illegal_argument_exception when max_docs is less than slices #54901 (issue: #52786)

SQL

Implement TIME_PARSE function for parsing strings into TIME values #55223 (issues: #54963, #55095)
Implement TOP as an alternative to LIMIT #57428 (issue: #41195)
Implement TRIM function #57518 (issue: #41195)
Improve performance of LTRIM/RTRIM #57603 (issue: #57594)
Make CASTing string to DATETIME more lenient #57451
Redact credentials in connection exceptions #58650 (issue: #56474)
Relax parsing of date/time escaped literals #58336 (issue: #58262)
Add support for scalars within LIKE/RLIKE #56495 (issue: #55058)

Search

Add description to submit and get async search, as well as cancel tasks #57745
Add matchBoolPrefix static method in query builders #58637 (issue: #58388)
Add range query support to wildcard field #57881 (issue: #57816)
Group docIds by segment in FetchPhase to better use LRU cache #57273
Improve error handling when decoding async execution ids #56285
Specify reason whenever async search gets cancelled #57761
Use index sort range query when possible. #56657 (issue: #48665)

Security

Add machine learning admin permissions to the kibana_system role #58061
Just log 401 stacktraces #55774

Snapshot/Restore

Deduplicate Index Metadata in BlobStore #50278 (issues: #45736, #46250, #49800)
Default to zero replicas for searchable snapshots #57802 (issue: #50999)
Enable fully concurrent snapshot operations #56911
Support cloning of searchable snapshot indices #56595
Track GET/LIST Azure Storage API calls #56773
Track GET/LIST GoogleCloudStorage API calls #56585
Track PUT/PUT_BLOCK operations on AzureBlobStore. #56936
Track multipart/resumable uploads GCS API calls #56821
Track upload requests on S3 repositories #56826

Task Management

Add index name to refresh mapping task #57598
Cancel task and descendants on channel disconnects #56620 (issues: #56327, #56619)

Transform

Add support for terms agg in transforms #56696
Adds geotile_grid support in group_by #56514 (issue: #56121)

Bug fixes

Aggregations

Fix auto_date_histogram interval #56252 (issue: #56116)
Fix bug in faster interval rounding #56433 (issue: #56400)
Fix bug in parent and child aggregators when parent field not defined #57089 (issue: #42997)
Fix missing null values for std_deviation_bounds in ext. stats aggs #58000

Allocation

Reword INDEX_READ_ONLY_ALLOW_DELETE_BLOCK message #58410 (issues: #42559, #50166, #58376)

Authentication

Map only specific type of OIDC Claims #58524

Authorization

Change privilege of enrich stats API to monitor #52027 (issue: #51677)

Engine

Fix local translog recovery not updating safe commit in edge case #57350 (issue: #57010)
Hide AlreadyClosedException on IndexCommit release #57986 (issue: #57797)

Features/ILM+SLM

Normalized prefix for rollover API #57271 (issue: #53388)

Features/Indices APIs

Don’t allow invalid template combinations #56397 (issues: #53101, #56314)
Handle cluster.max_shards_per_node in YAML config #57234 (issue: #40803)

Features/Ingest

Fix ingest simulate verbose on failure with conditional #56478 (issue: #56004)

Geo

Check for degenerated lines when calculating the centroid #58027 (issue: #55851)
Fix bug in circuit-breaker check for geoshape grid aggregations #57962 (issue: #57847)

Infra/Scripting

Fix source return bug in scripting #56831 (issue: #52103)

Machine Learning

Fix wire serialization for flush acknowledgements #58413
Make waiting for renormalization optional for internally flushing job #58537 (issue: #58395)
Tail the C++ logging pipe before connecting other pipes #56632 (issue: #56366)
Fix numerical issues leading to blow up of the model plot bounds #1268
Fix causes for inverted forecast confidence interval bounds #1369 (issue: #1357)
Restrict growth of max matching string length for categories #1406

Mapping

Wildcard field fix for scripts - changed value type from BytesRef to String #58060 (issue: #58044)

SQL

Introduce JDBC option for meta pattern escaping #40661 (issue: #40640)

Search

Don’t omit empty arrays when filtering _source #56527 (issues: #20736, #22593, #23796)
Fix casting of scaled_float in sorts #57207

Snapshot/Restore

Account for recovery throttling when restoring snapshot #58658 (issue: #57023)
Fix noisy logging during snapshot delete #56264
Fix S3ClientSettings leak #56703 (issue: #56702)

Upgrades

Update to lucene snapshot e7c625430ed #57981