11.138. Release 0.103

Cluster Resource Management

There is a new cluster resource manager, which can be enabled via the experimental.cluster-memory-manager-enabled flag. Currently, the only resource that’s tracked is memory, and the cluster resource manager guarantees that the cluster will not deadlock waiting for memory. However, in a low memory situation it is possible that only one query will make progress. Memory limits can now be configured via query.max-memory, which controls the total distributed memory a query may use, and query.max-memory-per-node, which limits the amount of memory a query may use on any one node. On each worker, the resources.reserved-system-memory flag controls how much memory is reserved for internal Presto data structures and temporary allocations.
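As a rough illustration, the properties named above might be combined in a node's configuration file as follows. The file path and every value are placeholders chosen for the example, not recommended or default settings.

```properties
# etc/config.properties (path and all values are assumptions for illustration)

# Opt in to the experimental cluster resource manager
experimental.cluster-memory-manager-enabled=true

# Total distributed memory a single query may use across the cluster
query.max-memory=20GB

# Memory a single query may use on any one node
query.max-memory-per-node=1GB

# On each worker: memory reserved for internal Presto data structures
# and temporary allocations
resources.reserved-system-memory=2GB
```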

Task Parallelism

Queries involving a large number of aggregations or a large hash table for a join can be slow due to single threaded execution in the intermediate stages. This release adds experimental configuration and session properties to execute this single threaded work in parallel. Depending on the exact query this may reduce wall time, but will likely increase CPU usage.

Use the configuration parameter task.default-concurrency or the session property task_default_concurrency to set the default number of parallel workers to use for join probes, hash builds and final aggregations. Additionally, the session properties task_join_concurrency, task_hash_build_concurrency and task_aggregation_concurrency can be used to control the parallelism for each type of work.

This is an experimental feature and will likely change in a future release. It is also expected that this will eventually be handled automatically by the query planner and these options will be removed entirely.
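A minimal sketch of the cluster-wide default described above, set through the configuration parameter. The file path and the value 4 are assumptions for illustration only; a suitable number depends on the workload and the available CPU.

```properties
# etc/config.properties (path and value are assumptions for illustration)

# Default number of parallel workers for join probes, hash builds
# and final aggregations
task.default-concurrency=4
```

The session properties task_join_concurrency, task_hash_build_concurrency and task_aggregation_concurrency are set per session instead, so they do not appear in this file.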

Hive Changes

Removed the hive.max-split-iterator-threads parameter and renamed hive.max-global-split-iterator-threads to hive.max-split-iterator-threads.
Fix excessive object creation when querying tables with a large number of partitions.
Do not retry requests when an S3 path is not found.
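For a Hive catalog that previously set the global split iterator limit, the rename listed above amounts to a one-line change. The sketch below assumes a catalog file at etc/catalog/hive.properties and uses 1000 purely as a placeholder value.

```properties
# etc/catalog/hive.properties (path and value are assumptions for illustration)

# Before 0.103:
# hive.max-global-split-iterator-threads=1000

# From 0.103 on, the same setting is written under the new name:
hive.max-split-iterator-threads=1000
```

Because the old hive.max-split-iterator-threads parameter was removed and its name reused, any value carried over from an earlier configuration under that name is worth reviewing.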

General Changes

Add array_remove().
Fix NPE in max_by() and min_by() caused when few rows were present in the aggregation.
Reduce memory usage of map_agg().
Change HTTP client defaults: 2 second idle timeout, 10 second request timeout and 250 connections per host.
Add SQL command autocompletion to CLI.
Increase CLI history file size.