Release 0.281
Details
General Changes
Fix memory accounting for hash join.
Improve outer join where the join key is from the output of another outer join which can have many null values. The optimization is controlled by session property
randomize_outer_join_null_key_strategy
and the default value isDISABLED
Add timeouts for long running optimization rules.
Improve performance for queries with multiple similar aggregations, with and without filters. The optimization is controlled by session parameter merge_aggs_with_and_without_filter and by default it is set to false.* Add an optimization rule to merge aggregations when there are multiple duplicate aggregations in the aggregation node. The optimization is controlled by session property
merge_duplicate_aggregations
and default value istrue
.Add f_cdf() and inverse_f_cdf() functions.
Add
gamma_cdf()`()
and inverse_gamma_cdf() functions.Add groups support to Window Functions.
Add optimization for joins when build side is empty at runtime. The optimization is controlled by session parameter
optimize_join_probe_for_empty_build_runtime
and by default it is set tofalse
.Add system config to enable access log on presto-on-spark native. This can be enabled with the system config
http-server.enable-access-log
and default value istrue
.Remove session property
deprecated.legacy-date-timestamp-to-varchar-coercion
.Remove access control checks after semantic analysis. Queries with both semantic and permission issues that previously failed with permission errors may now fail with semantic errors.
Reduce memory usage in parquet reader.
Resource Groups Changes
- Add ability to express concurrency levels as a function of number of available workers.
workersPerQueryLimit
, which specifies the minimum number of workers that have to be available for each query. This is optional and defaults to 0. See Resource Groups.
SPI Changes
Add various methods to create variables in the
VariableAllocator
.VariableAllocator
is also changed from interface to a class.Add
MetadataResolver
in theAnalyzerContext
, which is passed in the pluggable analyzer interface.
Hive Changes
Fix a bug where the filter on a
CHAR(n)
column is not evaluated correctly for ORC/DWRF files when filter pushdown is enabled.Fix issue while accessing HUDI Merge-on-Read Realtime tables (#18911).
Enable Hive splits for uncompressed inputs in S3 Select connector by leveraging the scan range feature of the service.
Apache Hudi Changes
Add the asynchronous split generation in Hudi connector to speed up the query execution and reduce overall query finishing time.
hudi.max-outstanding-splits
session property controls the maximum outstanding splits in a batch enqueued for processing.hudi.split-generator-parallelism
session property controls the number of threads to generate splits from partitions.Upgrade Apache Hudi version to 0.12.1.
Apache Iceberg Changes
- Upgrade Apache Iceberg version from 1.1.0 to 1.2.0.
JDBC Changes
- Add ConnectorSession parameter to all methods of
JdbcClient
interface. That makes it possible to pass specific options to JDBC driver implementation via session parameters.
Credits
Ajay George, Ali Parsaei, Amit Dutta, Anant Aneja, Ankur Pathela, Arun Thirupathi, Chandrashekhar Kumar Singh, Chunxu Tang, Deepak Majeti, Eduard Tudenhoefner, Ge Gao, Ivan Sadikov, Jacob Wujciak-Jens, Jalpreet Singh Nanda (:imjalpreet), James Petty, Jaromir Vanek, Jialiang Tan, Jimmy Lu, Ke, Krishna Pai, Krishna Pai, Linsong Wang, Lyublena Antova, Masha Basmanova, Miaojiang (MJ) Deng, Michael Shang, Paul Meng, Pedro Pedreira, Pranjal Shankhdhar, Pratyaksh Sharma, Pratyush Verma, Rebecca Schlussel, Reetika Agrawal, Rohit Jain, Ruslan Mardugalliamov, Sagar Sumit, Sergey Pershin, Sergii Druzkin, Shrinidhi Joshi, Sreeni Viswanadha, Tal Galili, Timothy Meehan, Vivek, Zhenxiao Luo, dnnanuti, feilong-liu, guhanjie, meng duan, patzar, rohanpednekar, vinoth chandar, xiaoxmeng, yingsu00