Source: PrestoDB
2019-11-30 14:23:26
11.9. Release 0.221
General Changes
- Fix error during stats collection phase of query planning.
- Fix a performance regression for some outer joins without equality predicates when join_distribution_type is set to AUTOMATIC.
- Improve performance for queries that have constant VARCHAR predicates on join columns.
- Add a variant of strpos() that returns the position of the N-th instance of the substring.
- Add strrpos() that returns the position of the N-th instance of a substring from the back of a string.
- Add aggregation function entropy().
- Add classification aggregation functions classification_miss_rate(), classification_precision(), classification_recall(), classification_thresholds().
- Add overload of approx_set() which takes in the maximum standard error.
- Add max_tasks_per_stage session property and stage.max-tasks-per-stage config property to limit the number of tasks per stage for grouped execution. Setting this session property allows queries running with grouped execution to use a predictable amount of memory independent of the cluster size.
- Add encryption for spill files (see Spill to Disk).
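The semantics of the new string-position functions and of entropy() can be modeled outside the engine. The following is a minimal Python sketch, not Presto's implementation. The helper names strpos_nth, strrpos_nth and entropy_of_counts are hypothetical. It assumes 1-based positions, a return value of 0 when the instance is not found, overlapping matches being counted, and entropy() computing the log-2 entropy of non-negative counts with nulls ignored. Rejecting an empty substring is a simplification of this sketch only.

```python
import math


def strpos_nth(string, substring, instance=1):
    """1-based position of the N-th occurrence from the front, or 0 if absent."""
    if instance <= 0:
        raise ValueError("instance must be positive")
    if not substring:
        # Simplification of this sketch: empty substrings are not modeled.
        raise ValueError("empty substring is not modeled")
    index = -1
    for _ in range(instance):
        # Resume one character after the previous match start,
        # so overlapping matches are counted (an assumption).
        index = string.find(substring, index + 1)
        if index < 0:
            return 0
    return index + 1


def strrpos_nth(string, substring, instance=1):
    """1-based position of the N-th occurrence counted from the back, or 0."""
    if instance <= 0:
        raise ValueError("instance must be positive")
    if not substring:
        raise ValueError("empty substring is not modeled")
    end = len(string)
    index = -1
    for _ in range(instance):
        index = string.rfind(substring, 0, end)
        if index < 0:
            return 0
        # The next match must start strictly before the current one.
        end = index + len(substring) - 1
    return index + 1


def entropy_of_counts(counts):
    """Log-2 entropy of a collection of counts; None values are skipped."""
    values = [c for c in counts if c is not None]
    if any(c < 0 for c in values):
        raise ValueError("counts must be non-negative")
    total = sum(values)
    if total == 0:
        # Assumption: an empty or all-zero input yields 0.
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in values if c > 0)
```

For example, under these assumptions strpos_nth("a.b.c.d", ".", 2) and strrpos_nth("a.b.c.d", ".", 2) both point at the second dot, and two equal counts carry exactly one bit of entropy.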
Web UI Changes
- Add information about query warnings to the web UI.
Raptor Changes
- Revert the change introduced in 0.219 to rebalance bucket assignment after restarting the cluster. Automatic rebalancing can cause unexpected downtime when restarting the cluster to resolve emergent issues.
Hive Connector Changes
- Improve coordinator memory utilization for Hive splits.
- Improve performance of writing large ORC files.
SPI Changes
- Add PageSinkProperties for createPageSink in PageSinkProvider and ConnectorPageSinkProvider. It contains a boolean partitionCommitRequired, which is false by default. See the note below about commitPartition for more information.
- Add commitPartition to Metadata and ConnectorMetadata. This SPI is coupled with PageSinkProperties#partitionCommitRequired and is used by the engine to commit a partition of data to the target connector. A connector that implements this SPI should ensure that if PageSinkProperties#isPartitionCommitRequired is true in ConnectorPageSinkProvider#createPageSink, the written data is not published until ConnectorMetadata#commitPartition is called. The connector is also expected to add SUPPORTS_PARTITION_COMMIT in Connector#getCapabilities.
- Add ExpressionOptimizer in RowExpressionService. ExpressionOptimizer simplifies a RowExpression and prunes redundant parts of it.
- Add pushNegationToLeaves method to LogicalRowExpressions to push negation down below conjunction or disjunction for a logical expression.
- Replace SplitSchedulingStrategy with SplitSchedulingContext in ConnectorSplitManager. SplitSchedulingContext contains the SplitSchedulingStrategy and a boolean schedulerUsesHostAddresses that indicates whether the network topology is used during scheduling. If false, the connector doesn't need to provide the host addresses for remotely accessible splits.
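Pushing negation below conjunction and disjunction, as pushNegationToLeaves is described above, follows De Morgan's laws. The following Python sketch illustrates the idea on a toy expression tree. It is not the LogicalRowExpressions code and does not use RowExpression. The tuple encoding ("and", ...), ("or", ...), ("not", x) and the function name push_negation_to_leaves are assumptions made for illustration, as is the choice to cancel double negations.

```python
def push_negation_to_leaves(expr, negate=False):
    """Rewrite a toy boolean expression so NOT only wraps leaf predicates.

    Leaves are strings; inner nodes are tuples ("and", ...), ("or", ...)
    or ("not", child). This encoding is hypothetical.
    """
    if isinstance(expr, str):
        # A leaf keeps at most one negation directly above it.
        return ("not", expr) if negate else expr

    op = expr[0]
    if op == "not":
        # Absorb the NOT by flipping the pending negation flag,
        # which also cancels double negations.
        return push_negation_to_leaves(expr[1], not negate)

    if op in ("and", "or"):
        if negate:
            # De Morgan: NOT (a AND b) == (NOT a) OR (NOT b), and vice versa.
            op = "or" if op == "and" else "and"
        children = tuple(push_negation_to_leaves(arg, negate) for arg in expr[1:])
        return (op,) + children

    raise ValueError(f"unsupported operator: {op!r}")
```

With this encoding, NOT (a AND (b OR NOT c)) becomes (NOT a) OR ((NOT b) AND c), so every remaining negation sits directly on a leaf predicate.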