Enabling GPORCA
Precompiled versions of HAWQ that include the GPORCA query optimizer enable it by default, no additional configuration is required. To use the GPORCA query optimizer in a HAWQ built from source, your build must include GPORCA. You must also enable specific HAWQ server configuration parameters at or after install time:
- Set the
optimizer_analyze_root_partition
parameter toon
to enable statistics collection for the root partition of a partitioned table. - Set the
optimizer
parameter toon
to enable GPORCA. You can set the parameter at these levels:
Important: If you intend to execute queries on partitioned tables with GPORCA enabled, you must collect statistics on the partitioned table root partition with the ANALYZE ROOTPARTITION
command. The command ANALYZE ROOTPARTITION
collects statistics on the root partition of a partitioned table without collecting statistics on the leaf partitions. If you specify a list of column names for a partitioned table, the statistics for the columns and the root partition are collected. For information on the ANALYZE
command, see ANALYZE.
You can also use the HAWQ utility analyzedb
to update table statistics. The HAWQ utility analyzedb
can update statistics for multiple tables in parallel. The utility can also check table statistics and update statistics only if the statistics are not current or do not exist. For information about the analyzedb
utility, see analyzedb.
As part of routine database maintenance, you should refresh statistics on the root partition when there are significant changes to child leaf partition data.
Setting the optimizer_analyze_root_partition Parameter
When the configuration parameter optimizer_analyze_root_partition
is set to on
, root partition statistics will be collected when ANALYZE
is run on a partitioned table. Root partition statistics are required by GPORCA.
You will perform different procedures to set optimizer configuration parameters for your whole HAWQ cluster depending upon whether you manage your cluster from the command line or use Ambari. If you use Ambari to manage your HAWQ cluster, you must ensure that you update server configuration parameters only via the Ambari Web UI. If you manage your HAWQ cluster from the command line, you will use the hawq config
command line utility to set optimizer server configuration parameters.
If you use Ambari to manage your HAWQ cluster:
- Set the
optimizer_analyze_root_partition
configuration property toon
via the HAWQ service Configs > Advanced > Custom hawq-site drop down. - Select Service Actions > Restart All to load the updated configuration.
If you manage your HAWQ cluster from the command line:
Log in to the HAWQ master host as a HAWQ administrator and source the file
/usr/local/hawq/greenplum_path.sh
.$ source /usr/local/hawq/greenplum_path.sh
Use the
hawq config
utility to setoptimizer_analyze_root_partition
:$ hawq config -c optimizer_analyze_root_partition -v on
Reload the HAWQ configuration:
$ hawq stop cluster -u
Enabling GPORCA for a System
Set the server configuration parameter optimizer
for the HAWQ system.
If you use Ambari to manage your HAWQ cluster:
- Set the
optimizer
configuration property toon
via the HAWQ service Configs > Advanced > Custom hawq-site drop down. - Select Service Actions > Restart All to load the updated configuration.
If you manage your HAWQ cluster from the command line:
Log in to the HAWQ master host as a HAWQ administrator and source the file
/usr/local/hawq/greenplum_path.sh
.$ source /usr/local/hawq/greenplum_path.sh
Use the
hawq config
utility to setoptimizer
:$ hawq config -c optimizer -v on
Reload the HAWQ configuration:
$ hawq stop cluster -u
Enabling GPORCA for a Database
Set the server configuration parameter optimizer
for individual HAWQ databases with the ALTER DATABASE
command. For example, this command enables GPORCA for the database test_db.
=> ALTER DATABASE test_db SET optimizer = ON ;
Enabling GPORCA for a Session or a Query
You can use the SET
command to set optimizer
server configuration parameter for a session. For example, after you use the psql
utility to connect to HAWQ, this SET
command enables GPORCA:
=> SET optimizer = on ;
To set the parameter for a specific query, include the SET
command prior to running the query.