- Upgrading Impala
- Impala Upgrade Considerations
- Grant REFRESH Privilege to Impala Roles with SELECT or INSERT Privilege when Upgrading to Impala 3.0
- List of Reserved Words Updated in Impala 3.0
- Decimal V2 Used by Default in Impala 3.0
- Behavior of Column Aliases Changed in Impala 3.0
- Default PARQUET_ARRAY_RESOLUTION Changed in Impala 3.0
- Enable Clustering Hint for Inserts
- Deprecated Query Options Removed in Impala 3.0
- Fine-grained Privileges Added in Impala 3.0
- refresh_after_connect Impala Shell Option Removed in Impala 3.0
- Return Type Changed for EXTRACT and DATE_PART Functions in Impala 3.0
- Port Change for SHUTDOWN Command
- Change in Client Connection Timeout
- Default Setting Changes
Upgrading Impala
Upgrading Impala involves building or acquiring new Impala-related binaries, and then restarting Impala services.
Upgrading Impala
Shut down all Impala-related daemons on all relevant hosts in the cluster:
Stop impalad on each Impala node in your cluster:
$ sudo service impala-server stop
Stop any instances of the state store in your cluster:
$ sudo service impala-state-store stop
Stop any instances of the catalog service in your cluster:
$ sudo service impala-catalog stop
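The three stop commands above can be wrapped in one helper that is run on each host. This is a minimal sketch, not part of Impala: the function name and the DRY_RUN variable are hypothetical, and the service names are the ones used in the commands above.

```shell
# Sketch: stop the Impala daemons on the local host in the order listed above.
# stop_impala_services and DRY_RUN are hypothetical names for this example.
stop_impala_services() {
  for svc in impala-server impala-state-store impala-catalog; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Only print the command so it can be reviewed first.
      echo "sudo service ${svc} stop"
    else
      sudo service "${svc}" stop
    fi
  done
}
```

Setting DRY_RUN=1 before calling stop_impala_services prints the commands without running them. On hosts that do not run a given service, the corresponding command is expected to fail or do nothing.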
Follow the build procedure in the README.md file to produce new Impala binaries.
Replace the binaries for all Impala-related daemons on all relevant hosts in the cluster.
Check if there are new recommended or required configuration settings to put into place in the configuration files, typically under /etc/impala/conf. See Post-Installation Configuration for Impala for settings related to performance and scalability.
Restart all Impala-related daemons on all relevant hosts in the cluster:
Restart the Impala state store service on the desired nodes in your cluster. Expect to see a process named statestored if the service started successfully.
$ sudo service impala-state-store start
$ ps ax | grep [s]tatestored
6819 ? Sl 0:07 /usr/lib/impala/sbin/statestored -log_dir=/var/log/impala -state_store_port=24000
Restart the state store service before the Impala server service to avoid “Not connected” errors when you run impala-shell.
Restart the Impala catalog service on whichever host it runs on in your cluster. Expect to see a process named catalogd if the service started successfully.
$ sudo service impala-catalog restart
$ ps ax | grep [c]atalogd
6068 ? Sl 4:06 /usr/lib/impala/sbin/catalogd
Restart the Impala daemon service on each node in your cluster. Expect to see a process named impalad if the service started successfully.
$ sudo service impala-server start
$ ps ax | grep [i]mpalad
7936 ? Sl 0:12 /usr/lib/impala/sbin/impalad -log_dir=/var/log/impala -state_store_port=24000
-state_store_host=127.0.0.1 -be_port=22000
Note:
If the services did not start successfully (even though the sudo service command might display [OK]), check for errors in the Impala log file, typically in /var/log/impala.
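The ps checks above write the first letter of the process name in brackets, for example [s]tatestored. The sketch below illustrates why, using two sample ps lines held in a variable. The second sample line is hypothetical and stands in for the grep process itself.

```shell
# Two sample "ps ax" lines: the daemon and the grep process itself.
# The second line is hypothetical, added for illustration.
ps_sample='6819 ? Sl 0:07 /usr/lib/impala/sbin/statestored -log_dir=/var/log/impala
7001 pts/0 S+ 0:00 grep [s]tatestored'

# [s] is a one-character bracket expression, so the pattern matches
# "statestored" but not the literal text "[s]tatestored".
matches=$(printf '%s\n' "$ps_sample" | grep '[s]tatestored')
printf '%s\n' "$matches"
```

Only the daemon line is printed, so an empty result from the real command suggests that the service is not running.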
Impala Upgrade Considerations
Grant REFRESH Privilege to Impala Roles with SELECT or INSERT Privilege when Upgrading to Impala 3.0
To use the fine-grained privileges feature in Impala 3.0, check each role that had the SELECT or INSERT privilege on an object before the upgrade, and grant that role the REFRESH privilege after the upgrade.
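As a sketch of that step, the loop below builds one GRANT REFRESH statement per role so the statements can be reviewed before they are run. The role names and the table are hypothetical, and the statement form assumes the GRANT privilege ON object_type object_name TO ROLE role_name syntax.

```shell
# Hypothetical role names and object; replace with your own.
roles="analyst_role etl_role"
object="TABLE sales_db.orders"

# Build one GRANT statement per role, one statement per line.
grant_sql=""
for role in $roles; do
  grant_sql="${grant_sql}GRANT REFRESH ON ${object} TO ROLE ${role};
"
done
printf '%s' "$grant_sql"
```

The printed statements could then be saved to a file and run with impala-shell by a user who is allowed to grant privileges.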
List of Reserved Words Updated in Impala 3.0
The list of reserved words in Impala was updated in Impala 3.0. If you need to use a reserved word as an identifier, for example as a table name, enclose the word in back-ticks.
If you need to use the reserved words from previous versions of Impala, set the following impalad and catalogd startup flag:
--reserved_words_version=2.11.0
Note that this startup option will be deprecated in a future release.
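When a back-ticked identifier is passed to Impala from a shell, the quoting matters, because the shell treats back-ticks inside double quotes as command substitution. A minimal sketch, assuming a hypothetical table named order (ORDER is a reserved word):

```shell
# Single quotes stop the shell from treating the back-ticks as command substitution.
# The table name "order" is hypothetical.
query='SELECT COUNT(*) FROM `order`'
printf '%s\n' "$query"
# Against a live cluster: impala-shell -q "$query"
```

Expanding "$query" afterwards is safe, because the shell does not re-scan the result of a variable expansion for back-ticks.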
Decimal V2 Used by Default in Impala 3.0
Impala supports two different implementations of the DECIMAL type. Starting in Impala 3.0, DECIMAL V2 is used by default. See DECIMAL Type for detailed information.
If you need to continue using the first version of the DECIMAL type for backward compatibility of your queries, set the DECIMAL_V2 query option to FALSE:
SET DECIMAL_V2=FALSE;
Behavior of Column Aliases Changed in Impala 3.0
To conform to the SQL standard, Impala no longer performs alias substitution in the subexpressions of GROUP BY, HAVING, and ORDER BY. See Overview of Impala Aliases for examples of supported and unsupported alias syntax.
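To make the distinction concrete, the sketch below stores two queries in a shell variable so they can be tried with impala-shell. The table t and column int_col are hypothetical.

```shell
# Hypothetical table t with a numeric column int_col.
alias_sql=$(cat <<'EOF'
-- Alias referenced directly at the top level: still supported.
SELECT int_col / 2 AS x FROM t GROUP BY x;
-- Alias inside a larger expression: no longer substituted in Impala 3.0.
SELECT int_col / 2 AS x FROM t GROUP BY x / 2;
EOF
)
printf '%s\n' "$alias_sql"
```

The second statement is expected to fail analysis in Impala 3.0 and higher. Repeating the full expression in place of the alias is the usual way to rewrite it.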
Default PARQUET_ARRAY_RESOLUTION Changed in Impala 3.0
The default value of the PARQUET_ARRAY_RESOLUTION query option was changed to THREE_LEVEL in Impala 3.0, to match the Parquet standard 3-level encoding.
See [PARQUET_ARRAY_RESOLUTION Query Option (Impala 2.9 or higher only)]($55c9edca6db8ced4.md) for information about the query option.
Enable Clustering Hint for Inserts
In Impala 3.0, the clustered hint is enabled by default. The hint adds a local sort by the partitioning columns to a query plan.
The clustered hint is only effective for HDFS and Kudu tables.
As in previous versions, the noclustered hint prevents clustering. If a table has ordering columns defined, the noclustered hint is ignored with a warning.
Deprecated Query Options Removed in Impala 3.0
The following query options had been deprecated for several releases and were removed in Impala 3.0:
- DEFAULT_ORDER_BY_LIMIT
- ABORT_ON_DEFAULT_LIMIT_EXCEEDED
- V_CPU_CORES
- RESERVATION_REQUEST_TIMEOUT
- RM_INITIAL_MEM
- SCAN_NODE_CODEGEN_THRESHOLD
- MAX_IO_BUFFERS
- DISABLE_CACHED_READS
Fine-grained Privileges Added in Impala 3.0
Starting in Impala 3.0, finer-grained privileges are enforced, such as the REFRESH, CREATE, DROP, and ALTER privileges. In particular, running REFRESH or INVALIDATE METADATA now requires the new REFRESH privilege. Users who did not previously have the ALL privilege can no longer run REFRESH or INVALIDATE METADATA after an upgrade. Those users must be granted the REFRESH or ALL privilege to run either statement.
See GRANT Statement (Impala 2.0 or higher only) for the new privileges, the scope, and other information about the new privileges.
refresh_after_connect Impala Shell Option Removed in Impala 3.0
The deprecated --refresh_after_connect option was removed from Impala Shell in Impala 3.0.
Return Type Changed for EXTRACT and DATE_PART Functions in Impala 3.0
The following changes were made to the EXTRACT and DATE_PART functions:
- The output type of the EXTRACT and DATE_PART functions was changed to BIGINT.
- Extracting the millisecond part from a TIMESTAMP returns the seconds component and the milliseconds component. For example, EXTRACT (CAST('2006-05-12 18:27:28.123456789' AS TIMESTAMP), 'MILLISECOND') will return 28123.
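The value 28123 in the example above is the seconds component (28) scaled to milliseconds plus the milliseconds component (123). The arithmetic can be checked in the shell. The variable names are illustrative only.

```shell
# Timestamp 18:27:28.123456789 has seconds = 28 and milliseconds = 123.
seconds=28
millis=123
result=$(( seconds * 1000 + millis ))
echo "$result"   # prints 28123
```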
Port Change for SHUTDOWN Command
If you used the SHUTDOWN command in Impala 3.1 and specified a port explicitly, change the port number parameter to the KRPC port when you upgrade to Impala 3.2.
Change in Client Connection Timeout
The default behavior of the client connection timeout changed.
In Impala 3.2 and lower, a client waited indefinitely to open a new session if the maximum number of threads specified by --fe_service_threads had already been allocated.
In Impala 3.3 and higher, a new startup flag, --accepted_client_cnxn_timeout, controls how the server treats new connection requests when all of the configured server threads are in use.
If --accepted_client_cnxn_timeout > 0, new connection requests are rejected after the specified timeout.
If --accepted_client_cnxn_timeout=0, clients wait indefinitely to connect to Impala. You can use this setting to restore the pre-Impala 3.3 behavior.
The default timeout is 5 minutes.
Default Setting Changes
Release Changed | Setting | Default Value |
---|---|---|
Impala 2.12 | --compact_catalog_topic impalad flag | true |
Impala 2.12 | --max_cached_file_handles impalad flag | 20000 |
Impala 3.0 | PARQUET_ARRAY_RESOLUTION query option | THREE_LEVEL |
Impala 3.0 | DECIMAL_V2 query option | TRUE |