gpdbrestore
Restores a database from a set of dump files generated by gpcrondump.
Note: This utility is deprecated and will not be supported after the end of Greenplum Database 5.x Support Life.
Synopsis
gpdbrestore { -t <timestamp_key> { [-L] |
[--netbackup-service-host <netbackup_server>
[--netbackup-block-size <size>] ] }
-b <YYYYMMDD> | -R <hostname>:<path_to_dumpset> | -s <database_name> }
[--noplan] [--noanalyze] [-u backup_directory] [--list-backup]
[--prefix prefix_string] [--report-status-dir <report_directory>]
[-S <schema_name >]
[-T <schema.table >] [--table-file <file_name>] [--truncate] [-e]
[-m]
[-restore-stats [include | only]]
[-G [include | only]] [--change-schema <schema_name>]
[-B <parallel_processes>] [-d <master_data_directory>] [-a] [-q]
[-l <logfile_directory>] [-v]
[--ddboost [--ddboost-storage-unit=<unit-ID>] ]
[--redirect <database_name> ]
gpdbrestore -?
gpdbrestore --version
Description
The gpdbrestore
utility recreates the data definitions (schema) and user data in a Greenplum database using the script files created by gpcrondump operations.
When you restore from an incremental backup, the gpdbrestore
utility assumes the complete backup set is available. The complete backup set includes the following backup files:
- The last full backup before the specified incremental backup
- All incremental backups created between the time of the full backup the specified incremental backup
The gpdbrestore
utility provides the following functionality:
Automatically reconfigures for compression.
Validates the number of dump files are correct (for primary only, mirror only, primary and mirror, or a subset consisting of some mirror and primary segment dump files).
If a failed segment is detected, restores to active segment instances.
Except when restoring data from a NetBackup server, you do not need to know the complete timestamp key (
-t
) of the backup set to restore. Additional options are provided to instead give just a date (-b
), backup set directory location (-R
), or database name (-s
) to restore.The
-R
option allows the ability to restore from a backup set located on a host outside of the Greenplum Database array (archive host). Ensures that the correct dump file goes to the correct segment instance.Identifies the database name automatically from the backup set.
Allows you to restore particular tables only (
-T
option) instead of the entire database. Note that single tables are not automatically dropped or truncated prior to restore.Performs an
ANALYZE
operation on the tables that are restored. You can deactivate theANALYZE
operation by specifying the option--noanalyze
.Can restore global objects such as roles and tablespaces (
-G
option).Detects if the backup set is primary segments only or primary and mirror segments and performs the appropriate restore operation.
Allows you to drop the target database before a restore in a single operation.
The backups contain the database-level settings for the server configuration parameters gp_default_storage_options
, optimizer
, and search_path
, the settings are restored when you perform a restore operation and specify the -e
option to create an empty target database before performing a restore operation.
Important: When restoring table data to an existing table, the utility assumes that the database table definition is the same as the table that was backed up. The utility does not check the table definitions.
Database, Schema, and Table Names
You can specify names of databases, schemas, and tables that contain these special characters.
" '
~ # $ % ^ & * ( ) _ - + [ ] > < \ | ; : / ?` and the space character.
Note: The characters !
, comma (,
), and period (.
) are not supported. Also, the tab (\t
) and newline (\n
) characters are not supported.
When the name contains special characters and is specified on the command line, the name must be enclosed in double quotes ("
). Double quotes are optional for names that do not contain special characters. For example, either use of quotes is valid on the command line "my#1schema".mytable
or "my#1schema"."mytable"
. Within the name, these special characters must be escaped with a backslash (\
) : "
$ ` .
When the name is specified in an input file, the name must not be enclosed in double quotes. Special characters do not require escaping.
Restoring from a Data Domain System with DD Boost
When you create a backup with gpcrondump
using DD Boost, the backup is stored on a Data Domain system storage unit. When restore the backup, you must use the same storage unit ID that was used when you backed up the data. You can use the gpcrondump
option --ddboost-show-config
to display the current DD Boost configuration information that includes the storage unit ID.
For information about using DD Boost and Data Domain systems with Greenplum Database, see “Backing Up and Restoring Databases” in the Greenplum Database Administrator Guide.
NetBackup is not compatible with DD Boost. Both NetBackup and DD Boost cannot be used in a single back up operation.
Restoring a Database from NetBackup
Greenplum Database must be configured to communicate with the Veritas NetBackup master server that is used to restore database data. See the Greenplum Database System Administrator Guide for information about configuring Greenplum Database and NetBackup.
When restoring from NetBackup server, you must specify the timestamp of the backup with the -t
option.
When restoring a large amount of data, set the NetBackup CLIENT_READ_TIMEOUT
option to a value that is at least twice the expected duration of the operation (in seconds). The CLIENT_READ_TIMEOUT
default value is 300
seconds (5 minutes).
NetBackup is not compatible with DD Boost. Both NetBackup and DD Boost cannot be used in a single back up operation.
Restoring a Database with Named Pipes
If you used named pipes when you backed up a database with gpcrondump
, named pipes with the backup data must be available when restoring the database from the backup.
Error Reporting
After a restore operation completes, the utility checks the gpdbrestore
status file for SQL execution errors and displays a warning if an error is found. The default location of the restore status files are in the db_dumps/date/
directory.
Be sure to examine the utility output messages and the restore status files when the operation completes. If gpdbrestore
output displays any warnings, or you notice errors in the restore status files, you must execute the restore process again.
Options
-a (do not prompt)
Do not prompt the user for confirmation.
-b YYYYMMDD
Looks for dump files in the segment data directories on the Greenplum Database array of hosts in db_dumps/
YYYYMMDD. If --ddboost
is specified, the systems looks for dump files on the Data Domain Boost host.
-B parallel_processes
The number of segments to check in parallel for pre/post-restore validation. If not specified, the utility will start up to 60 parallel processes depending on how many segment instances it needs to restore.
--change-schema=schema_name
Optional. Restores tables from a backup created with gpcrondump
to a different schema. The schema_name must exist in the database. If the schema does not exist, the utility returns an error. System catalog schemas are not supported.
You must specify tables to restore with the -T
and --table-file
options. If a table that is being restored exists in schema-name, the utility returns a warning and attempts to append the data to the table from the backup. You can specify the --trunctate
option to truncate table data before restoring data to the table from the backup.
This option is not supported if -S
is specified.
-d master_data_directory
Optional. The master host data directory. If not specified, the value set for $MASTER_DATA_DIRECTORY
will be used.
--ddboost [—ddboost-storage-unit=unit-ID ]
Use Data Domain Boost for this restore operation, if the --ddboost
option was specified when the data was backed up with the gpcrondump
utility. Before using Data Domain Boost, make sure the one-time Data Domain Boost credential setup is complete.
The --ddboost-storage-unit
option is optional. When restoring a backup from a Data Domain server, you must use the same storage unit that was used when you backed up the data.
See Restoring from a Data Domain System with DD Boost.
If you backed up Greenplum Database configuration files with the gpcrondump
option -g
and specified the --ddboost
option, you must manually restore the backup from the Data Domain system. The configuration files must be restored for the Greenplum Database master and all the hosts and segments. The backup location on the Data Domain system is the directory unit\_ID/``backup_directory``/
date. The backup_directory
is set when you specify the Data Domain credentials with gpcrondump
. The unit_ID is the Data Domain system storage unit that was used when you backed up the database.
This option is not supported if --netbackup-service-host
is specified.
-e (create target database before restore)
Creates the target database and then performs a restore operation. If the target database exists, drops the existing database before creating the database and performing a restore operation.
The utility restores database-level settings for the server configuration parameters gp_default_storage_options
, optimizer
, and search_path
.
-G [include | only] (restore global objects)
Restores database metadata information that is not associated with a specific schema or table, such as roles and tablespaces, if the global object dump file db_dumps/date/prefix\_string\_gp_global_-1_1_timestamp
is found in the master data directory. The global object file is created with the gpcrondump
option -G
.
- The keyword
include
restores global objects in addition to performing a restore. This is the default if no keyword is specified. - The keyword
only
restores only global objects. No other database objects or database table data are restored.
The -m
option restores metadata associated with schemas or tables.
-l logfile_directory
The directory to write the log file. Defaults to ~/gpAdminLogs
.
--list-backup
Lists the set of full and incremental backup sets required to perform a restore based on the timestamp_key specified with the -t
option and the location of the backup set.
This option is supported only if the timestamp_key is for an incremental backup.
-L (list tablenames in backup set)
When used with the -t
option, lists the table names that exist in the named backup set and exits. Does not perform a restore.
-m (restore metadata only)
Restores database metadata information such schema and table definitions and information created by SET
statements. This option does not restore database table data. All table and schema metadata is restored unless options are specified that include or exclude tables or schemas. If table or schema filters are specified, the utility restores the schema and table metadata only for the schemas and tables that are specified to be restored.
Database information that is not associated with a specific schema or table, such as roles and tablespaces, is not restored. You can specify the -G
option with this option to restore global metadata that was backed up with the gpcrondump
utility.
Database statistics are not restored. You can specify the --restore-stats
option to restore statistics that were backed up with the gpcrondump
utility.
Not supported with the --noplan
or --noanalyze
options.
--netbackup-block-size size
Specify the block size, in bytes, of data being transferred from the Veritas NetBackup server. The default is 512 bytes.
NetBackup options are not supported if DDBoost backup options are specified.
--netbackup-service-host netbackup_server
The NetBackup master server that Greenplum Database connects to when backing up to NetBackup. If you specify this option, you must specify the timestamp of the backup with the -t
option.
The maximum length for the this parameter is 127 characters.
This option is not supported with any of the these options: -R
, -s
, -b
, -L
, or --ddboost
.
NetBackup options are not supported if DDBoost backup options are specified.
--noanalyze
The ANALYZE
command is not run after a successful restore. The default is to run the ANALYZE
command on restored tables. This option is useful if running ANALYZE
on tables in your database requires a significant amount of time.
If this option is specified, you should run ANALYZE
manually on restored tables. Failure to run ANALYZE
following a restore might result in poor database performance.
Not supported with the -m
option.
--noplan
Restores only the data backed up during the incremental backup specified by the timestamp_key. No other data from the complete backup set are restored. The full backup set containing the incremental backup must be available.
The --noplan
option does not truncate existing tables prior to restoration. To avoid loading duplicate data into existing tables, you can truncate the tables before performing an incremental restore with the --noplan
option.
If the timestamp_key specified with the -t
option does not reference an incremental backup, an error is returned.
Not supported with the -m
option.
--prefix prefix_string
If you specified the gpcrondump
option --prefix prefix\_string
to create the backup, you must specify this option with the prefix_string when restoring the backup.
If you created a full backup of a set of tables with gpcrondump
and specified a prefix, you can use gpcrondump
with the options --list-filter-tables
and --prefix prefix\_string
to list the tables that were included or excluded for the backup.
-q (no screen output)
Run in quiet mode. Command output is not displayed on the screen, but is still written to the log file.
-R hostname:path_to_dumpset
Allows you to provide a hostname and full path to a set of dump files. The host does not have to be in the Greenplum Database array of hosts, but must be accessible from the Greenplum master.
--redirect database_name
Specify the name of the database where the data is restored. Specify this option to restore data to a database that is different than the database specified during back up. If database_name does not exist, it is created.
--report-status-dir report_directory
Specifies the absolute path to the directory on the each Greenplum Database host (master and segment hosts) where gpdbrestore
writes report status files for a restore operation. If report_directory does not exist or is not writable, gpdbrestore
returns an error and stops.
If this option is not specified and the -u
option is specified, report status files are written to the location specified by the -u
option if the -u
location is writable. If the location specified by -u
option is not writable, the report status files are written to segment data directories.
--restore-stats [include | only]
Specify this option to restore database statistics that were backed up with the gpcrondump
utility option --dump-stats
.
- The keyword
include
restores the statistics that were backed up in addition to performing a restore. This is the default if no keyword is specified. - The keyword
only
restores only the statistics that were backed up. No other database objects or database data are restored.
If this option is specified with other options that include or exclude tables or schemas to restore, the utility restores statistics only for the tables specified to be restored.
If statistics would be restored for a table that does not exist in the database, the utility displays a warning. The statistics are not restored.
-s database_name
Looks for latest set of dump files for the given database name in the segment data directories db_dumps
directory on the Greenplum Database array of hosts.
-S schema_name
Restore all the tables, views, indexes, functions, and sequences qualified by the specified schema from the backup. The -S
option can be specified multiple times. System catalog schemas are not supported. The schema name must exist in the backup set of the database being restored. To replace the data in the schema tables with the data from backup, you can specify the --truncate
option. The schema tables are truncated before the data is restored.
To support functions that might be restored, all procedural languages that are in the backup are also restored even though they are not schema specific. External items such as shared libraries that are used by a language are not included in a backup.
The -S
option cannot be specified with the --change-schema
option.
-t timestamp_key
The 14 digit timestamp key that uniquely identifies a backup set of data to restore. It is of the form YYYYMMDDHHMMSS
. Looks for dump files matching this timestamp key in the segment data directories db_dumps
directory on the Greenplum Database array of hosts.
-T schema.table_name
The name of a table to restore. The -T
option can be specified multiple times. The named table(s) must exist in the backup set of the database being restored. Existing tables are not automatically truncated before data is restored from backup. To replace existing data in the table from backup, you can specify the --truncate
option.
Wildcard characters are not supported.
--table-file file_name
Specify a file file_name that contains a list of table names to restore. The file contains any number of table names, listed one per line. See the -T
option for information about restoring specific tables.
--truncate
Truncate table data before restoring data to the table from the backup. If this option is not specified, existing table data is not removed before data is restored to the table.
This option is supported only when restoring a set of tables with the option -S
, -T
or --table-file
. If a table to be restored does not exist in the database, the table is restored and the utility returns a warning message stating that the table did not exist in the database.
This option is not supported with the -e
option.
-u backup_directory
Specifies the absolute path to the directory containing the db_dumps
directory on each host. If not specified, defaults to the data directory of each instance to be backed up. Specify this option if you specified a backup directory with the gpcrondump
option -u
when creating a backup set.
If backup_directory is not writable, backup operation report status files are written to segment data directories. You can specify a different location where report status files are written with the --report-status-dir option
.
Note: This option is not supported if --ddboost
is specified.
-v | —verbose
Specifies verbose mode.
--version (show utility version)
Displays the version of this utility.
-? (help)
Displays the online help.
Examples
Restore the sales
database from the latest backup files generated by gpcrondump
(assumes backup files are in the segment data directories in db_dumps
):
gpdbrestore -s sales
Restore a database from backup files that reside on an archive host outside the Greenplum Database array (command issued on the Greenplum master host):
gpdbrestore -R archivehostname:/data_p1/db_dumps/20160214
Restore global objects only (roles and tablespaces):
gpdbrestore -G
Note: The -R
option is not supported when restoring a backup set that includes incremental backups.
If you restore from a backup set that contains an incremental backup, all the files in the backup set must be available to gpdbrestore
. For example, the following timestamp keys specify a backup set. 20160514054532 is the full backup and the others are incremental.
20160514054532
20160714095512
20150914081205
20161114064330
20170114051246
The following gbdbrestore
command specifies the timestamp key 20161114064330. The incremental backup with the timestamps 20160714095512 and 20160914081205 and the full backup must be available to perform a restore.
gpdbrestore -t 20161114064330
The following gbdbrestore
command uses the --noplan
option to restore only the data that was backed up during the incremental backup with the timestamp key 20161114064330. Data in the previous incremental backups and the data in the full backup are not restored. The --noplan
option does not truncate existing tables prior to restoration. To avoid loading duplicate data into existing tables, you can truncate the tables before performing an incremental restore with the --noplan
option.
gpdbrestore -t 20161114064330 --noplan
This gpdbrestore
command restores Greenplum Database data from the data managed by NetBackup master server nbu_server1
. The option -t 20170530090000
specifies the timestamp generated by gpcrondump
when the backup was created. The -e
option specifies that the target database is dropped before it is restored.
gpdbrestore -t 20170530090000 -e --netbackup-service-host=nbu_server1