Migrate from Postgres

Migrate from Postgres

Warning:

This is a beta feature. It is currently undergoing continued testing. Please file a Github issue with us if you identify a bug.

This page has instructions for migrating data from Postgres to CockroachDB using IMPORT's support for reading pg_dump files.

The examples below pull real data from Amazon S3. They use the employees data set that is also used in the MySQL docs. The data was imported to Postgres using pgloader, and then modified for use here as explained below.

Step 1. Dump the Postgres database

There are several ways to dump data from Postgres to be imported into CockroachDB:

Dump the entire database
Dump one table at a time
The import will fail if the dump file contains functions or type definitions. In addition to calling pg_dump as shown below, you may need to edit the dump file to remove functions and data types.

Also, note that CockroachDB's IMPORT does not support automatically importing data from Postgres' non-public schemas. As a workaround, you can edit the dump file to change the table and schema names in the CREATE TABLE statements.

Dump the entire database

Most users will want to import their entire Postgres database all at once, as shown below in Import a full database dump.

To dump the entire database, run the pg_dump command shown below.

$ pg_dump employees > /tmp/employees-full.sql

For this data set, the Postgres dump file required the following edits, which have already been performed on the files used in the examples below:

The type of the employees.gender column in the CREATE TABLE statement had to be changed from employees.employees_gender to STRING since Postgres represented the employee's gender using a CREATE TYPE statement that is not supported by CockroachDB.
A CREATE TYPE employee … statement needed to be removed.

If you only want to import one table from a database dump, see Import a table from a full database dump below.

Dump one table at a time

To dump the employees table from a Postgres database also named employees, run the pg_dump command shown below. You can import this table using the instructions in Import a table from a table dump below.

$ pg_dump -t employees  employees > /tmp/employees.sql

For this data set, the Postgres dump file required the following edits, which have already been performed on the files used in the examples below.

The type of the employees.gender column in the CREATE TABLE statement had to be changed from employees.employees_gender to STRING since Postgres represented the employee's gender using a CREATE TYPE statement that is not supported by CockroachDB.

Step 2. Host the files where the cluster can access them

Each node in the CockroachDB cluster needs to have access to the files being imported. There are several ways for the cluster to access the data; for a complete list of the types of storage IMPORT can pull from, see Import File URLs.

Tip:

We strongly recommend using cloud storage such as Amazon S3 or Google Cloud to host the data files you want to import.

Step 3. Import the Postgres dump file

You can choose from several variants of the IMPORT statement, depending on whether you want to import a full database or a single table:

Import a full database dump
Import a table from a full database dump
Import a table from a table dump
Note that all of the IMPORT statements in this section pull real data from Amazon S3 and will kick off background import jobs that you can monitor with SHOW JOBS.

Import a full database dump

This example assumes you dumped the entire database.

The IMPORT statement below reads the data and DDL statements (including existing foreign key relationships) from the full database dump file.

> IMPORT PGDUMP 'https://s3-us-west-1.amazonaws.com/cockroachdb-movr/datasets/employees-db/pg_dump/employees-full.sql.gz';

       job_id       |  status   | fraction_completed |  rows  | index_entries | system_records |  bytes
--------------------+-----------+--------------------+--------+---------------+----------------+----------
 381845110403104769 | succeeded |                  1 | 300024 |             0 |              0 | 11534293
(1 row)

Import a table from a full database dump

This example assumes you dumped the entire database.

IMPORT can import one table's data from a full database dump. It reads the data and applies any CREATE TABLE statements from the file.

> CREATE DATABASE IF NOT EXISTS employees;
> USE employees;
> IMPORT TABLE employees FROM PGDUMP 'https://s3-us-west-1.amazonaws.com/cockroachdb-movr/datasets/employees-db/pg_dump/employees-full.sql.gz';

       job_id       |  status   | fraction_completed |  rows  | index_entries | system_records |  bytes
--------------------+-----------+--------------------+--------+---------------+----------------+----------
 383839294913871873 | succeeded |                  1 | 300024 |             0 |              0 | 11534293
(1 row)

Import a table from a table dump

The examples below assume you dumped one table.

The simplest way to import a table dump is to run IMPORT TABLE as shown below. It reads the table data and any CREATE TABLE statements from the file.

> CREATE DATABASE IF NOT EXISTS employees;
> USE employees;
> IMPORT PGDUMP 'https://s3-us-west-1.amazonaws.com/cockroachdb-movr/datasets/employees-db/pg_dump/employees.sql.gz';

       job_id       |  status   | fraction_completed |  rows  | index_entries | system_records |  bytes   
--------------------+-----------+--------------------+--------+---------------+----------------+----------
 383855569817436161 | succeeded |                  1 | 300024 |             0 |              0 | 11534293
(1 row)

If you need to specify the table's columns for some reason, you can use an IMPORT TABLE statement like the one below, which will import data but ignore any CREATE TABLE statements in the file, instead relying on the columns you specify.

> IMPORT TABLE employees (
    emp_no INT PRIMARY KEY,
    birth_date DATE NOT NULL,
    first_name STRING NOT NULL,
    last_name STRING NOT NULL,
    gender STRING NOT NULL,
    hire_date DATE NOT NULL
  )
  PGDUMP DATA ('https://s3-us-west-1.amazonaws.com/cockroachdb-movr/datasets/employees-db/pg_dump/employees.sql.gz');

Configuration Options

The following options are available to IMPORT … PGDUMP:

Max row size

The max_row_size option is used to override limits on line size. Default: 0.5MB. This setting may need to be tweaked if your Postgres dump file has extremely long lines, for example as part of a COPY statement.

Example usage:

> IMPORT TABLE employees (
    emp_no INT PRIMARY KEY,
    birth_date DATE NOT NULL,
    first_name STRING NOT NULL,
    last_name STRING NOT NULL,
    gender STRING NOT NULL,
    hire_date DATE NOT NULL
  )
  PGDUMP DATA ('s3://your-external-storage/employees.sql?AWS_ACCESS_KEY_ID=123&AWS_SECRET_ACCESS_KEY=456') WITH max_row_size = '5MB';

Skip foreign keys

By default, IMPORT … PGDUMP supports foreign keys. Default: false. Add the skip_foreign_keys option to speed up data import by ignoring foreign key constraints in the dump file's DDL. It will also enable you to import individual tables that would otherwise fail due to dependencies on other tables.

Note:

The most common dependency issues are caused by unsatisfied foreign key relationships. You can avoid these issues by adding the skip_foreign_keys option to your IMPORT statement as needed. For more information, see the list of import options.

For example, if you get the error message pq: there is no unique constraint matching given keys for referenced table tablename, use IMPORT … WITH skip_foreign_keys.

Example usage:

> IMPORT TABLE employees (
    emp_no INTEGER PRIMARY KEY,
    birth_date DATE NOT NULL,
    first_name STRING NOT NULL,
    last_name STRING NOT NULL,
    gender STRING NOT NULL,
    hire_date DATE NOT NULL
  ) PGDUMP DATA ('s3://your-external-storage/employees.sql?AWS_ACCESS_KEY_ID=123&AWS_SECRET_ACCESS_KEY=456') WITH skip_foreign_keys;

Foreign key constraints can be added by using ALTER TABLE … ADD CONSTRAINT commands after importing the data.

Migrate from Postgres

Migrate from Postgres

Step 1. Dump the Postgres database

Dump the entire database

Dump one table at a time

Step 2. Host the files where the cluster can access them

Step 3. Import the Postgres dump file

Import a full database dump

Import a table from a full database dump

Import a table from a table dump

Configuration Options

Max row size

Skip foreign keys

See also