Migrate data from MySQL to MatrixOne
This document will guide you on how to migrate data from MySQL to MatrixOne.
MatrixOne maintains a high degree of compatibility with MySQL syntax, so the syntax adjustments that need to be made during the migration process are minimal. Nevertheless, there are still some differences between MatrixOne and MySQL in terms of data types and data objects, which requires some adjustments in the migration process.
Data type difference
In MatrixOne, while maintaining the same name as MySQL, there are slight differences in accuracy and range between data types and MySQL. For more information, see Data Types.
Online Migration
This chapter will guide you to use third-party tools to migrate data from MySQL to MatrixOne.
- Applicable scenarios: scenarios where the amount of data is small (recommended less than 1GB), and the migration speed is not sensitive.
Preparation
- Springboard machine with a graphical interface: it can connect to the source of MySQL and the target of MatrixOne.
- Data Migration Tool: Download DBeaver on the springboard machine.
Step 1: Migrate table structure
Here we take the TPCH dataset as an example and migrate the 8 tables of the TPCH dataset from MySQL to MatrixOne.
Open DBeaver, select the table to be migrated from MySQL, right-click and select Generate SQL > DDL Click Copy, first copy this SQL to a text editor for text editing Name the filer as tpch_ddl.sql and save it locally on the springboard machine.
Use the following command to replace keywords not supported by MatrixOne in the tpch_ddl.sql file:
# The commands executed by the Linux system are as follows:
sed -i 's/ ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci//g' /YOUR_PATH/tpch_ddl.sql
# The commands executed by the MacOS system are as follows:
sed -i '' 's/ ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci//g' /YOUR_PATH/tpch_ddl.sql
Connect to MatrixOne and create a new database and table in MatrixOne:
create database tpch;
use tpch;
source '/YOUR_PATH/tpch_ddl.sql'
Step 2: Migrate data
Open DBeaver, select the table to be migrated from MySQL, right-click and select Export Data:
In the Conversion Target > Export Target window, select Database, click Next; in the Table Mapping window, select Target Container, and select the MatrixOne database for the target container tpch:
In the Extraction Settings and Data Loading Settings windows, set the number of selected extractions and inserts. To trigger MatrixOne’s direct write S3 strategy, it is recommended to fill in 5000:
After completing the settings, DBeaver starts to migrate the data, and after completion, DBeaver will prompt that the migration is successful.
Step 3: Check the data
After the migration is complete, the data can be inspected as follows:
Use
select count(*) from <table_name>
to confirm whether the data volume of the source database and target databases’ data volume is consistent.Compare the results through related queries; you can also refer to the Complete TPCH testing query example to compare the results.
Offline Migration
This chapter will guide you through importing to MatrixOne through offline files.
- Applicable scenarios: scenarios with a large amount of data (more significant than 1GB) and sensitive to migration speed.
Preparation
- Springboard machine with a graphical interface: it can be connected to the source end of MySQL and the target end of MatrixOne.
- Data Migration Tool: Download DBeaver to the springboard machine.
- Install
mysqldump
. If you are not familiar with how to usemysqldump
, see mysqldump tutorial
Step 1: Migrate table structure
Here we take the TPCH dataset as an example and migrate the 8 tables of the TPCH dataset from MySQL to MatrixOne.
Open DBeaver, select the table to be migrated from MySQL, right-click and select Generate SQL > DDL > Copy, first copy this SQL to a text editor, and name the text editor tpch_ddl.sql, saved locally on the springboard machine.
Use the following command to replace keywords that MatrixOne does not support in the tpch_ddl.sql file:
# The commands executed by the Linux system are as follows:
$ sed -i 's/ ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci//g' /YOUR_PATH/tpch_ddl.sql
# The commands executed by the MacOS system are as follows:
sed -i '' 's/ ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci//g' /YOUR_PATH/tpch_ddl.sql
Connect to MatrixOne and create a new database and table in MatrixOne:
create database tpch;
use tpch;
source '/YOUR_PATH/tpch_ddl.sql'
Step 2: Migrate data
MatrixOne has two data migration methods to choose from: INSERT
and LOAD DATA
. When the amount of data is greater than 1GB, it is recommended to use LOAD DATA
first, followed by INSERT
.
LOAD DATA
Use LOAD DATA
to export the MySQL data table to CSV format first, and use MatrixOne’s parallel LOAD function to migrate the data to MatrixOne:
Use
mysqldump
to export the MySQL data table as a CSV format file. Make sure you have write access to the file path and check thesecure_file_priv
configuration:mysqldump -u root -p -t -T /{filepath} tpch --fields-terminated-by='|'
Connect to MatrixOne and import the exported CSV data into MatrixOne:
mysql> load data infile '/{filepath}/lineitem.txt' INTO TABLE lineitem FIELDS TERMINATED BY '|' lines TERMINATED BY '\n' parallel 'true';
mysql> load data infile '/{filepath}/nation.txt' INTO TABLE nation FIELDS TERMINATED BY '|' lines TERMINATED BY '\n' parallel 'true';
mysql> load data infile '/{filepath}/part.txt' INTO TABLE part FIELDS TERMINATED BY '|' lines TERMINATED BY '\n' parallel 'true';
mysql> load data infile '/{filepath}/customer.txt' INTO TABLE customer FIELDS TERMINATED BY '|' lines TERMINATED BY '\n' parallel 'true';
mysql> load data infile '/{filepath}/orders.txt' INTO TABLE orders FIELDS TERMINATED BY '|' lines TERMINATED BY '\n' parallel 'true';
mysql> load data infile '/{filepath}/supplier.txt' INTO TABLE supplier FIELDS TERMINATED BY '|' lines TERMINATED BY '\n' parallel 'true';
mysql> load data infile '/{filepath}/region.txt' INTO TABLE region FIELDS TERMINATED BY '|' lines TERMINATED BY '\n' parallel 'true';
mysql> load data infile '/{filepath}/partsupp.txt' INTO TABLE partsupp FIELDS TERMINATED BY '|' lines TERMINATED BY '\n' parallel 'true';
For more operation examples of LOAD DATA
, see Bulk Load Overview.
INSERT
The INSERT
statement needs to use mysqldump
to export the logical statement first and then import it into MatrixOne:
Use
mysqldump
to export data. To ensure that MatrixOne directly writes to S3 when inserting, inserting as large a batch as possible is recommended. Thenet_buffer_length
parameter should start at 10MB:mysqldump -t tpch -uroot -p --net_buffer_length=10m > tpch_data.sql
On the MatrixOne side, execute the SQL file, there will be an error message during the process, but it will not affect the data insertion:
source '/YOUR_PATH/tpch_data.sql'
For more examples of INSERT
operations, see Insert Data.
Step 3: Check the data
After the migration is complete, the data can be inspected as follows:
Use
select count(*) from <table_name>
to confirm whether the data volume of the source database and target databases’ data volume is consistent.Compare the results through related queries; you can also refer to the Complete a TPCH Test with MatrixOne query example to compare the results.
Reference example
If you are a novice and want to migrate a small amount of data, see Import data by using the source command.