MySQL Load
SinceVersion dev
This is an stand syntax of MySQL LOAD DATA for user to load local file.
MySQL load synchronously executes the import and returns the import result. The return information will show whether the import is successful for user.
MySQL load is mainly suitable for importing local files on the client side, or importing data from a data stream through a program.
Basic Principles
The MySQL Load are similar with Stream Load. Both import local files into the Doris cluster, so the MySQL Load will reuses StreamLoad:
FE receives the MySQL Load request executed by the client and then analyse the SQL
FE build the MySQL Load request as a StreamLoad request.
FE selects a BE node to send a StreamLoad request
When sending the request, FE will read the local file data from the MySQL client side streamingly, and send it to the HTTP request of StreamLoad asynchronously.
After the data transfer on the MySQL client side is completed, FE waits for the StreamLoad to complete, and displays the import success or failure information to the client side.
Support data format
MySQL Load currently only supports data formats: CSV (text).
Basic operations
client connection
mysql --local-infile -h 127.0.0.1 -P 9030 -u root -D testdb
Notice that if you wants to use MySQL load, you must connect doris server with --local-infile
in client command. If you’re use jdbc to connect doris, you must add property named allowLoadLocalInfile=true
in jdbc url.
Create test table
CREATE TABLE testdb.t1 (pk INT, v1 INT SUM) AGGREGATE KEY (pk) DISTRIBUTED BY hash (pk) PROPERTIES ('replication_num' = '1');
import file from client node
Suppose there is a CSV file named ‘client_local.csv ‘on the current path of the client side, which will be imported into the test table’testdb.t1’ using the MySQL Load syntax.
LOAD DATA LOCAL
INFILE 'client_local.csv '
INTO TABLE testdb.t1
PARTITION (partition_a, partition_b, partition_c, partition_d)
COLUMNS TERMINATED BY '\ t'
LINES TERMINATED BY '\ n'
IGNORE 1 LINES
(K1, k2, v2, v10, v11)
SET (c1 = k1, c2 = k2, c3 = v10, c4 = v11)
PROPERTIES ("strict_mode" = "true")
- MySQL Load starts with the syntax
LOAD DATA
, and specifyingLOCAL
means reading client side files. - The local fill path will be filled after
INFILE
, which can be a relative path or an absolute path. Currently only a single file is supported, and multiple files are not supported - The table name after
INTO TABLE
can specify the database name, as shown in the case. It can also be omitted, and the current database for the user will be used. PARTITION
syntax supports specified partition to importCOLUMNS TERMINATED BY
specifies the column separatorLINES TERMINATED BY
specifies the line separatorIGNORE num LINES
skips the num header of the CSV.- Column mapping syntax, see the column mapping chapter of Imported Data Transformation for specific parameters
PROPERTIES
is the configuration of import, please refer to the MySQL Load command manual for specific properties.
import file from fe server node
Assuming that the ‘/root/server_local.csv’ path on the FE node is a CSV file, use the MySQL client side to connect to the corresponding FE node, and then execute the following command to import data into the test table.
LOAD DATA
INFILE '/root/server_local.csv'
INTO TABLE testdb.t1
PARTITION (partition_a, partition_b, partition_c, partition_d)
COLUMNS TERMINATED BY '\ t'
LINES TERMINATED BY '\ n'
IGNORE 1 LINES
(K1, k2, v2, v10, v11)
SET (c1 = k1, c2 = k2, c3 = v10, c4 = v11)
PROPERTIES ("strict_mode" = "true")
- The only difference between the syntax of importing server level local files and importing client side syntax is whether the’LOCAL ‘keyword is added after the’LOAD DATA’ keyword.
- FE will have multi-nodes, and importing server level files can only import FE nodes connected by the client side, and cannot import files local to other FE nodes.
- Server side load was disabled by default. Enable it by setting
mysql_load_server_secure_path
with a secure path. All the load file should be under this path.
Return result
Since MySQL Load is a synchronous import method, the imported results are returned to the user through SQL syntax. If the import fails, a specific error message will be displayed. If the import is successful, the number of imported rows will be displayed.
Query OK, 1 row affected (0.17 sec)
Records: 1 Deleted: 0 Skipped: 0 Warnings: 0
Error result
If MySQL Load process goes wrong, it will show the error in the client as below:
ERROR 1105 (HY000): errCode = 2, detailMessage = [DATA_QUALITY_ERROR]too many filtered rows with load id b612907c-ccf4-4ac2-82fe-107ece655f0f
If you meets this error, you can extract the loadId
and use it in the show load warnings
command to get more detail message.
show load warnings where label='b612907c-ccf4-4ac2-82fe-107ece655f0f';
The loadId was the label in this case.
Configuration
mysql_load_thread_pool
: the thread pool size for singe FE node, set 4 thread by default. The block queue size is 5 times ofmysql_load_thread_pool
. So FE can accept 4 + 4*5 = 24 requests in one time. Increase this configuration if the parallelism are larger than 24.mysql_load_server_secure_path
: the secure path for load data from server. Empty path by default means that it’s not allowed for server load. Recommend to create alocal_import_data
directory underDORIS_HOME
to load data if you want enable it.mysql_load_in_memory_record
The failed MySQL Load record size. The record was keep in memory and only have 20 records by default. If you want to track more records, you can rise the config but be careful about the fe memory. This record will expired after one day and there is a async thread to clean it in every day.
Notice
- If you see this
LOAD DATA LOCAL INFILE file request rejected due to restrictions on access
message, you should connet MySQL withmysql --local-infile=1
command to enable client to load local file. - The configuration for StreamLoad will also affect MySQL Load. Such as the configurate in be named
streaming_load_max_mb
, it’s 10GB by default and it will control the max size for one load.
More Help
- For more detailed syntax and best practices for using MySQL Load, see the MySQL Load command manual.