A transaction is an operation that contains one or more SQL statements. The execution of these statements must either be completely successful or completely fail. It is an indivisible work unit.

Explicit and Implicit Transactions

Explicit Transactions

Explicit transactions require users to actively start, commit, or roll back transactions. Doris provides two types of explicit transactions:

  1. The transaction write method introduced in this document :

    1. BEGIN;
    2. [INSERT, UPDATE, DELETE statement]
    3. COMMIT; / ROLLBACK;
  2. Stream Load 2PC

Implicit Transactions

Implicit transactions refer to SQL statements that are executed without explicitly adding statements to start and commit transactions before and after the statements.

In Doris, except for Group Commit, each import statement opens a transaction when it starts executing. The transaction is automatically committed after the statement is executed, or automatically rolled back if the statement fails. For more information, see Transaction Load.

Transaction Operations

Start a Transaction

  1. BEGIN;
  2. BEGIN WITH LABEL {user_label};

If this statement is executed while the current session is in the middle of a transaction, Doris will ignore the statement, which can also be understood as transactions cannot be nested.

Commit a Transaction

  1. COMMIT;

Used to commit all modifications made in the current transaction.

Rollback a Transaction

  1. ROLLBACK;

Used to roll back all modifications made in the current transaction.

Transactions are session-level, so if a session is terminated or closed, the transaction will automatically be rolled back.

Transaction Load

Currently, Doris supports two ways of transaction loading.

Multiple INSERT INTO VALUES for one table

Suppose the table schema is:

  1. CREATE TABLE `dt` (
  2. `id` INT(11) NOT NULL,
  3. `name` VARCHAR(50) NULL,
  4. `score` INT(11) NULL
  5. ) ENGINE=OLAP
  6. UNIQUE KEY(`id`)
  7. DISTRIBUTED BY HASH(`id`) BUCKETS 1
  8. PROPERTIES (
  9. "replication_num" = "1"
  10. );

Do transaction load:

  1. mysql> BEGIN;
  2. Query OK, 0 rows affected (0.01 sec)
  3. {'label':'txn_insert_b55db21aad7451b-b5b6c339704920c5', 'status':'PREPARE', 'txnId':''}
  4. mysql> INSERT INTO dt (id, name, score) VALUES (1, "Emily", 25), (2, "Benjamin", 35), (3, "Olivia", 28), (4, "Alexander", 60), (5, "Ava", 17);
  5. Query OK, 5 rows affected (0.08 sec)
  6. {'label':'txn_insert_b55db21aad7451b-b5b6c339704920c5', 'status':'PREPARE', 'txnId':'10013'}
  7. mysql> INSERT INTO dt VALUES (6, "William", 69), (7, "Sophia", 32), (8, "James", 64), (9, "Emma", 37), (10, "Liam", 64);
  8. Query OK, 5 rows affected (0.00 sec)
  9. {'label':'txn_insert_b55db21aad7451b-b5b6c339704920c5', 'status':'PREPARE', 'txnId':'10013'}
  10. mysql> COMMIT;
  11. Query OK, 0 rows affected (1.02 sec)
  12. {'label':'txn_insert_b55db21aad7451b-b5b6c339704920c5', 'status':'VISIBLE', 'txnId':'10013'}

This method not only achieves atomicity, but also in Doris, it enhances the writing performance of INSERT INTO VALUES.

If user enables Group Commit and transaction insert at the same time, the transaction insert will work.

See Insert Into for more details.

Multiple INSERT INTO SELECT, UPDATE, DELETE for multiple tables

Suppose there are 3 tables: dt1, dt2, dt3, with the same schema as above, and the data in the tables are:

  1. mysql> SELECT * FROM dt1;
  2. +------+-----------+-------+
  3. | id | name | score |
  4. +------+-----------+-------+
  5. | 1 | Emily | 25 |
  6. | 2 | Benjamin | 35 |
  7. | 3 | Olivia | 28 |
  8. | 4 | Alexander | 60 |
  9. | 5 | Ava | 17 |
  10. +------+-----------+-------+
  11. 5 rows in set (0.04 sec)
  12. mysql> SELECT * FROM dt2;
  13. +------+---------+-------+
  14. | id | name | score |
  15. +------+---------+-------+
  16. | 6 | William | 69 |
  17. | 7 | Sophia | 32 |
  18. | 8 | James | 64 |
  19. | 9 | Emma | 37 |
  20. | 10 | Liam | 64 |
  21. +------+---------+-------+
  22. 5 rows in set (0.03 sec)
  23. mysql> SELECT * FROM dt3;
  24. Empty set (0.03 sec)

Do transaction load, write the data from dt1 and dt2 to dt3, and update the scores in dt1 and delete the data in dt2:

  1. mysql> BEGIN;
  2. Query OK, 0 rows affected (0.00 sec)
  3. {'label':'txn_insert_442a6311f6c541ae-b57d7f00fa5db028', 'status':'PREPARE', 'txnId':''}
  4. # 导入任务的状态是 PREPARE
  5. mysql> INSERT INTO dt3 SELECT * FROM dt1;
  6. Query OK, 5 rows affected (0.07 sec)
  7. {'label':'txn_insert_442a6311f6c541ae-b57d7f00fa5db028', 'status':'PREPARE', 'txnId':'11024'}
  8. mysql> INSERT INTO dt3 SELECT * FROM dt2;
  9. Query OK, 5 rows affected (0.08 sec)
  10. {'label':'txn_insert_442a6311f6c541ae-b57d7f00fa5db028', 'status':'PREPARE', 'txnId':'11025'}
  11. mysql> UPDATE dt1 SET score = score + 10 WHERE id >= 4;
  12. Query OK, 2 rows affected (0.07 sec)
  13. {'label':'txn_insert_442a6311f6c541ae-b57d7f00fa5db028', 'status':'PREPARE', 'txnId':'11026'}
  14. mysql> DELETE FROM dt2 WHERE id >= 9;
  15. Query OK, 0 rows affected (0.01 sec)
  16. {'label':'txn_insert_442a6311f6c541ae-b57d7f00fa5db028', 'status':'PREPARE', 'txnId':'11027'}
  17. mysql> COMMIT;
  18. Query OK, 0 rows affected (0.03 sec)
  19. {'label':'txn_insert_442a6311f6c541ae-b57d7f00fa5db028', 'status':'VISIBLE', 'txnId':'11024'}

Select data:

  1. # the score column of id >= 4 records is updated
  2. mysql> SELECT * FROM dt1;
  3. +------+-----------+-------+
  4. | id | name | score |
  5. +------+-----------+-------+
  6. | 1 | Emily | 25 |
  7. | 2 | Benjamin | 35 |
  8. | 3 | Olivia | 28 |
  9. | 4 | Alexander | 70 |
  10. | 5 | Ava | 27 |
  11. +------+-----------+-------+
  12. 5 rows in set (0.01 sec)
  13. # the records of id >= 9 are deleted
  14. mysql> SELECT * FROM dt2;
  15. +------+---------+-------+
  16. | id | name | score |
  17. +------+---------+-------+
  18. | 6 | William | 69 |
  19. | 7 | Sophia | 32 |
  20. | 8 | James | 64 |
  21. +------+---------+-------+
  22. 3 rows in set (0.02 sec)
  23. # the data of dt1 and dt2 is written to dt3
  24. mysql> SELECT * FROM dt3;
  25. +------+-----------+-------+
  26. | id | name | score |
  27. +------+-----------+-------+
  28. | 1 | Emily | 25 |
  29. | 2 | Benjamin | 35 |
  30. | 3 | Olivia | 28 |
  31. | 4 | Alexander | 60 |
  32. | 5 | Ava | 17 |
  33. | 6 | William | 69 |
  34. | 7 | Sophia | 32 |
  35. | 8 | James | 64 |
  36. | 9 | Emma | 37 |
  37. | 10 | Liam | 64 |
  38. +------+-----------+-------+
  39. 10 rows in set (0.01 sec)

Isolation Level

Doris provides the READ COMMITTED isolation level. Please note the following:

  • In a transaction, each statement reads the data that was committed at the time the statement began executing:

    1. timestamp | ------------ Session 1 ------------ | ------------ Session 2 ------------
    2. t1 | BEGIN; |
    3. t2 | # read n rows from dt1 table |
    4. | INSERT INTO dt3 SELECT * FROM dt1; |
    5. t3 | | # write 2 rows to dt1 table
    6. | | INSERT INTO dt1 VALUES(...), (...);
    7. t4 | # read n + 2 rows FROM dt1 table |
    8. | INSERT INTO dt3 SELECT * FROM dt1; |
    9. t5 | COMMIT; |
  • In a transaction, each statement cannot read the modifications made by other statements within the same transactio:

    Suppose dt1 has 5 rows, dt2 has 5 rows, dt3 has 0 rows. And execute the following SQL:

    1. BEGIN;
    2. # write 5 rows to dt2,
    3. INSERT INTO dt2 SELECT * FROM dt1;
    4. # write 5 rows to dt3, and cannot read the new data written to dt2 in the previous step
    5. INSERT INTO dt3 SELECT * FROM dt2;
    6. COMMIT;

    One example:

    1. # create table and insert data
    2. CREATE TABLE `dt1` (
    3. `id` INT(11) NOT NULL,
    4. `name` VARCHAR(50) NULL,
    5. `score` INT(11) NULL
    6. ) ENGINE=OLAP
    7. DUPLICATE KEY(`id`)
    8. DISTRIBUTED BY HASH(`id`) BUCKETS 1
    9. PROPERTIES (
    10. "replication_num" = "1"
    11. );
    12. CREATE TABLE dt2 LIKE dt1;
    13. CREATE TABLE dt3 LIKE dt1;
    14. INSERT INTO dt1 VALUES (1, "Emily", 25), (2, "Benjamin", 35), (3, "Olivia", 28), (4, "Alexander", 60), (5, "Ava", 17);
    15. INSERT INTO dt2 VALUES (6, "William", 69), (7, "Sophia", 32), (8, "James", 64), (9, "Emma", 37), (10, "Liam", 64);
    16. # Do transaction write
    17. BEGIN;
    18. INSERT INTO dt2 SELECT * FROM dt1;
    19. INSERT INTO dt3 SELECT * FROM dt2;
    20. COMMIT;
    21. # Select data
    22. mysql> SELECT * FROM dt2;
    23. +------+-----------+-------+
    24. | id | name | score |
    25. +------+-----------+-------+
    26. | 6 | William | 69 |
    27. | 7 | Sophia | 32 |
    28. | 8 | James | 64 |
    29. | 9 | Emma | 37 |
    30. | 10 | Liam | 64 |
    31. | 1 | Emily | 25 |
    32. | 2 | Benjamin | 35 |
    33. | 3 | Olivia | 28 |
    34. | 4 | Alexander | 60 |
    35. | 5 | Ava | 17 |
    36. +------+-----------+-------+
    37. 10 rows in set (0.01 sec)
    38. mysql> SELECT * FROM dt3;
    39. +------+---------+-------+
    40. | id | name | score |
    41. +------+---------+-------+
    42. | 6 | William | 69 |
    43. | 7 | Sophia | 32 |
    44. | 8 | James | 64 |
    45. | 9 | Emma | 37 |
    46. | 10 | Liam | 64 |
    47. +------+---------+-------+
    48. 5 rows in set (0.01 sec)

Failed Statements Within a Transaction

When a statement within a transaction fails, that operation is rolled back. However, other statements within the transaction that have executed successfully are still able to either commit or rollback. Once the transaction is successfully committed, the modifications made by the successfully executed statements within the transaction are applied.

One example:

  1. mysql> BEGIN;
  2. Query OK, 0 rows affected (0.00 sec)
  3. {'label':'txn_insert_c5940d31bf364f57-a48b628886415442', 'status':'PREPARE', 'txnId':''}
  4. mysql> INSERT INTO dt3 SELECT * FROM dt1;
  5. Query OK, 5 rows affected (0.07 sec)
  6. {'label':'txn_insert_c5940d31bf364f57-a48b628886415442', 'status':'PREPARE', 'txnId':'11058'}
  7. # The failed insert is rolled back
  8. mysql> INSERT INTO dt3 SELECT * FROM dt2;
  9. ERROR 5025 (HY000): Insert has filtered data in strict mode, tracking_url=http://172.21.16.12:9082/api/_load_error_log?file=__shard_3/error_log_insert_stmt_3d1fed266ce443f2-b54d2609c2ea6b11_3d1fed266ce443f2_b54d2609c2ea6b11
  10. mysql> INSERT INTO dt3 SELECT * FROM dt2 WHERE id = 7;
  11. Query OK, 0 rows affected (0.07 sec)
  12. mysql> COMMIT;
  13. Query OK, 0 rows affected (0.02 sec)
  14. {'label':'txn_insert_c5940d31bf364f57-a48b628886415442', 'status':'VISIBLE', 'txnId':'11058'}

Select data:

  1. # The data in dt1 is written to dt3, the data with id = 7 in dt2 is written successfully, and the other data is written failed
  2. mysql> SELECT * FROM dt3;
  3. +------+----------+-------+
  4. | id | name | score |
  5. +------+----------+-------+
  6. | 1 | Emily | 25 |
  7. | 2 | Benjamin | 35 |
  8. | 3 | Olivia | 28 |
  9. | 4 | Alexande | 60 |
  10. | 5 | Ava | 17 |
  11. | 7 | Sophia | 32 |
  12. +------+----------+-------+
  13. 6 rows in set (0.01 sec)

QA

  • Writing to multiple tables must belong to the same Database; otherwise, you will encounter the error Transaction insert must be in the same database

  • Mixing the two transaction load of INSERT INTO SELECT, UPDATE, DELETE and INSERT INTO VALUES is not allowed; otherwise, you will encounter the error Transaction insert can not insert into values and insert into select at the same time.

  • Delete Command supports delete by specifying a filter predicate or using clause, to guarantee the isolation, currently only support that, the delete operations must before the insert operations for one table in one transaction, otherwise, you will encounter the error Can not delete because there is a insert operation for the same table.

  • If the time-consuming from BEGIN statement exceeds the timeout configured in Doris, the transaction will be rolled back. Currently, the timeout uses the maximum value of session variables insert_timeout and query_timeout.

  • When using JDBC to connect to Doris for transaction operations, please add useLocalSessionState=true in the JDBC URL; otherwise, you may encounter the error This is in a transaction, only insert, update, delete, commit, rollback is acceptable.

  • In cloud mode, transaction load does not support merge on write unique tables, otherwise, you will encounter the error Transaction load is not supported for merge on write unique keys table in cloud mode.