PolarDB MySQL 冷数据DDL优化

Author: yifei

过去,PolarDB 在处理归档的冷数据时,遇到的主要挑战之一是对数据定义语言(DDL)操作的兼容性和效率问题,尤其是在执行诸如增加列、调整列宽等操作时。由于这些操作通常要求数据重写,通过 COPY 机制来实现,导致了极高的资源消耗和执行成本。这一局限性在持续归档至 OSS 外表的场景中尤为突出,给数据管理和运维工作带来了不便。 为应对这一挑战,我们引入了 OSS META 功能。该功能通过增强对冷数据的元数据管理能力,使得特定的DDL操作能够以即时(INSTANT)模式执行,从根本上优化了冷数据处理的灵活性和效率。这意味着,在 OSS META 功能启用状态下,系统能够智能识别并应用DDL变更,而无需经历传统意义上的全量数据重写流程。下面举一个例子详细介绍:

分区归档至外表

例如:客户有一张订单分区表,以订单创建时间作为分区键,每年一个分区。假设超过5年后,我们认为该订单被查询的几率不大,可以归档至 OSS 冷数据。其订单表结构如下:

  1. MySQL> show create table Orders;
  2. +--------+----------------------------------------------------------------------------------------------+
  3. | Table | Create Table |
  4. +--------+----------------------------------------------------------------------------------------------+
  5. | Orders | CREATE TABLE `Orders` ( |
  6. | | `order_id` int(11) NOT NULL AUTO_INCREMENT, |
  7. | | `customer_id` int(11) DEFAULT NULL, |
  8. | | `product_id` int(11) DEFAULT NULL, |
  9. | | `order_amount` decimal(10,2) DEFAULT NULL, |
  10. | | `create_time` datetime NOT NULL, |
  11. | | PRIMARY KEY (`order_id`,`create_time`) |
  12. | | ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
  13. | | /*!50500 PARTITION BY RANGE COLUMNS(create_time) */ /*!99990 800020200 INTERVAL(YEAR, 1) */ |
  14. | | /*!50500 (PARTITION p2011 VALUES LESS THAN ('2012-01-01') ENGINE = InnoDB, |
  15. | | PARTITION p2012 VALUES LESS THAN ('2013-01-01') ENGINE = InnoDB, |
  16. | | PARTITION p2013 VALUES LESS THAN ('2014-01-01') ENGINE = InnoDB) */ |
  17. +--------+----------------------------------------------------------------------------------------------+

该表上有 INTERVAL 属性,可以随着表数据的增加,自动创建新的分区。(关于 INTERVAL 功能的介绍,请参考:PolarDB MySQL INTERVAL RANGE分区)。随着数据上升,假设客户插入的数据已经到了2016年,则该表的定义会变为:

  1. MySQL> show create table Orders;
  2. +--------+----------------------------------------------------------------------------------------------+
  3. | Table | Create Table |
  4. +--------+----------------------------------------------------------------------------------------------+
  5. | Orders | CREATE TABLE `Orders` ( |
  6. | | `order_id` int(11) NOT NULL AUTO_INCREMENT, |
  7. | | `customer_id` int(11) DEFAULT NULL, |
  8. | | `product_id` int(11) DEFAULT NULL, |
  9. | | `order_amount` decimal(10,2) DEFAULT NULL, |
  10. | | `create_time` datetime NOT NULL, |
  11. | | PRIMARY KEY (`order_id`,`create_time`) |
  12. | | ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
  13. | | /*!50500 PARTITION BY RANGE COLUMNS(create_time) */ /*!99990 800020200 INTERVAL(YEAR, 1) */ |
  14. | | /*!50500 (PARTITION p2011 VALUES LESS THAN ('2012-01-01') ENGINE = InnoDB, |
  15. | | PARTITION p2012 VALUES LESS THAN ('2013-01-01') ENGINE = InnoDB, |
  16. | | PARTITION p2013 VALUES LESS THAN ('2014-01-01') ENGINE = InnoDB, |
  17. | | PARTITION _p20150101000000 VALUES LESS THAN ('2015-01-01 00:00:00') ENGINE = InnoDB, |
  18. | | PARTITION _p20160101000000 VALUES LESS THAN ('2016-01-01 00:00:00') ENGINE = InnoDB, |
  19. | | PARTITION _p20170101000000 VALUES LESS THAN ('2017-01-01 00:00:00') ENGINE = InnoDB) */ |
  20. +--------+----------------------------------------------------------------------------------------------+

此时,由于分区 p2011 的数据距离最新的分区已经超过五年,出于降本的需求,我们将该分区的数据转到另一张冷存表上。首先,打开 OSS META 开关,通过

  1. MySQL> set use_oss_meta = ON;

或者通过控制台设置参数 loose_use_oss_meta = ON ,然后重新连接实例。

随后通过以下存储过程将分区归档至表: > 该功能将会在 8.0.2.2.25 版本上线,具体请参考官网公告。

  1. MySQL> call dbms_dlm.archive_partition2table('test', 'Orders', 'p2011', 'test','Orders_archive', '');

这个存储过程会把当前表 test.Orders 表的分区 p2011 归档到 OSS 外表 test.Orders_archive,如果没有表test.Orders_archive,则会自动创建一个。归档完成后,看下当前数据库中的所有表,如下:

  1. MySQL> show tables;
  2. +-----------------------------+
  3. | Tables_in_test |
  4. +-----------------------------+
  5. | Orders |
  6. | Orders_archive |
  7. +-----------------------------+
  8. 2 rows in set
  9. Time: 0.020s

这时已经自动创建好了表 Orders_archive ,表结构和 Orders 基本一致,其建表语句为:

  1. MySQL> show create table Orders_archive;
  2. +----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  3. | Table | Create Table |
  4. +----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  5. | Orders_archive | CREATE TABLE `Orders_archive` ( |
  6. | | `order_id` int(11) NOT NULL DEFAULT '0', |
  7. | | `customer_id` int(11) DEFAULT NULL, |
  8. | | `product_id` int(11) DEFAULT NULL, |
  9. | | `order_amount` decimal(10,2) DEFAULT NULL, |
  10. | | `create_time` datetime NOT NULL |
  11. | | ) /*!99990 800020213 STORAGE OSS */ ENGINE=CSV DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci /*!99990 800020204 NULL_MARKER='NULL' */ /*!99990 800020223 OSS META=1 */ /*!99990 800020224 OSS_FILE_FILTER='order_id,create_time' */ |
  12. +----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  13. 1 row in set
  14. Time: 0.006s

可以看到表上带有 OSS META=1 的标记,说明当前表已经开启了 OSS META 功能。查询该表的数据,可以发现 p2011 分区的数据已经在表 Orders_archive 中了,并且 Orders 表中,分区 p2011 也被删除了。

  1. MySQL> show create table Orders;
  2. +--------+----------------------------------------------------------------------------------------------+
  3. | Table | Create Table |
  4. +--------+----------------------------------------------------------------------------------------------+
  5. | Orders | CREATE TABLE `Orders` ( |
  6. | | `order_id` int(11) NOT NULL AUTO_INCREMENT, |
  7. | | `customer_id` int(11) DEFAULT NULL, |
  8. | | `product_id` int(11) DEFAULT NULL, |
  9. | | `order_amount` decimal(10,2) DEFAULT NULL, |
  10. | | `create_time` datetime NOT NULL, |
  11. | | PRIMARY KEY (`order_id`,`create_time`) |
  12. | | ) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
  13. | | /*!50500 PARTITION BY RANGE COLUMNS(create_time) */ /*!99990 800020200 INTERVAL(YEAR, 1) */ |
  14. | | /*!50500 (PARTITION p2012 VALUES LESS THAN ('2013-01-01') ENGINE = InnoDB, |
  15. | | PARTITION p2013 VALUES LESS THAN ('2014-01-01') ENGINE = InnoDB, |
  16. | | PARTITION _p20150101000000 VALUES LESS THAN ('2015-01-01 00:00:00') ENGINE = InnoDB, |
  17. | | PARTITION _p20160101000000 VALUES LESS THAN ('2016-01-01 00:00:00') ENGINE = InnoDB, |
  18. | | PARTITION _p20170101000000 VALUES LESS THAN ('2017-01-01 00:00:00') ENGINE = InnoDB) */ |
  19. +--------+----------------------------------------------------------------------------------------------+
  20. 1 row in set
  21. Time: 0.028s

分区数据持续归档

随后,Orders 表随着数据持续的插入,可能会有更多的分区,例如:

  1. MySQL> show create table Orders;
  2. +--------+----------------------------------------------------------------------------------------------+
  3. | Table | Create Table |
  4. +--------+----------------------------------------------------------------------------------------------+
  5. | Orders | CREATE TABLE `Orders` ( |
  6. | | `order_id` int(11) NOT NULL AUTO_INCREMENT, |
  7. | | `customer_id` int(11) DEFAULT NULL, |
  8. | | `product_id` int(11) DEFAULT NULL, |
  9. | | `order_amount` decimal(10,2) DEFAULT NULL, |
  10. | | `create_time` datetime NOT NULL, |
  11. | | PRIMARY KEY (`order_id`,`create_time`) |
  12. | | ) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
  13. | | /*!50500 PARTITION BY RANGE COLUMNS(create_time) */ /*!99990 800020200 INTERVAL(YEAR, 1) */ |
  14. | | /*!50500 (PARTITION p2012 VALUES LESS THAN ('2013-01-01') ENGINE = InnoDB, |
  15. | | PARTITION p2013 VALUES LESS THAN ('2014-01-01') ENGINE = InnoDB, |
  16. | | PARTITION _p20150101000000 VALUES LESS THAN ('2015-01-01 00:00:00') ENGINE = InnoDB, |
  17. | | PARTITION _p20160101000000 VALUES LESS THAN ('2016-01-01 00:00:00') ENGINE = InnoDB, |
  18. | | PARTITION _p20170101000000 VALUES LESS THAN ('2017-01-01 00:00:00') ENGINE = InnoDB, |
  19. | | PARTITION _p20180101000000 VALUES LESS THAN ('2018-01-01 00:00:00') ENGINE = InnoDB) */ |
  20. +--------+----------------------------------------------------------------------------------------------+
  21. 1 row in set
  22. Time: 0.006s

此时按照保留5个分区的策略,可以继续把分区 p2012 的数据归档至刚刚的 Orders_archive 表,sql为:

  1. MySQL> call dbms_dlm.archive_partition2table('test', 'Orders', 'p2012', 'test','Orders_archive', '');
  2. Query OK, 0 rows affected
  3. Time: 3.815s
  4. MySQL> show create table Orders;
  5. +--------+----------------------------------------------------------------------------------------------+
  6. | Table | Create Table |
  7. +--------+----------------------------------------------------------------------------------------------+
  8. | Orders | CREATE TABLE `Orders` ( |
  9. | | `order_id` int(11) NOT NULL AUTO_INCREMENT, |
  10. | | `customer_id` int(11) DEFAULT NULL, |
  11. | | `product_id` int(11) DEFAULT NULL, |
  12. | | `order_amount` decimal(10,2) DEFAULT NULL, |
  13. | | `create_time` datetime NOT NULL, |
  14. | | PRIMARY KEY (`order_id`,`create_time`) |
  15. | | ) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
  16. | | /*!50500 PARTITION BY RANGE COLUMNS(create_time) */ /*!99990 800020200 INTERVAL(YEAR, 1) */ |
  17. | | /*!50500 (PARTITION p2013 VALUES LESS THAN ('2014-01-01') ENGINE = InnoDB, |
  18. | | PARTITION _p20150101000000 VALUES LESS THAN ('2015-01-01 00:00:00') ENGINE = InnoDB, |
  19. | | PARTITION _p20160101000000 VALUES LESS THAN ('2016-01-01 00:00:00') ENGINE = InnoDB, |
  20. | | PARTITION _p20170101000000 VALUES LESS THAN ('2017-01-01 00:00:00') ENGINE = InnoDB, |
  21. | | PARTITION _p20180101000000 VALUES LESS THAN ('2018-01-01 00:00:00') ENGINE = InnoDB) */ |
  22. +--------+----------------------------------------------------------------------------------------------+
  23. 1 row in set
  24. Time: 0.027s

归档后分区表 Orders 的数据进一步减少,可以显著降低存储成本。 但是,如果业务上有修改,例如,对分区表 Orders 需要加一列 price,表示当前订单的价格,加列后表定义为:

  1. MySQL> alter table Orders add column (price decimal (10,2) DEFAULT NULL), algorithm = instant;
  2. Query OK, 0 rows affected
  3. Time: 0.255s
  4. MySQL> show create table Orders;
  5. +--------+----------------------------------------------------------------------------------------------+
  6. | Table | Create Table |
  7. +--------+----------------------------------------------------------------------------------------------+
  8. | Orders | CREATE TABLE `Orders` ( |
  9. | | `order_id` int(11) NOT NULL AUTO_INCREMENT, |
  10. | | `customer_id` int(11) DEFAULT NULL, |
  11. | | `product_id` int(11) DEFAULT NULL, |
  12. | | `order_amount` decimal(10,2) DEFAULT NULL, |
  13. | | `create_time` datetime NOT NULL, |
  14. | | `price` decimal(10,2) DEFAULT NULL, |
  15. | | PRIMARY KEY (`order_id`,`create_time`) |
  16. | | ) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
  17. | | /*!50500 PARTITION BY RANGE COLUMNS(create_time) */ /*!99990 800020200 INTERVAL(YEAR, 1) */ |
  18. | | /*!50500 (PARTITION p2013 VALUES LESS THAN ('2014-01-01') ENGINE = InnoDB, |
  19. | | PARTITION _p20150101000000 VALUES LESS THAN ('2015-01-01 00:00:00') ENGINE = InnoDB, |
  20. | | PARTITION _p20160101000000 VALUES LESS THAN ('2016-01-01 00:00:00') ENGINE = InnoDB, |
  21. | | PARTITION _p20170101000000 VALUES LESS THAN ('2017-01-01 00:00:00') ENGINE = InnoDB, |
  22. | | PARTITION _p20180101000000 VALUES LESS THAN ('2018-01-01 00:00:00') ENGINE = InnoDB) */ |
  23. +--------+----------------------------------------------------------------------------------------------+
  24. 1 row in set
  25. Time: 0.006s

此时,加完列后,如果我们想继续把分区 p2013 的数据归档至刚刚的 OSSOrders_archive,会报错。这是因为 Orders 表和 Orders_archive 表的定义出现了不一致,报错如下:

  1. MySQL> call dbms_dlm.archive_partition2table('test', 'Orders', 'p2013', 'test','Orders_archive', '');
  2. (8181, '[Data Lifecycle Management] errmsg: The metadata information of the original table is inconsistent with that of the target table.')

此时,我们需要给表 Orders_archive 也加上 price 列,就可以正常归档了,sql 为:

  1. MySQL> alter table Orders_archive add column (price decimal (10,2) DEFAULT NULL), algorithm = instant;
  2. Query OK, 0 rows affected
  3. Time: 0.348s

注意,

  • 只有 OSS 表上带有 OSS META = 1 标记时,才能执行 instant DDL 操作,否则会变成 copy DDL
  • 可以在 DDL sql 中强制指定算法为 INSTANT,避免出现预期外的 COPY 导致代价过大。

此时看下表 Order_archive 的表结构:

  1. MySQL> show create table Orders_archive;
  2. +----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  3. | Table | Create Table |
  4. +----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  5. | Orders_archive | CREATE TABLE `Orders_archive` ( |
  6. | | `order_id` int(11) NOT NULL DEFAULT '0', |
  7. | | `customer_id` int(11) DEFAULT NULL, |
  8. | | `product_id` int(11) DEFAULT NULL, |
  9. | | `order_amount` decimal(10,2) DEFAULT NULL, |
  10. | | `create_time` datetime NOT NULL, |
  11. | | `price` decimal(10,2) DEFAULT NULL |
  12. | | ) /*!99990 800020213 STORAGE OSS */ ENGINE=CSV DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci /*!99990 800020204 NULL_MARKER='NULL' */ /*!99990 800020223 OSS META=1 */ /*!99990 800020224 OSS_FILE_FILTER='order_id,create_time' */ |
  13. +----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  14. 1 row in set
  15. Time: 0.250s

已经和表 Orders 一致了,此时再归档分区 p2013,就不会报错了。

  1. MySQL> call dbms_dlm.archive_partition2table('test', 'Orders', 'p2013', 'test','Orders_archive', '');
  2. Query OK, 0 rows affected
  3. Time: 4.969s
  4. MySQL> show create table Orders;
  5. +--------+------------------------------------------------------------------------------------------------+
  6. | Table | Create Table |
  7. +--------+------------------------------------------------------------------------------------------------+
  8. | Orders | CREATE TABLE `Orders` ( |
  9. | | `order_id` int(11) NOT NULL AUTO_INCREMENT, |
  10. | | `customer_id` int(11) DEFAULT NULL, |
  11. | | `product_id` int(11) DEFAULT NULL, |
  12. | | `order_amount` decimal(10,2) DEFAULT NULL, |
  13. | | `create_time` datetime NOT NULL, |
  14. | | `price` decimal(10,2) DEFAULT NULL, |
  15. | | PRIMARY KEY (`order_id`,`create_time`) |
  16. | | ) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
  17. | | /*!50500 PARTITION BY RANGE COLUMNS(create_time) */ /*!99990 800020200 INTERVAL(YEAR, 1) */ |
  18. | | /*!50500 (PARTITION _p20150101000000 VALUES LESS THAN ('2015-01-01 00:00:00') ENGINE = InnoDB, |
  19. | | PARTITION _p20160101000000 VALUES LESS THAN ('2016-01-01 00:00:00') ENGINE = InnoDB, |
  20. | | PARTITION _p20170101000000 VALUES LESS THAN ('2017-01-01 00:00:00') ENGINE = InnoDB, |
  21. | | PARTITION _p20180101000000 VALUES LESS THAN ('2018-01-01 00:00:00') ENGINE = InnoDB) */ |
  22. +--------+------------------------------------------------------------------------------------------------+
  23. 1 row in set
  24. Time: 0.024s

可以看到分区数据已经归档至 Orders_archive 表里面了。

OSS META 功能的开启与关闭

通过以上可以看出,OSS META 开启后,对当前冷存表的 DDL 能力有极大的增强。更多的信息可以参考官网文档OSS DDL 能力介绍。 当然,如果您当前表已经归档成 OSS 表了,希望开启 OSS META 功能以支持更好的 DDL 操作,可以通过 REPAIR TABLE 语句实现,例如我们有一张归档表 t

  1. MySQL> show create table t;
  2. +-------+--------------------------------------------------------------------------------------------------------------------------------------------+
  3. | Table | Create Table |
  4. +-------+--------------------------------------------------------------------------------------------------------------------------------------------+
  5. | t | CREATE TABLE `t` ( |
  6. | | `id` int(11) DEFAULT NULL |
  7. | | ) /*!99990 800020213 STORAGE OSS */ ENGINE=CSV DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci /*!99990 800020204 NULL_MARKER='NULL' */ |
  8. +-------+--------------------------------------------------------------------------------------------------------------------------------------------+
  9. 1 row in set
  10. Time: 0.008s

此时表 t 上没有 oss meta 标记,可以通过 repair table 增加:

  1. MySQL> repair table t;
  2. +--------+--------+----------+----------+
  3. | Table | Op | Msg_type | Msg_text |
  4. +--------+--------+----------+----------+
  5. | test.t | repair | status | OK |
  6. +--------+--------+----------+----------+
  7. 1 row in set
  8. Time: 0.863s
  9. MySQL> show create table t;
  10. +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  11. | Table | Create Table |
  12. +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  13. | t | CREATE TABLE `t` ( |
  14. | | `id` int(11) DEFAULT NULL |
  15. | | ) /*!99990 800020213 STORAGE OSS */ ENGINE=CSV DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci /*!99990 800020204 NULL_MARKER='NULL' */ /*!99990 800020223 OSS META=1 */ |
  16. +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  17. 1 row in set
  18. Time: 0.014s

注意,repair table 是一个 inplace 操作,对于 OSS 表来说,其整体时间与表的大小相关。 当然,也可以关闭当前表的 OSS META 功能,通过 disable OSS META 实现:

  1. MySQL> ALTER TABLE t DISABLE OSS META;
  2. Query OK, 0 rows affected
  3. Time: 0.049s
  4. MySQL> show create table t;
  5. +-------+--------------------------------------------------------------------------------------------------------------------------------------------+
  6. | Table | Create Table |
  7. +-------+--------------------------------------------------------------------------------------------------------------------------------------------+
  8. | t | CREATE TABLE `t` ( |
  9. | | `id` int(11) DEFAULT NULL |
  10. | | ) /*!99990 800020213 STORAGE OSS */ ENGINE=CSV DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci /*!99990 800020204 NULL_MARKER='NULL' */ |
  11. +-------+--------------------------------------------------------------------------------------------------------------------------------------------+
  12. 1 row in set

关闭 OSS META 是一个 instant 操作,只修改表的元数据。

总结

PolarDB MySQL8.0.2.2.24 版本引入了冷数据 OSS META 功能,显著优化了冷数据的 DDL 操作。在过去,由于 OSS 冷数据的 DDL 能力有限,大部分 DDL 操作都需要使用 COPY 算法,导致修改表定义的成本很高。现在,通过增强对冷数据元数据的管理,某些DDL操作可以直接应用,无需全量数据重写,提高了效率和灵活性。当启用 OSS META功能后,例如在处理分区归档到OSS的场景中,可以更方便地进行表结构的修改。总的来说,OSS META 功能提升了PolarDB MySQL在处理冷数据DDL时的性能,降低了维护成本,使得冷数据管理更加便捷高效。

原文:http://mysql.taobao.org/monthly/2024/07/01/