SHOW-DATA-SKEW

Name

SHOW DATA SKEW

Description

  1. 该语句用于查看表或某个分区的数据倾斜情况。
  2. 语法:
  3. SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (partition_name, ...)];
  4. 说明:
  5. 1. 结果将展示指定分区下,各个分桶的数据行数,数据量,以及每个分桶数据量在总数据量中的占比。
  6. 2. 对于非分区表,查询结果中分区名称同表名。

Example

  1. 分区表场景
  • 建表语句

    1. CREATE TABLE test_show_data_skew
    2. (
    3. id int,
    4. name string,
    5. pdate date
    6. )
    7. PARTITION BY RANGE(pdate)
    8. (
    9. FROM ("2023-04-16") TO ("2023-04-20") INTERVAL 1 DAY
    10. )
    11. DISTRIBUTED BY HASH(id) BUCKETS 5
    12. PROPERTIES (
    13. "replication_num" = "1"
    14. );
  • 查询整表的数据倾斜情况

    1. mysql> SHOW DATA SKEW FROM test_show_data_skew;
    2. +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
    3. | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
    4. +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
    5. | p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % |
    6. | p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % |
    7. | p_20230416 | 2 | 0 | 0 | | 00.00 % |
    8. | p_20230416 | 3 | 0 | 0 | | 00.00 % |
    9. | p_20230416 | 4 | 0 | 0 | | 00.00 % |
    10. | p_20230417 | 0 | 0 | 0 | | 00.00 % |
    11. | p_20230417 | 1 | 0 | 0 | | 00.00 % |
    12. | p_20230417 | 2 | 0 | 0 | | 00.00 % |
    13. | p_20230417 | 3 | 0 | 0 | | 00.00 % |
    14. | p_20230417 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% |
    15. | p_20230418 | 0 | 0 | 0 | | 00.00 % |
    16. | p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% |
    17. | p_20230418 | 2 | 0 | 0 | | 00.00 % |
    18. | p_20230418 | 3 | 0 | 0 | | 00.00 % |
    19. | p_20230418 | 4 | 0 | 0 | | 00.00 % |
    20. | p_20230419 | 0 | 0 | 0 | | 00.00 % |
    21. | p_20230419 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.96 % |
    22. | p_20230419 | 2 | 0 | 0 | | 00.00 % |
    23. | p_20230419 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.04 % |
    24. | p_20230419 | 4 | 0 | 0 | | 00.00 % |
    25. +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
  • 查询指定分区的数据倾斜情况

    1. mysql> SHOW DATA SKEW FROM test_show_data_skew PARTITION(p_20230416, p_20230418);
    2. +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
    3. | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
    4. +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
    5. | p_20230416 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 49.77 % |
    6. | p_20230416 | 1 | 2 | 654 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 50.23 % |
    7. | p_20230416 | 2 | 0 | 0 | | 00.00 % |
    8. | p_20230416 | 3 | 0 | 0 | | 00.00 % |
    9. | p_20230416 | 4 | 0 | 0 | | 00.00 % |
    10. | p_20230418 | 0 | 0 | 0 | | 00.00 % |
    11. | p_20230418 | 1 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | 100.00% |
    12. | p_20230418 | 2 | 0 | 0 | | 00.00 % |
    13. | p_20230418 | 3 | 0 | 0 | | 00.00 % |
    14. | p_20230418 | 4 | 0 | 0 | | 00.00 % |
    15. +---------------+-----------+-------------+-------------+------------------------------------------------------------------------------------------------------+---------+
  1. 非分区表场景
  • 建表语句

    1. CREATE TABLE test_show_data_skew2
    2. (
    3. id int,
    4. name string,
    5. pdate date
    6. )
    7. DISTRIBUTED BY HASH(id) BUCKETS 5
    8. PROPERTIES (
    9. "replication_num" = "1"
    10. );
  • 查询整表的数据倾斜情况

    1. mysql> SHOW DATA SKEW FROM test_show_data_skew2;
    2. +----------------------+-----------+-------------+-------------+---------------------------+---------+
    3. | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
    4. +----------------------+-----------+-------------+-------------+---------------------------+---------+
    5. | test_show_data_skew2 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.73 % |
    6. | test_show_data_skew2 | 1 | 4 | 667 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.46 % |
    7. | test_show_data_skew2 | 2 | 0 | 0 | | 00.00 % |
    8. | test_show_data_skew2 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.77 % |
    9. | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % |
    10. +----------------------+-----------+-------------+-------------+---------------------------+---------+
  1. mysql> SHOW DATA SKEW FROM test_show_data_skew2 PARTITION(test_show_data_skew2);
  2. +----------------------+-----------+-------------+-------------+---------------------------+---------+
  3. | PartitionName | BucketIdx | AvgRowCount | AvgDataSize | Graph | Percent |
  4. +----------------------+-----------+-------------+-------------+---------------------------+---------+
  5. | test_show_data_skew2 | 0 | 1 | 648 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.73 % |
  6. | test_show_data_skew2 | 1 | 4 | 667 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.46 % |
  7. | test_show_data_skew2 | 2 | 0 | 0 | | 00.00 % |
  8. | test_show_data_skew2 | 3 | 1 | 649 | >>>>>>>>>>>>>>>>>>>>>>>> | 24.77 % |
  9. | test_show_data_skew2 | 4 | 2 | 656 | >>>>>>>>>>>>>>>>>>>>>>>>> | 25.04 % |
  10. +----------------------+-----------+-------------+-------------+---------------------------+---------+
  11. ```

Keywords

  1. SHOW,DATA,SKEW

Best Practice