width_bucket

Description

Constructs an equi-width histogram, in which the histogram range is divided into intervals of identical size, and returns the number of the bucket into which the value of the evaluated expression falls. The function returns an integer, or null if any input is null.

Syntax

INT width_bucket(Expr expr, T min_value, T max_value, INT num_buckets)

Arguments

expr - The expression for which the histogram is created. This expression must evaluate to a numeric value or to a value that can be implicitly converted to a numeric value.

The value must be within the range of -(2^53 - 1) to 2^53 - 1 (inclusive).

min_value and max_value - The low and high end points of the acceptable range for the expression. Both end points must evaluate to numeric values and must not be equal to each other.

The low and high end points must be within the range of -(2^53 - 1) to 2^53 - 1 (inclusive). In addition, the difference between these points must be less than 2^53 (i.e. abs(max_value - min_value) < 2^53).

num_buckets - The desired number of buckets; must be a positive integer. Each value of the expression is assigned to exactly one bucket, and the function returns that bucket's number.
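To make the equal-width split concrete, the sketch below lists the interval that each bucket would cover for a given range. It is an illustration in Python, not part of the function itself: bucket_edges is a hypothetical helper, and it assumes min_value < max_value and half-open intervals [low, high), which is consistent with the rule under Returned value that a value equal to max_value falls outside the range.

```python
from typing import List, Tuple


def bucket_edges(min_value: float, max_value: float, num_buckets: int) -> List[Tuple[float, float]]:
    """Return the assumed half-open [low, high) interval for buckets 1..num_buckets."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be a positive integer")
    if min_value == max_value:
        raise ValueError("min_value and max_value must not be equal")
    # Every bucket has the same width (equi-width histogram).
    width = (max_value - min_value) / num_buckets
    return [(min_value + i * width, min_value + (i + 1) * width) for i in range(num_buckets)]


# Range and bucket count taken from the numeric examples on this page.
for number, (low, high) in enumerate(bucket_edges(200000, 600000, 4), start=1):
    print(f"bucket {number}: [{low}, {high})")
# bucket 1: [200000.0, 300000.0)
# bucket 2: [300000.0, 400000.0)
# bucket 3: [400000.0, 500000.0)
# bucket 4: [500000.0, 600000.0)
```

With these edges, 399999.99 lands in bucket 2 and 400000 in bucket 3, which agrees with the query results shown in the example section.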

Returned value

Returns the number of the bucket into which the value of the expression falls. Values inside the range map to buckets 1 through num_buckets.

When the value of the expression falls outside the range, the function returns:

0 if the value is less than min_value.

num_buckets + 1 if the value is greater than or equal to max_value.

The function returns null if any input is null.
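The return rules can be summarized in a small reference model. This is a Python sketch under stated assumptions, not the engine's implementation: width_bucket_model is a hypothetical name, min_value < max_value is assumed, and the in-range formula (floor of the offset divided by the bucket width, plus 1) is inferred from the documented behavior and the example output on this page.

```python
import math
from typing import Optional


def width_bucket_model(expr, min_value, max_value, num_buckets) -> Optional[int]:
    """Model of the documented return rules; assumes min_value < max_value."""
    # Any null (None) input yields null.
    if expr is None or min_value is None or max_value is None or num_buckets is None:
        return None
    # Below the range: underflow bucket 0.
    if expr < min_value:
        return 0
    # At or above the upper end point: overflow bucket num_buckets + 1.
    if expr >= max_value:
        return num_buckets + 1
    # Inside the range: equal-width buckets numbered from 1 (assumed formula).
    width = (max_value - min_value) / num_buckets
    return math.floor((expr - min_value) / width) + 1


# The v2 column values from the example table on this page.
values = [290000.00, 320000.00, 399999.99, 400000.00, 470000.00, 510000.00, 610000.00, None]
print([width_bucket_model(v, 200000, 600000, 4) for v in values])
# → [1, 2, 2, 3, 3, 4, 5, None]
```

The printed list matches the w column of the width_bucket(v2, 200000, 600000, 4) query in the example section.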

Example

-- Create a test table with a date, a double, and a bigint column.
DROP TABLE IF EXISTS width_bucket_test;
CREATE TABLE IF NOT EXISTS width_bucket_test (
  `k1` int NULL COMMENT "",
  `v1` date NULL COMMENT "",
  `v2` double NULL COMMENT "",
  `v3` bigint NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`k1`)
DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES (
  "replication_allocation" = "tag.location.default: 1",
  "storage_format" = "V2"
);

-- Insert values below, inside, and above the ranges used later, plus one all-null row.
INSERT INTO width_bucket_test VALUES (1, "2022-11-18", 290000.00, 290000),
  (2, "2023-11-18", 320000.00, 320000),
  (3, "2024-11-18", 399999.99, 399999),
  (4, "2025-11-18", 400000.00, 400000),
  (5, "2026-11-18", 470000.00, 470000),
  (6, "2027-11-18", 510000.00, 510000),
  (7, "2028-11-18", 610000.00, 610000),
  (8, null, null, null);

SELECT * FROM width_bucket_test ORDER BY k1;
+------+------------+-----------+--------+
| k1   | v1         | v2        | v3     |
+------+------------+-----------+--------+
|    1 | 2022-11-18 |    290000 | 290000 |
|    2 | 2023-11-18 |    320000 | 320000 |
|    3 | 2024-11-18 | 399999.99 | 399999 |
|    4 | 2025-11-18 |    400000 | 400000 |
|    5 | 2026-11-18 |    470000 | 470000 |
|    6 | 2027-11-18 |    510000 | 510000 |
|    7 | 2028-11-18 |    610000 | 610000 |
|    8 | NULL       |      NULL |   NULL |
+------+------------+-----------+--------+

-- Bucket the date column v1 into 4 buckets between 2023-11-18 and 2027-11-18.
SELECT k1, v1, v2, v3, width_bucket(v1, date('2023-11-18'), date('2027-11-18'), 4) AS w FROM width_bucket_test ORDER BY k1;
+------+------------+-----------+--------+------+
| k1   | v1         | v2        | v3     | w    |
+------+------------+-----------+--------+------+
|    1 | 2022-11-18 |    290000 | 290000 |    0 |
|    2 | 2023-11-18 |    320000 | 320000 |    1 |
|    3 | 2024-11-18 | 399999.99 | 399999 |    2 |
|    4 | 2025-11-18 |    400000 | 400000 |    3 |
|    5 | 2026-11-18 |    470000 | 470000 |    4 |
|    6 | 2027-11-18 |    510000 | 510000 |    5 |
|    7 | 2028-11-18 |    610000 | 610000 |    5 |
|    8 | NULL       |      NULL |   NULL | NULL |
+------+------------+-----------+--------+------+

-- Bucket the double column v2 into 4 buckets between 200000 and 600000.
SELECT k1, v1, v2, v3, width_bucket(v2, 200000, 600000, 4) AS w FROM width_bucket_test ORDER BY k1;
+------+------------+-----------+--------+------+
| k1   | v1         | v2        | v3     | w    |
+------+------------+-----------+--------+------+
|    1 | 2022-11-18 |    290000 | 290000 |    1 |
|    2 | 2023-11-18 |    320000 | 320000 |    2 |
|    3 | 2024-11-18 | 399999.99 | 399999 |    2 |
|    4 | 2025-11-18 |    400000 | 400000 |    3 |
|    5 | 2026-11-18 |    470000 | 470000 |    3 |
|    6 | 2027-11-18 |    510000 | 510000 |    4 |
|    7 | 2028-11-18 |    610000 | 610000 |    5 |
|    8 | NULL       |      NULL |   NULL | NULL |
+------+------------+-----------+--------+------+

-- Bucket the bigint column v3 over the same range.
SELECT k1, v1, v2, v3, width_bucket(v3, 200000, 600000, 4) AS w FROM width_bucket_test ORDER BY k1;
+------+------------+-----------+--------+------+
| k1   | v1         | v2        | v3     | w    |
+------+------------+-----------+--------+------+
|    1 | 2022-11-18 |    290000 | 290000 |    1 |
|    2 | 2023-11-18 |    320000 | 320000 |    2 |
|    3 | 2024-11-18 | 399999.99 | 399999 |    2 |
|    4 | 2025-11-18 |    400000 | 400000 |    3 |
|    5 | 2026-11-18 |    470000 | 470000 |    3 |
|    6 | 2027-11-18 |    510000 | 510000 |    4 |
|    7 | 2028-11-18 |    610000 | 610000 |    5 |
|    8 | NULL       |      NULL |   NULL | NULL |
+------+------------+-----------+--------+------+

Keywords

WIDTH_BUCKET