WINDOW FUNCTION PERCENT_RANK

description

The PERCENT_RANK() is a window function that calculates the percentile rank of a row within a partition or result set.

The following shows the syntax of the PERCENT_RANK() function:

  1. PERCENT_RANK() OVER (
  2. PARTITION BY partition_expression
  3. ORDER BY
  4. sort_expression [ASC | DESC]
  5. )

The PERCENT_RANK() function returns a number that ranges from zero to one.

For a specified row, PERCENT_RANK() calculates the rank of that row minus one, divided by 1 less than the number of rows in the evaluated partition or query result set:

  1. (rank - 1) / (total_rows - 1)

In this formula, rank is the rank of a specified row and total_rows is the number of rows being evaluated.

The PERCENT_RANK() function always returns zero for the first row in a partition or result set. The repeated column values will receive the same PERCENT_RANK() value.

Similar to other window functions, the PARTITION BY clause distributes the rows into partitions and the ORDER BY clause specifies the logical order of rows in each partition. The PERCENT_RANK() function is calculated for each ordered partition independently.

Both PARTITION BY and ORDER BY clauses are optional. However, the PERCENT_RANK() is an order-sensitive function, therefore, you should always use the ORDER BY clause.

example

  1. // create table
  2. CREATE TABLE test_percent_rank (
  3. productLine VARCHAR,
  4. orderYear INT,
  5. orderValue DOUBLE,
  6. percentile_rank DOUBLE
  7. ) ENGINE=OLAP
  8. DISTRIBUTED BY HASH(`orderYear`) BUCKETS 4
  9. PROPERTIES (
  10. "replication_allocation" = "tag.location.default: 1"
  11. );
  12. // insert data into table
  13. INSERT INTO test_percent_rank (productLine, orderYear, orderValue, percentile_rank) VALUES
  14. ('Motorcycles', 2003, 2440.50, 0.00),
  15. ('Trains', 2003, 2770.95, 0.17),
  16. ('Trucks and Buses', 2003, 3284.28, 0.33),
  17. ('Vintage Cars', 2003, 4080.00, 0.50),
  18. ('Planes', 2003, 4825.44, 0.67),
  19. ('Ships', 2003, 5072.71, 0.83),
  20. ('Classic Cars', 2003, 5571.80, 1.00),
  21. ('Motorcycles', 2004, 2598.77, 0.00),
  22. ('Vintage Cars', 2004, 2819.28, 0.17),
  23. ('Planes', 2004, 2857.35, 0.33),
  24. ('Ships', 2004, 4301.15, 0.50),
  25. ('Trucks and Buses', 2004, 4615.64, 0.67),
  26. ('Trains', 2004, 4646.88, 0.83),
  27. ('Classic Cars', 2004, 8124.98, 1.00),
  28. ('Ships', 2005, 1603.20, 0.00),
  29. ('Motorcycles', 2005, 3774.00, 0.17),
  30. ('Planes', 2005, 4018.00, 0.50),
  31. ('Vintage Cars', 2005, 5346.50, 0.67),
  32. ('Classic Cars', 2005, 5971.35, 0.83),
  33. ('Trucks and Buses', 2005, 6295.03, 1.00);
  34. // query
  35. SELECT
  36. productLine,
  37. orderYear,
  38. orderValue,
  39. ROUND(
  40. PERCENT_RANK()
  41. OVER (
  42. PARTITION BY orderYear
  43. ORDER BY orderValue
  44. ),2) percentile_rank
  45. FROM
  46. test_percent_rank
  47. ORDER BY
  48. orderYear;
  49. // result
  50. +------------------+-----------+------------+-----------------+
  51. | productLine | orderYear | orderValue | percentile_rank |
  52. +------------------+-----------+------------+-----------------+
  53. | Motorcycles | 2003 | 2440.5 | 0 |
  54. | Trains | 2003 | 2770.95 | 0.17 |
  55. | Trucks and Buses | 2003 | 3284.28 | 0.33 |
  56. | Vintage Cars | 2003 | 4080 | 0.5 |
  57. | Planes | 2003 | 4825.44 | 0.67 |
  58. | Ships | 2003 | 5072.71 | 0.83 |
  59. | Classic Cars | 2003 | 5571.8 | 1 |
  60. | Motorcycles | 2004 | 2598.77 | 0 |
  61. | Vintage Cars | 2004 | 2819.28 | 0.17 |
  62. | Planes | 2004 | 2857.35 | 0.33 |
  63. | Ships | 2004 | 4301.15 | 0.5 |
  64. | Trucks and Buses | 2004 | 4615.64 | 0.67 |
  65. | Trains | 2004 | 4646.88 | 0.83 |
  66. | Classic Cars | 2004 | 8124.98 | 1 |
  67. | Ships | 2005 | 1603.2 | 0 |
  68. | Motorcycles | 2005 | 3774 | 0.2 |
  69. | Planes | 2005 | 4018 | 0.4 |
  70. | Vintage Cars | 2005 | 5346.5 | 0.6 |
  71. | Classic Cars | 2005 | 5971.35 | 0.8 |
  72. | Trucks and Buses | 2005 | 6295.03 | 1 |
  73. +------------------+-----------+------------+-----------------+

keywords

  1. WINDOW,FUNCTION,PERCENT_RANK