GBase 8a支持cume_dist 函数,用于计算小于等于或者大于等于(根据order的顺序)该值的百分比。
目录导航
语法
cume_disk() over([partition by ] order by [desc])
说明
其中partition是否开窗,否则全部数值统一处理。
order by 的顺序,ASC(默认) = 小于等于,DESC =大于等于
样例
小于等于
如下例子是默认的升序,第一行表示:小于等于数值1的行数比例为20%
gbase> select id,cume_dist()over(order by id) cr from t2;
+------+-----+
| id   | cr  |
+------+-----+
|    1 | 0.2 |
|    2 | 0.4 |
|    3 | 0.6 |
|    4 | 0.8 |
|    5 |   1 |
+------+-----+
5 rows in set (Elapsed: 00:00:00.01)
大于等于
排序为desc,第一行表示:大于等于数值5的行数比例为20%。
gbase> select id,cume_dist()over(order by id desc) cr from t2;
+------+-----+
| id   | cr  |
+------+-----+
|    5 | 0.2 |
|    4 | 0.4 |
|    3 | 0.6 |
|    2 | 0.8 |
|    1 |   1 |
+------+-----+
5 rows in set (Elapsed: 00:00:00.02)
带开窗partition
每个partition内不分别计算百分比。
gbase> select * from t4;
+------+------+
| id   | type |
+------+------+
|    1 | A    |
|    2 | A    |
|    3 | A    |
|    1 | B    |
|    2 | B    |
|    3 | B    |
|    4 | B    |
+------+------+
7 rows in set (Elapsed: 00:00:00.00)
gbase> select type,id,cume_dist()over(partition by type order by id) cr from t4;
+------+------+-------------------+
| type | id   | cr                |
+------+------+-------------------+
| A    |    1 | 0.333333333333333 |
| A    |    2 | 0.666666666666667 |
| A    |    3 |                 1 |
| B    |    1 |              0.25 |
| B    |    2 |               0.5 |
| B    |    3 |              0.75 |
| B    |    4 |                 1 |
+------+------+-------------------+
7 rows in set (Elapsed: 00:00:00.07)
与Percent_rank的对比
percent_rank是计算相对位置,包含起点0,而cume_dist是包含等于的,所以不会出现0。如果数据只有1行,那么percent_rank为
起点0, 而cume_dist为1(100%)。
gbase> select id,cume_dist()over(order by id) cr,percent_rank()over(order by id) pr  from t2;
+------+-----+------+
| id   | cr  | pr   |
+------+-----+------+
|    1 | 0.2 |    0 |
|    2 | 0.4 | 0.25 |
|    3 | 0.6 |  0.5 |
|    4 | 0.8 | 0.75 |
|    5 |   1 |    1 |
+------+-----+------+
5 rows in set (Elapsed: 00:00:00.01)
一行数据
gbase> select id,cume_dist()over(order by id) cr,percent_rank()over(order by id) pr  from t5;
+------+----+----+
| id   | cr | pr |
+------+----+----+
|    1 |  1 |  0 |
+------+----+----+
1 row in set (Elapsed: 00:00:00.02)