ClickHouse 窗口函数row_number,Rank Over的使用

在当前版本的ClickHouse里,已经实验性的支持了窗口函数, 比如Rank over partition order,但从执行看还是处于实验阶段,后面【可能】存在底层改动,需要设置参数Set allow_experimental_window_functions = 1。

版本

clickhouse-client-21.3.4.25-2.noarch
clickhouse-server-21.3.4.25-2.noarch
clickhouse-common-static-21.3.4.25-2.x86_64 

样例

数据

gbase_rh7_001 :) show create table t1;

SHOW CREATE TABLE t1

Query id: f6fece5e-23f8-4787-8acf-9cca89d7ab5c

┌─statement────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE TABLE testdb.t1
(
    `id` Int32,
    `name` String,
    `birthday` Date
)
ENGINE = TinyLog │
└──────────────────────────────────────────────────────────────────────────────────────────────┘

1 rows in set. Elapsed: 0.004 sec.
gbase_rh7_001 :) select * from t1;

SELECT *
FROM t1

Query id: 7e7b2912-9f1b-4462-856f-5864de752476

┌─id─┬─name───┬───birthday─┐
│  1 │ First  │ 2011-01-01 │
│  2 │ Second │ 2012-02-02 │
│  3 │ Second │ 2011-01-01 │
└────┴────────┴────────────┘

3 rows in set. Elapsed: 0.006 sec.

设置参数

直接执行会报错,需要设置参数 Set allow_experimental_window_functions = 1

gbase_rh7_001 :) select id,name,rank() over (partition by birthday order by name) ran  from t1;

SELECT
    id,
    name,
    rank() OVER (PARTITION BY birthday ORDER BY name ASC) AS ran
FROM t1

Query id: 152d1f62-8ddb-4fbd-b46a-764234004851


0 rows in set. Elapsed: 0.003 sec.

Received exception from server (version 21.3.4):
Code: 48. DB::Exception: Received from localhost:9000. DB::Exception: The support for window functions is experimental and will change in backwards-incompatible ways in the future releases. Set allow_experimental_window_functions = 1 to enable it. While processing 'rank() OVER (PARTITION BY birthday ORDER BY name ASC) AS ran'.

gbase_rh7_001 :)
gbase_rh7_001 :) set allow_experimental_window_functions=1;

SET allow_experimental_window_functions = 1

Query id: 976e4a63-1c6c-44e9-b0ad-9de1dd28b5f6

Ok.

0 rows in set. Elapsed: 0.029 sec.

执行效果

功能没问题,性能还没测试过。

gbase_rh7_001 :) select id,name,rank() over (partition by birthday order by name) ran  from t1;

SELECT
    id,
    name,
    rank() OVER (PARTITION BY birthday ORDER BY name ASC) AS ran
FROM t1

Query id: 8461d23b-c244-42ab-9f51-d0ea8683727d

┌─id─┬─name───┬─ran─┐
│  1 │ First  │   1 │
│  3 │ Second │   2 │
│  2 │ Second │   1 │
└────┴────────┴─────┘

3 rows in set. Elapsed: 0.085 sec.

gbase_rh7_001 :)

row_number over的例子

注意其中的limit 1 by name的用法,不用再像其它数据库那样嵌套一层rownum了。直接可以limit。

gbase_rh7_001 :) select id,name,row_number() over (partition by birthday order by name) ran  from t1;

SELECT
    id,
    name,
    row_number() OVER (PARTITION BY birthday ORDER BY name ASC) AS ran
FROM t1

Query id: 5446a392-7591-4c17-8618-a34cec6a7d38

┌─id─┬─name───┬─ran─┐
│  1 │ First  │   1 │
│  3 │ Second │   2 │
│  2 │ Second │   1 │
└────┴────────┴─────┘

3 rows in set. Elapsed: 0.007 sec.

gbase_rh7_001 :) select id,name,row_number() over (partition by birthday order by name) ran  from t1 limit for 2;

Syntax error: failed at position 95 ('2'):

select id,name,row_number() over (partition by birthday order by name) ran  from t1 limit for 2;

Expected one of: UNION, LIMIT, BY, LIKE, GLOBAL NOT IN, end of query, AS, DIV, IS, INTO OUTFILE, OR, QuestionMark, BETWEEN, OFFSET, NOT LIKE, MOD, AND, Comma, alias, WITH TIES, SETTINGS, IN, ILIKE, FORMAT, Dot, NOT ILIKE, NOT, Arrow, token, NOT IN, GLOBAL IN

gbase_rh7_001 :) select id,name,row_number() over (partition by birthday order by name) ran  from t1 limit 1 by name;

SELECT
    id,
    name,
    row_number() OVER (PARTITION BY birthday ORDER BY name ASC) AS ran
FROM t1
LIMIT 1 BY name

Query id: a325089c-7201-468f-95f6-1ea446bacb89

┌─id─┬─name───┬─ran─┐
│  1 │ First  │   1 │
│  3 │ Second │   2 │
└────┴────────┴─────┘

2 rows in set. Elapsed: 0.013 sec.

gbase_rh7_001 :)