GBase 8a cluster supports data masking to ensure the security of sensitive data. Regarding filtering with WHERE clauses, grouping, and other operations on masked data, are these operations performed on the original data or on the masked data?
Directory Navigation
参考
测试环境
一个单列的数字类型的表,用的是默认脱敏策略。
CREATE TABLE "testmask" (
"id" int(11) DEFAULT NULL MASKED WITH(FUNCTION='DEFAULT()')
)
表里面有几行测试数据
gbase> select * from t_enc2;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 99 |
+------+
5 rows in set (Elapsed: 00:00:00.00)
无脱敏权限用户全表查询
可以看到id列全部变成了0,看不到原始数据,符合脱敏的预期。
gbase> select * from testmask;
+------+
| id |
+------+
| 0 |
| 0 |
| 0 |
| 0 |
| 0 |
+------+
5 rows in set (Elapsed: 00:00:00.02)
无脱敏权限用户where等值查询脱敏列
针对id=1的条件,返回了一行结果级。 从结果看,虽然id显示为0,但实际上执行SQL的用户已经猜到了id为1。
gbase> select * from testmask where id=1;
+------+
| id |
+------+
| 0 |
+------+
1 row in set (Elapsed: 00:00:00.01)
无脱敏权限用户group by脱敏列
从结果看,脱敏列的group用的是原始数据
gbase> select id%3,count(*) from testmask group by id%3;
+------+----------+
| id%3 | count(*) |
+------+----------+
| 0 | 2 |
| 0 | 1 |
| 0 | 2 |
+------+----------+
3 rows in set (Elapsed: 00:00:00.09)
无脱敏权限用户,用update迁移脱敏列数据
从结果看,update到新列的数据,是脱敏后的数据。
gbase> alter table testmask add column id2 int;
Query OK, 5 rows affected (Elapsed: 00:00:00.15)
Records: 5 Duplicates: 5 Warnings: 0
gbase> update testmask set id2=id;
Query OK, 5 rows affected (Elapsed: 00:00:00.07)
Rows matched: 5 Changed: 5 Warnings: 0
gbase> select * from testmask;
+------+------+
| id | id2 |
+------+------+
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
+------+------+
5 rows in set (Elapsed: 00:00:00.01)
查看表结构,没有脱敏属性,表示数据库内保存的id2就是0。
gbase> show create table testmask;
+----------+--------------------------------------+
| Table | Create Table |
+----------+------------------------------------------------+
| testmask | CREATE TABLE "testmask" (
"id" int(11) DEFAULT NULL MASKED WITH(FUNCTION='DEFAULT()'),
"id2" int(11) DEFAULT NULL
) ENGINE=EXPRESS DEFAULT CHARSET=utf8 TABLESPACE='sys_tablespace' |
+----------+--------------------------------------------+
1 row in set (Elapsed: 00:00:00.00)
gbase>
无脱敏权限用户,用insert select 转储脱敏列数据
转储出去的数据是脱敏后的。
gbase> create table testmask2 as select * from testmask;
Query OK, 5 rows affected (Elapsed: 00:00:00.19)
gbase> select * from testmask2;
+------+------+
| id | id2 |
+------+------+
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
| 0 | 0 |
+------+------+
5 rows in set (Elapsed: 00:00:00.00)
无脱敏权限用户,对脱敏列排序
gbase> select rowid,t.* from testmask t order by id desc;
+-------+------+------+
| rowid | id | id2 |
+-------+------+------+
| 4 | 0 | 0 |
| 3 | 0 | 0 |
| 2 | 0 | 0 |
| 1 | 0 | 0 |
| 0 | 0 | 0 |
+-------+------+------+
5 rows in set (Elapsed: 00:00:00.03)
Summary
从如上几个测试结果可以得出如下结论
- 脱敏列作为投影列返回给应用时,会被脱敏
- 脱敏列作为投影列转储到其它列或其它表时,会被脱敏
- 脱敏列作为非投影列,比如where过滤,group, order等时,不会被脱敏,而是原文参与
查看了一下oracle的 Data Redaction Policies 功能,结果一样。仅在数据返回给应用时,执行这个操作。
