Storing Hierarchical data by 【Nested Sets】

如何在数据库中存储层次结构

常见场景

  1. 公司:公司 - 部门 - 子部门
  2. 人员:领导 - 员工
  3. 文件:根目录 - 文件夹 - 文件
  4. 关系:group - child

实例

转成树型

回头看

之前提到的几种方案(Adjacency_List, Path_Enumerations, Closure_Table)都能够一定程度地满足需求,但是各自具有不可避免的弊端。
Adjacency_List: 层级查询 -> ∞
Path_Enumerations: 深度限制
Closure_Table: 空间消耗大,层级删改 -> ∞

Nested Sets

根据树的深度遍历对节点编号,记录首、末次访问到该节点的数字。通过比较数字或的层级结构关系。

idnameleftright
1DIR_A130
2DIR_B211
3DIR_C1213
4file_x2425
5file_y2627
6file_z2829
7DIR_E36
8file_o78
9file_p910
10DIR_D1419
11file_m2021
12file_n2223
13file_l45
14file_j1516
15file_k1718

各种情况的处理代价

代价:-> O(n),更新会影响到其他子树
输入:name. parent_id 执行:

1
2
3
4
5
6
parent = $(select * from table where id = $id);

insert into table(name, left, right) values($name, $parent.right, $parent.right + 1);

update table set left = left + 2 where left >= $parent.right;
update table set right = right + 2 where right >= $parent.right;

代价:-> O(n),先删除节点以及子树,并对其他子树进行修改
输入:id
执行:

1
2
3
4
5
6
7
current = $(select * from table where id = $id);

delete from table where left >= $current.left and right <= $current.right;

d = current.right - current.right + 1;
update table set left = left - $d where left > $current.right;
update table set right = right - $d where right > $current.right;

代价:-> O(1)
id, other info
执行:

1
update table set info where id = $id;

查自己

代价:-> O(1)
输入:id
执行:

1
select * from table where id = $id
查下一级

代价:-> O(n)
输入:id
执行:

1
2
3
4
5
6
7
select distinct Child.Node, Child.Left, Child.Right
from table as child, table as parent
where parent.left < child.left and parent.right > child.right  -- associate Child Nodes with ancestors
group by child.node, child.left, child.right
having max(parent.left) = $parent.left  -- Subset for those with the given Parent Node as the nearest ancestor

这类查询可以通过增加一列来简化。例如,增加depth列记录当前节点深度,或者parent_id列记录父节点(和Adjacency List混用,但增加了维护成本
查所有子集

代价:-> O(n)
输入:path
执行:

1
2
3
4
5
select * 
from table 
where left > $parent.left 
and right < $parent.right 
order by left asc;

移动

代价:-> O(n)
输入:id, new_parent_id 执行:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51

## 先执行 DEL,再执行 ADD

current = $(
    select * 
    from table 
    where id = $id
);

## 查询父节点
old_parent = $(
    select * from table 
    where left < $current.left 
    and right > $current.right
    order by left desc
    limit 1
);

-- 查询移动节点以及子节点的总数
node_count = $(
    select count(*) 
    from table
    where left >= $current.left 
    and right <= $current.right
);


update table 
set left = left - node_count*2, 
where left > $current.right;

update table 
set right = right - node_count*2
where right > $current.right;

## 更新子节点
update table 
set left = left + ($new_parent.right - $current.left), 
    right = right + ($new_parent.right - $current.left)
where left >= $current.left 
and right <= $current.right

$current.left = $new_parent.right;
$current.right = $new_parent.right + 1;
update table 
set left = left + node_count*2,
where left > $current.right;

update table 
set right = right + node_count*2
where right > $current.right;

总结

可以和邻接表同时使用
优点 : 消除递归操作,实现无限分组
缺点 : 增 删 改影响范围大
适用 : 强要求无限层级深度

比较

T查自己查下一级查所有子集移动适用
Adjacency ListO(1)O(1)O(1)O(n)O(n)O(1)不涉及“所有子集”操作,严格按照层级一层层地查询
Closure TableO(n)O(n)O(1)O(1)O(n)O(n)O(n)有固定的层级深度,并且层级不多
PathO(1)O(n)O(n)O(1)O(n)O(n)O(n)对层级深度有一定限制,并且需求对所有子集进行操作
Netsted SetsO(n)O(n)O(1)O(1)O(n)O(n)O(n)强要求无限层级深度
comments powered by Disqus