Tag Archive for: tree

Hierarchies in SQL Server 2008

Graphs, tree algorithms and structures have been used for long time in databases to solve hierarchy related problems. Adjacency list, nested sets, materialized path, and other hybrid methods offer different capabilities to help.

SQL Server 2008 adds a new feature to help with modeling hierarchical relationships: the HIERARCHYID data type. It provides compact storage and convenient methods to manipulate hierarchies. In a way it is very much like optimized materialized path. In addition the SqlHierarchyId CLR data type is available for client applications.

While HIERARCHYID has a lot to offer in terms of operations with hierarchical data, it is important to understand a few basic concepts:

– HIERARCHYID can have only a single root (although easy to work around by adding sub-roots)
– It does not automatically represent a tree, the application has to define the relationships and enforce all rules
– The application needs to maintain the consistency

Here is a one example of employee hierarchy to illustrate the usage of HIERARCHYID and the related methods for manipulation of hierarchies.

CREATE TABLE Employees (

 emp_id INT NOT NULL PRIMARY KEY,

 emp_name VARCHAR(35),

 manager_id INT REFERENCES Employees(emp_id),

 org_chart_path HIERARCHYID);

 

/*

 

The primary key prevents cyclic paths.

Another way is using a CHECK constraint.

 

CHECK (org_chart_path.ToString() NOT LIKE '%/' + CAST(emp_id AS VARCHAR(10)) + '/_%')

 

*/

 

-- Insert the top level manager as hierarchy root

INSERT INTO Employees

VALUES (1, 'Jeff Brown', NULL, hierarchyid::GetRoot());

 

-- Insert John who reports to the top level manager                           

INSERT INTO Employees

VALUES (2, 'John Doe', 1,

           (SELECT hierarchyid::GetRoot().GetDescendant(NULL, NULL)

            FROM Employees));

 

 

-- Insert Peter at the same level as John

DECLARE @mgr HIERARCHYID = (SELECT org_chart_path

                            FROM Employees

                            WHERE emp_name = 'John Doe');

 

INSERT INTO Employees

VALUES (3, 'Peter Hanna', 1,

           (SELECT hierarchyid::GetRoot().GetDescendant(@mgr, NULL)

            FROM Employees

            WHERE org_chart_path = hierarchyid::GetRoot()));

 

-- Insert Richard as reporting to John

INSERT INTO Employees

VALUES (4, 'Richard Burns', 2,

           hierarchyid::Parse('/1/1/')); -- Also: CAST('/1/1/' AS HIERARCHYID)

 

SELECT emp_id, emp_name,

       manager_id, org_chart_path,

       org_chart_path.GetAncestor(1) AS emp_manager,

       hierarchyid::GetRoot() AS top_manager,

       org_chart_path.GetDescendant(NULL, NULL) AS emp_descendant,

       org_chart_path.GetLevel() AS emp_level,

       org_chart_path.ToString() AS emp_org_path

FROM Employees;

 

/*

 

emp_id emp_name         manager_id org_chart_path emp_manager   top_manager emp_descendant emp_level emp_org_path

------ ---------------- ----------- --------------- ------------- ------------ --------------- --------- ------------

1     Jeff Brown       NULL        0x             NULL         0x           0x58            0         /

2     John Doe         1           0x58            0x            0x           0x5AC0         1         /1/

3     Peter Hanna     1           0x68            0x            0x           0x6AC0         1         /2/

4     Richard Burns    2           0x5AC0         0x58         0x           0x5AD6         2         /1/1/

 

*/

 

-- Move Richard to report to Peter

DECLARE @new_mgr HIERARCHYID = (SELECT org_chart_path

                                FROM Employees

                                WHERE emp_name = 'Peter Hanna');

 

UPDATE Employees

SET org_chart_path = org_chart_path.Reparent(org_chart_path.GetAncestor(1),

                                            @new_mgr)

WHERE emp_name = 'Richard Burns';

 

SELECT emp_id, emp_name,

       manager_id, org_chart_path,

       org_chart_path.GetAncestor(1) AS emp_manager,

       hierarchyid::GetRoot() AS top_manager,

       org_chart_path.GetDescendant(NULL, NULL) AS emp_descendant,

       org_chart_path.GetLevel() AS emp_level,

       org_chart_path.ToString() AS emp_org_path

FROM Employees;   

 

/*

 

emp_id emp_name        manager_id org_chart_path    emp_manager   top_manager emp_descendant   emp_level emp_org_path

------- --------------- ----------- ----------------- ------------- ------------ ---------------- --------- ------------

1       Jeff Brown     NULL        0x                NULL         0x           0x58             0         /

2       John Doe        1           0x58             0x            0x           0x5AC0           1         /1/

3       Peter Hanna     1           0x68             0x            0x           0x6AC0           1         /2/

4       Richard Burns   2           0x6AC0            0x68         0x           0x6AD6           2         /2/1/

 

*/

From the above example it is very easy to see the similarity between materialized path and HIERARCHYID when the HIERARCHYID is converted to the character format using the ToString() method. Converting hierarchy from traditional parent/child format to HIERARCHYID is simple using recursive CTEs (very similar to building a materialized path).

Note:

This code has been tested with SQL Server 2008 CTP 6 (February 2008). As of SQL Server 2008 Release Candidate 0 (June 2008) the method Reparent() has been replaced with the method GetReparentedValue(). It is called using the same parameters and returns the same value.

Additional resources:

Using hierarchyid Data Types
http://msdn.microsoft.com/en-us/library/bb677173(SQL.100).aspx

Working with hierarchyid Data
http://msdn.microsoft.com/en-us/library/bb677212(SQL.100).aspx

Hierarchies with CTEs

Common table expressions (CTEs) have many applications. However, one of their capabilities to implement recursive queries is very useful for navigating and manipulating hierarchies.

Here is one brief example of utilizing that. Given a table with employees and their managers represented as adjacency list, provide list of employees that report to particular manager, ordered by the natural hierarchy order of listing each employee under the corresponding manager.

-- Create sample table with employees and managers

CREATE TABLE Employees(

 employee_nbr INT NOT NULL PRIMARY KEY,

 employee_name VARCHAR(35),

 manager_nbr INT NULL REFERENCES Employees(employee_nbr));

 

INSERT INTO Employees VALUES (1, 'John Doe', NULL);

INSERT INTO Employees VALUES (2, 'James Brown', 1);

INSERT INTO Employees VALUES (3, 'Keith Green', NULL);

INSERT INTO Employees VALUES (4, 'Peter Roth', 2);

INSERT INTO Employees VALUES (5, 'Hans Gruber', 2);

INSERT INTO Employees VALUES (6, 'Kris Evans', 4);

INSERT INTO Employees VALUES (7, 'Jeff Colleman', NULL);

 

-- Use recursive CTE to build binary and charachter paths

WITH EmployeeHierarchy

AS

(SELECT employee_nbr, employee_name, manager_nbr,

        CAST(employee_nbr AS VARBINARY(MAX)) AS bpath,

        CAST('.' +

            CAST(employee_nbr AS VARCHAR(4)) +

            '.' AS VARCHAR(MAX)) AS cpath

 FROM Employees

 WHERE manager_nbr IS NULL

 UNION ALL

 SELECT E.employee_nbr, E.employee_name, E.manager_nbr,

        H.bpath + CAST(E.employee_nbr AS BINARY(4)),

        H.cpath + CAST(E.employee_nbr AS VARCHAR(4)) + '.'

 FROM Employees AS E

 JOIN EmployeeHierarchy AS H

   ON E.manager_nbr = H.employee_nbr)

SELECT employee_nbr, employee_name, manager_nbr, bpath, cpath

FROM EmployeeHierarchy

WHERE cpath LIKE '%.2.%' -- filter all employees for manager 2

ORDER BY bpath;          -- order by natural hierarchy path

 

-- Results

employee_nbr employee_name  manager_nbr bpath                              cpath

------------ -------------- ----------- ---------------------------------- ----------

2            James Brown    1           0x0000000100000002                .1.2.

4            Peter Roth    2           0x000000010000000200000004        .1.2.4.

6            Kris Evans    4           0x00000001000000020000000400000006 .1.2.4.6.

5            Hans Gruber    2           0x000000010000000200000005        .1.2.5.

The method above creates two manager/employee paths: a binary path that preserves the natural ordering at each level, which can be used to sort; a character path that can be used to filter by manager.

Convert Tree Structure From Nested Set Into Adjacency List

Tree structures are often represented in nested set model or adjacency list model. In the nested set model each node has a left and right, where the root will always have a 1 in its left column and twice the number of nodes in its right column. On the other side the adjacency list model uses a linking column (child/parent) to handle hierarchies.

Sometimes there is a need to convert a nested set model into an adjacency list model. Here is one example of doing that:

CREATE TABLE NestedSet (

 node CHAR(1) NOT NULL PRIMARY KEY,

 lf INT NOT NULL,

 rg INT NOT NULL);

 

INSERT INTO NestedSet VALUES ('A', 1, 8);

INSERT INTO NestedSet VALUES ('B', 2, 3);

INSERT INTO NestedSet VALUES ('C', 4, 7);

INSERT INTO NestedSet VALUES ('D', 5, 6);

 

CREATE TABLE AdjacencyList (

 node CHAR(1) NOT NULL PRIMARY KEY,

 parent CHAR(1) NULL);

 

INSERT INTO AdjacencyList

SELECT A.node,

       B.node AS parent

FROM NestedSet AS A

LEFT OUTER JOIN NestedSet AS B

  ON B.lf = (SELECT MAX(C.lf)

            FROM NestedSet AS C

            WHERE A.lf > C.lf

               AND A.lf < C.rg);


-- Results
node parent
------ --------
A NULL
B A
C A
D C

Additional resources:

Book: “Trees and Hierarchies in SQL for Smarties” by Joe Celko

Adjacency List Model
http://www.sqlsummit.com/AdjacencyList.htm

Trees in SQL
http://www.intelligententerprise.com/001020/celko.jhtml?_requestid=1266295