HierarchyID
If you’ve been following my blog you should
remember a couple of posts about hierarchical data in SQL Server, and how
Common Table Expressions can be a great way to extract that data. If you’ve not
read them, you can find them here and here.
In SQL Server 2008 we have a brand new shiny
data type called “HierarchyID”. This data type is designed precisely for storing hierarchical
data in a compact, efficient manner that makes it easy to perform common
operations, and hence massively simplifies matters when retrieving data.
So I thought we could briefly revisit the two
scenarios I’ve mentioned before and see how SQL Server 2008 makes our lives
easier. Remember in my previous posts we had a table that looked something like
this;
CREATE TABLE dbo.Employee
(
[Id] INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
[Name] NVARCHAR(50) NOT NULL,
[ManagerId] INT NULL
)
GO
ALTER TABLE [dbo].[Employee]
WITH CHECK ADD CONSTRAINT [FK_Employee_Employee]
FOREIGN KEY([ManagerId])
REFERENCES [dbo].[Employee] ([Id])
GO
This is really simple – we just have a record
with an ID, and a foreign key to the “parent” record (the employee’s manager in
this case).
Our New Table Format
With HierarchyID, the data type stores the whole
path to the current record in the “Id” column. This means we don’t have to do
crazy recursion to work out who my boss’ boss is – it’s all there in the path
all the time. This also means I don’t need a foreign key back to another record
in the same table! Confused? Let’s see it in action.
CREATE TABLE EmployeeWithHierarchyID
(
[Id] hierarchyid not null primary key,
[Name] nvarchar(50) not null
)
GO
... and that’s it. To insert some records, a few
lines such as the following can be used;
INSERT INTO EmployeeWithHierarchyID
([Id], [Name]) VALUES
(hierarchyid::GetRoot(), 'Simon
Ince'),
('/1/', 'Someone
Else'),
('/1/1/', 'Another
Person')
GO
*
I’ve used another nice feature of SQL 2008 here – separating multiple lines of
data to insert with commas.
This “path” syntax of ‘/1/1/’ and so on means that I
will be at the root of the tree, Someone Else will report to me, and Another
Person will in turn report to them. Once I’ve filled my table with data, I can
select it out as follows;
SELECT
Id,
Id.ToString() AS [Path],
Id.GetLevel() AS [Level],
Name
FROM EmployeeWithHierarchyID
GO
I’ve used a couple of functions here on the
HierarchyID data type to demonstrate what you can do;
|
Id
|
Path
|
Level
|
Name
|
|
0x
|
/
|
0
|
Adams, Ellen
|
|
0x68
|
/2/
|
1
|
Adams, Terry
|
|
0x78
|
/3/
|
1
|
Birkby, Dana
|
|
0x7C20
|
/3/4/
|
2
|
Bishop, Scott
|
|
0x8C
|
/5/
|
1
|
Caron,
Rob
|
|
0x8E50
|
/5/6/
|
2
|
Dunker,
Andrea
|
|
0x8E59C0
|
/5/6/7/
|
3
|
Francis, Cat
|
|
0x8E5A20
|
/5/6/8/
|
3
|
Gilmore, Eric
|
|
0x8E5A60
|
/5/6/9/
|
3
|
Hamilton, David
|
|
0x8E5A7540
|
/5/6/9/10/
|
4
|
Johnson, Barry
|
What you should be able to see here is a few things;
1.
The HierarchyID field is stored in a binary format, that
isn’t human readable.
2.
The ToString function immediately shows the path that the
binary format represents. So I know for example that Andrea Dunker reports to
Rob Caron, who in turn reports to Ellen Adams... just by reading the “path”
output by the ToString function.
3. The GetLevel function calculates how many steps down the hierarchy
the current record is. It can do all this based on just the binary content of
the HierarchyID field again.
Good huh? OK, so let’s get to replicating the
queries I created in my other posts.
Scenario 1: The Easy Way Round
In the first scenario all I wanted to do was
retrieve all records for employees that reported (directly or indirectly) to a
specific employee. For example, every employee reports at least indirectly to
Ellen Adams in the data above, but only Terry Adams and, Dana Birkby, and Rob
Caron report directly to her.
It turns out this query is unbelievably simple;
SELECT
Id,
Id.ToString() AS [Path],
Id.GetLevel() AS [Level],
Name
FROM EmployeeWithHierarchyID
WHERE Id.IsDescendantOf('/5/') = 1
GO
In this case, I select everyone that reports to
Rob Caron. And that’s it! The IsDescendantOf function handles the rest for you
– and of course it is very efficient because all it needs is the path
information encoded in the Id field.
Of course, here I’ve used a hard coded path
value as the parameter to IsDescendantOf, but I could equally have selected a
value from the database based on a name, or passed one from somewhere else.
Scenario 2: Reversed!
The next scenario was how to select the entire
management chain of an employee. So given a junior staff member, how do I
select his boss, his boss’ boss, his boss, and so on? This is a little more
complex, as we need to walk the stack going back up the chain.
At first thought, I assumed I’d need another
Common Table Expression; perhaps something like this?
WITH Ancestors(Id, [Name], AncestorId)
AS
(
SELECT
Id,
[Name],
Id.GetAncestor(1)
FROM
EmployeeWithHierarchyId
WHERE
Id = '/5/6/9/10/'
UNION ALL
SELECT
e.Id,
e.[Name],
e.Id.GetAncestor(1)
FROM
EmployeeWithHierarchyID
e
INNER JOIN Ancestors a
ON e.Id = a.AncestorId
)
SELECT *, Id.ToString() FROM Ancestors
GO
All you need to do is replace/parameterise the ‘/5/6/9/10/’ path that I’ve used to
select the employee we’re interested in. This works fine, but is it the optimum
way of achieving it?
Nope. Let’s try again!
DECLARE @TheEmployee hierarchyid
SELECT @TheEmployee = Id
FROM EmployeeWithHierarchyID
WHERE [Name] = 'Johnson, Barry';
SELECT
Id,
Id.ToString() AS [Path],
Id.GetLevel() AS [Level],
Id.GetAncestor(1),
Name
FROM EmployeeWithHierarchyID
WHERE @TheEmployee.IsDescendantOf(Id) = 1
GO
This time I’ve selected the employee by name,
just so I can get hold of an instance of a HierarchyID data type that relates
to him. This means I can perform operations on it just like I have done on
columns in the table before. So next, I just reverse the order of my
IsDescendantOf call in the WHERE clause – now I say “Is Barry Johnson a
descendant of the current record?”. If he is, we include it in the result set.
Simple huh?
Summary
Well, this has been a very quick post just to get you started with
HierarchyID, but I hope it has made it obvious just how much easier and more efficient it can be for most hierarchies. Enjoy!
No comments:
Post a Comment