5

I'm having some trouble doing a "tree-like" query (what do we call this?) in SQL.

Take a look at my diagram below (table and column names are in danish - sorry about that):

DB diagram http://img197.imageshack.us/img197/8721/44060572.jpg Using MSSQL Server 2005, the goal is to find the most parent group (Gruppe), for each customer (Kunde).

Each group can have many parent groups and many child groups.

And, I would also like to know how to display the tree like this:

Customer 1
   - Parent group 1
      - Child group 1
         - ChildChild group n
      - Child group n
   - Parent group n
      - ...
         - ...
Customer n
   - ...

Another question:

How does the query look to get ALL the groups for all the customers? Parent and child groups.

Tommy Jakobsen
  • 2,323
  • 6
  • 39
  • 66
  • 1
    I believe a common name for this type of data is "hierarchical" and you want the "top ancestor." You can use recursive queries (see http://www.codeproject.com/KB/architecture/RoleBasedSecurity.aspx) to do this. – Blixt Jul 16 '09 at 08:05

6 Answers6

5

You can use CTE's to construct "the full path" column on the fly

--DROP TABLE Gruppe, Kunde, Gruppe_Gruppe, Kunde_Gruppe
CREATE TABLE Gruppe (
    Id                  INT PRIMARY KEY
    , Name              VARCHAR(100)
    )
CREATE TABLE Kunde (
    Id                  INT PRIMARY KEY
    , Name              VARCHAR(100)
    )
CREATE TABLE Gruppe_Gruppe (
    ParentGruppeId      INT
    , ChildGruppeId     INT
    )
CREATE TABLE Kunde_Gruppe (
    KundeId             INT
    , GruppeId          INT
    )

INSERT      Gruppe
VALUES      (1, 'Group 1'), (2, 'Group 2'), (3, 'Group 3')
            , (4, 'Sub-group A'), (5, 'Sub-group B'), (6, 'Sub-group C'), (7, 'Sub-group D')

INSERT      Kunde
VALUES      (1, 'Kunde 1'), (2, 'Kunde 2'), (3, 'Kunde 3')

INSERT      Gruppe_Gruppe
VALUES      (1, 4), (1, 5), (1, 7)
            , (2, 6), (2, 7)
            , (6, 1)

INSERT      Kunde_Gruppe
VALUES      (1, 1), (1, 2)
            , (2, 3), (2, 4)

;WITH       CTE
AS          (
            SELECT      CONVERT(VARCHAR(1000), REPLACE(CONVERT(CHAR(5), k.Id), ' ', 'K')) AS TheKey
                        , k.Name        AS Name
            FROM        Kunde k

            UNION ALL

            SELECT      CONVERT(VARCHAR(1000), REPLACE(CONVERT(CHAR(5), x.KundeId), ' ', 'K')
                             + REPLACE(CONVERT(CHAR(5), g.Id), ' ', 'G')) AS TheKey
                        , g.Name
            FROM        Gruppe g
            JOIN        Kunde_Gruppe x
            ON          g.Id = x.GruppeId

            UNION ALL

            SELECT      CONVERT(VARCHAR(1000), p.TheKey + REPLACE(CONVERT(CHAR(5), g.Id), ' ', 'G')) AS TheKey
                        , g.Name
            FROM        Gruppe g
            JOIN        Gruppe_Gruppe x
            ON          g.Id = x.ChildGruppeId
            JOIN        CTE p
            ON          REPLACE(CONVERT(CHAR(5), x.ParentGruppeId), ' ', 'G') = RIGHT(p.TheKey, 5)
            WHERE       LEN(p.TheKey) < 32 * 5
            )
SELECT      *
            , LEN(TheKey) / 5 AS Level
FROM        CTE c
ORDER BY    c.TheKey

Performance might be sub-optimal if you have lots of reads vs rare modifications.

wqw
  • 11,771
  • 1
  • 33
  • 41
4

I just can't say it better than Joe Celko. The problem is usually that the models built doesn't lend themselves well to build hierarchies, and that those models have to take in consideration the characteristics of your hierarchy. Is it too deep? Is it too wide? Is it narrow and shallow?

One key to success on wide and shallow trees is to have the full path in the hierarchy in a column, like Celko mentions in the first link.

Vinko Vrsalovic
  • 330,807
  • 53
  • 334
  • 373
3

I came up with a solution that solves the problem of listing ALL the groups for each customer. Parent and child groups.

What do you think?

WITH GroupTree
AS
(
    SELECT kg.KundeId, g.Id GruppeId
    FROM ActiveDirectory.Gruppe g
    INNER JOIN ActiveDirectory.Kunde_Gruppe kg ON g.Id = kg.GruppeId
    AND (EXISTS (SELECT * FROM ActiveDirectory.Gruppe_Gruppe WHERE ParentGruppeId = g.Id)
    OR NOT EXISTS (SELECT * FROM ActiveDirectory.Gruppe_Gruppe WHERE ParentGruppeId = g.Id))

    UNION ALL

    SELECT GroupTree.KundeId, gg.ChildGruppeId
    FROM ActiveDirectory.Gruppe_Gruppe gg
    INNER JOIN GroupTree ON gg.ParentGruppeId = GroupTree.GruppeId
)
SELECT KundeId, GruppeId
FROM GroupTree

OPTION (MAXRECURSION 32767)
Tommy Jakobsen
  • 2,323
  • 6
  • 39
  • 66
2

How about something like this:

DECLARE @Customer TABLE(
        CustomerID INT IDENTITY(1,1),
        CustomerName VARCHAR(MAX)
)

INSERT INTO @Customer SELECT 'Customer1'
INSERT INTO @Customer SELECT 'Customer2'
INSERT INTO @Customer SELECT 'Customer3'

DECLARE @CustomerTreeStructure TABLE(
        CustomerID INT,
        TreeItemID INT
)

INSERT INTO @CustomerTreeStructure (CustomerID,TreeItemID) SELECT 1, 1
INSERT INTO @CustomerTreeStructure (CustomerID,TreeItemID) SELECT 2, 12
INSERT INTO @CustomerTreeStructure (CustomerID,TreeItemID) SELECT 3, 1
INSERT INTO @CustomerTreeStructure (CustomerID,TreeItemID) SELECT 3, 12

DECLARE @TreeStructure TABLE(
        TreeItemID INT IDENTITY(1,1),
        TreeItemName VARCHAR(MAX),
        TreeParentID INT
)

INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001', NULL
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.001', 1
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.001.001', 2
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.001.002', 2
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.001.003', 2
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.002', 1
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.003', 1
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.003.001', 7
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.001.002.001', 4
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.001.002.002', 4
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '001.001.002.003', 4

INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '002', NULL
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '002.001', 12
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '002.001.001', 13
INSERT INTO @TreeStructure (TreeItemName,TreeParentID) SELECT '002.001.002', 13

;WITH Structure AS (
    SELECT  TreeItemID,
            TreeItemName,
            TreeParentID,
            REPLICATE('0',5 - LEN(CAST(TreeItemID AS VARCHAR(MAX)))) + CAST(TreeItemID AS VARCHAR(MAX)) + '\\' TreePath
    FROM    @TreeStructure ts
    WHERE   ts.TreeParentID IS NULL
    UNION ALL
    SELECT  ts.*,
            s.TreePath + REPLICATE('0',5 - LEN(CAST(ts.TreeItemID AS VARCHAR(5)))) + CAST(ts.TreeItemID AS VARCHAR(5)) + '\\' TreePath
    FROM    @TreeStructure ts INNER JOIN
            Structure s ON ts.TreeParentID = s.TreeItemID
)

SELECT  c.CustomerName,
        Children.TreeItemName,
        Children.TreePath
FROM    @Customer c INNER JOIN
        @CustomerTreeStructure cts ON c.CustomerID = cts.CustomerID INNER JOIN
        Structure s ON cts.TreeItemID = s.TreeItemID INNER JOIN
        (
            SELECT  *
            FROM    Structure
        ) Children ON Children.TreePath LIKE s.TreePath +'%'
ORDER BY 1,3
OPTION (MAXRECURSION 0)
Adriaan Stander
  • 162,879
  • 31
  • 289
  • 284
1

In T-SQL, you can write a while loop. Untested:

@group = <starting group>
WHILE (EXISTS(SELECT * FROM Gruppe_Gruppe WHERE ChildGruppeId=@group))
BEGIN
  SELECT @group=ParentGruppeId FROM Gruppe_Gruppe WHERE ChildGruppeId=@group
END
Martin v. Löwis
  • 124,830
  • 17
  • 198
  • 235
1

We use SQL Server 2000 and there is an example of expanding hierarchies using a stack in the SQL Books Online, I have written a number of variants for our ERP system

http://support.microsoft.com/kb/248915

I gather that there is a Native method using CTE within SQL 2005 but I have not used it myself