0

Similar to this question: How do I query for all the nodes between two nodes in a tree?

But I do not have a closure (flattened) table, a child can have many parents, and ID traversal is not necessarily in order. There is no limit to the nesting depth.

Assume a circular reference is impossible... I would like to return all rows required to traverse the hierarchy.

Assume the following table:

ParentID    ID    RowNumber(Reference)
1           2     1
2           4     2
4           3     3
3           5     4
1           6     5
6           7     6
2           8     7
3           9     8
1           8     9
6           8     10

Given 1 how would I write a single query to return all the rows (get all descendants' relationsips)?

Likewise, given 2 I would expect rows 2,3,4,7,8

Given 6 I would expect rows 6 and 10

An occasional false positive is acceptable as are duplicated rows in the result. A missing row is unacceptable

Implementing in MSAccess and SQL Server 2000+

Community
  • 1
  • 1
Matthew
  • 10,244
  • 5
  • 49
  • 104
  • Just to clarify, you do mean a child can have many *parents* as opposed to many *ancestors*, right? I only ask because the sample data you posted does not show any children with multiple parents, though 4,3,5,7 have multiple ancestors. – mwolfe02 Mar 08 '11 at 19:43
  • A child can have many parents *and* many descendants. I will add more examples – Matthew Mar 08 '11 at 19:46
  • Also, a sample query result that you are hoping to achieve based on sample data would be immensely helpful. Be sure to use sample data that captures all possible complexity (ie, children with multiple parents, etc). – mwolfe02 Mar 08 '11 at 19:48
  • What is the maximum depth, if any? You can't do this in a query (SQL Server 2000), a stored procedure - maybe – RichardTheKiwi Mar 08 '11 at 19:56
  • @Richard: "There is no limit to the nesting depth" – mwolfe02 Mar 08 '11 at 20:25
  • Here's the laundry list of options on how to do this: http://stackoverflow.com/questions/4048151/what-are-the-options-for-storing-hierarchical-data-in-a-relational-database – orangepips Mar 09 '11 at 17:29

2 Answers2

3

For SQL Server: Adjacency list vs. nested sets: SQL Server

For Jet/MS Access, recursive queries are not an option, so nested sets would be the way to go. For a sample: http://www.mvps.org/access/queries/qry0023.htm

Some background on nested sets:

To implement a nested set solution you would need to add and maintain two additional columns in your table: Lt and Rt(left and right, respectively). You populate these columns by executing a modified preorder tree traversal to assign values to these columns. This can be done most easily with a recursive function. You can then use the left and right values to determine descendants at SELECT time.

The tradeoff is more processing required whenever data is changed but much faster execution when data is retrieved.

The concept is somewhat non-intuitive and certainly has a learning curve, but I have personally used it to great effect. As far as I know, it is the only way to accomplish what you are after using only SELECT queries in Jet (the MS Access db engine).

Sample Nested Set Solution:

ParentID    ID  Lt  Rt  RowNumber(Reference)
Null        1    1  18  0
1           2    2  13  1
2           4    3  10  2
4           3    4   9  3
3           5    5   6  4
1           6   14  17  5
6           7   15  16  6
2           8   11  12  7
3           9    7   8  8

Then to get all descendants of ID 2:

SELECT * FROM Tbl WHERE Lt Between 2 And 13

Here's what the tree looks like graphically:

Modified Preorder Tree

mwolfe02
  • 23,787
  • 9
  • 91
  • 161
  • 1
    Upon closer examination it looks like recursive queries are only available in SQL Server 2005+. – mwolfe02 Mar 08 '11 at 20:05
  • 1
    Please note that the first link also includes an implementation for nested sets in SQL Server that will work in SQL Server 2000. The adjacency list samples in that link will only work in SQL Server 2005+. – mwolfe02 Mar 08 '11 at 20:17
  • The graphical representation of the tree is very helpful for understanding the Rt/Lt data... but what determines whether a number will be on the left or the right? In your graphic you have 2 left of 6, 4 left of 8 left of 7 and 5 left of 9... but the tree would be traversable regardless of what side the number orient on... though the Rt/Lt values would be different. – Matthew Mar 08 '11 at 21:58
  • 1
    I also struggle with how the Lt/Rt data can be assigned when multiple parents are allowed. – Matthew Mar 08 '11 at 22:01
  • Please see my edit for a multi-parent example in which case the set is no longer completely nested. Given your drawing, assume a line connecting 1 <-> 8 and 6 <-> 8.... There is still no circular reference and the relationship is completely valid but cannot be represented as a nested set. – Matthew Mar 08 '11 at 22:09
  • 1
    The left/right order of items at the same level does not matter. For example, if you switched 2 and 6, 2 would end up with diff lt/rt values. But then you would just use those values in place of 2/13 in your query and you'd still get all of 2's descendants. – mwolfe02 Mar 08 '11 at 22:11
  • OK, that was what I was getting at in my comment to your original question when I asked you to provide sample data showing all complexity. Your original (and first revision) sample data could be modeled using a nested set/modified preorder tree traversal. – mwolfe02 Mar 08 '11 at 22:14
  • 1
    You're right... sorry for my lack of clarity. I *did* specify multiple parents though in my original post (which fall outside the scope of the nested set model. Nonetheless, I really like the approach I just don't think it is applicable here. – Matthew Mar 08 '11 at 22:16
  • You did specify multiple parents (and I did read that). I was trying to clarify how strictly you were using that technical term. I was asking specifically because I had this solution in mind. Unfortunately, it will not work in your case (it is a good one to keep in mind for when it is applicable). Anyway, I've posted a new solution that should support your specific case. I'll leave this one separate for posterity. – mwolfe02 Mar 08 '11 at 22:26
1

Since you need to model data where nodes can have multiple parents, a nested set/MPTT solution will not work. Another alternative is the use of a closure table.

You would create an additional table that held pairs of items for every ancestor's descendant (and vice versa):

AncID  DesID
  1      2
  1      6
  1      4
  1      8
  1      7
  1      3
  1      5
  1      9
  2      4
  2      8
  2      3
  2      5
  2      9
  4      3
  4      5
  4      9
  3      5
  3      9
  6      7

Then you would use a join to get the items you need:

SELECT * 
FROM Tbl INNER JOIN Closure ON Tbl.ID=Closure.DesID 
WHERE Closure.AncID = 2
Community
  • 1
  • 1
mwolfe02
  • 23,787
  • 9
  • 91
  • 161
  • I'm familiar with this method (and it was employed in the question I linked in my OP). It would require additional processing for data-writing which is not desirable. I am beginning to believe though that it is the only viable method to get the results in a single query. Note that in my question I called it a "flattened" table rather than the accepted name you've used. – Matthew Mar 08 '11 at 22:34
  • Unless you are using a db engine with proprietary support for adjacency list models (Oracle, SQL Server 2005+) you will need to do additional processing either at write-time or read-time. If you want to avoid managing additional info, you can loop through recordsets at read-time. This will be slower than using queries, but it will require less maintenance. It's a tradeoff. It just depends on your specific requirements. – mwolfe02 Mar 09 '11 at 12:21
  • Though I will not be implementing the transitive closure table I am marking this as the answer because it's a viable solution (just not the one I plan to use) – Matthew Mar 31 '11 at 19:50