0

Okay, so I'm working on a project with hierarchical data that I'm using for a book-writing app like so:

  • top-level parent (Act) - contains Act name, position (first Act), description, and text (intro text)
  • mid-level parent / child (Chapter) - contains Chapter name, position (first Chapter), description, and text (intro text)
  • bottom-level (Section) - contains Section name, position (first Section), description, and text (actual content text)

Now, I want to have a variable number of levels (for example include sub-sections that will have the full text), but I'm not entirely sure how to create a table like this efficiently. My initial thought was to have them connected by parentId with top-level having a null for parentId.

For example, if I want to call up a top-level parent in the first position, it's not a big deal. Right now, I can search for nulled parent fields and for a position of 1. To call up a "chapter" (mid-level), I do the same thing but get the id of the result and use it as a parentid.

The problem is for section, I'd have to have several sub-queries to get to the final result. If I want to ahve variable number of levels, I'd require numerous sub-queries which will be a performance hit.

I saw a similar question but did not really understand the answer or how i could use it.

I already considered some kind of taxonomy table or carrying over all parent ids into the children (ie. a section will have both chapter and act ids listed under parentid) but I'm not set on it. Feedbooks.com uses a similar hierarchy when submitting a book but they don't store data in a database, they just take an input and convert it to an output (a pdf, epub or whatever).

Oh and, I plan on building this in MySQL.

Ideas?

EDIT An easier way of imagining this scenario is with a family. Let's say you have 3 grandfathers (Bill, Bernard, Boe), who have 5 children between them (John, Joey, Josh, Jeremy, Jackson), who, in turn have all 2 children of their own (examples: Donald, Duey, Donnie). And let's say the database does not store THAT relationship but rather the relationship of what child is staying where and we don't care about the biology here.

So let's say Donnie started out living with John who lives with Bill. Donnie is John's first child and John is Bill's second child. How do you query Bill's first grandchild of the second son?

Let's say they move around and now Donald is staying with John instead. Donald is John's second child (the first child position was filled with someone else) and John is still Bill's second child. How do you query Bill's second grandchild of the second son?

What if John moves with all his children to Boe's house. Boe is the second grandfather. How do you query this information now? How would you store this type of information?

What if you throw great-grandchildren into the mix now?

Community
  • 1
  • 1
antjanus
  • 987
  • 3
  • 15
  • 30

1 Answers1

1

This is a good example to use closure tables, as on the answer you are referencing. You may want to take a look here and here.

Quoting from the first link:

The Closure Table is a design for representing trees in a relational database by storing all the paths between tree nodes.

Yaroslav
  • 6,476
  • 10
  • 48
  • 89
  • These two articles seem to explain the solution much better but I'm still confused. I understand how Closure table is made and how you can use it to easily query grandchildren of a grandfather (much easier than what I was doing) but it still doesn't solve my problem entirely. Say you want to query the grandchildren of the first child of the grandfather. I'm trying to wrap my head around it. Would I then select all of the grandchildren using the closure table, and then run a "where" subquery for parentID for a first parent? It simplifies my problem but I wonder if it's the correct solution. – antjanus Aug 03 '12 at 15:21