1

I wanted to ask if there is a way to better handle hierarchical data in mysql than what I have set up currently.

In my mysql database I am storing a shallow hierarchy for a forum:

Titles(0) > Boards(1) > Threads(2) > Comments(3) type(0, 1, 2, 3)

The hierarchy is represented as follows:

Table 1: posts.title, posts.id, posts.type
post.title(title is the data within the post)
post.id(id is the primary key of the post)
post.type((what level)/(how deep) the post is in the hierarchy)

Table 2: parent_tree=(pt) pt.child, pt.parent
pt.child (primary key of a post)
pt.parent (primary key of post which is a parent of the child post)

Each post has a self referential parent_tree entry (ex: creating new post which has id=7 results in parent_tree a new entry where child is 7 and parent is 7:

Each post upon being created also copies all PARENTS of the parent it is being created under (which would also include its self reference).

Inserting (POST) is simple. Upon a new post being created, two queries are made - one for inserting the new post, the second for inserting its parent_tree counterpart, which sets child as a the foreign key (with ON DELETE CASCADE).

Getting (GET) is simple, a user can select a post by id/title and type.

Deleting (DELETE) is simple, a post selected by id/title and type is deleted (and thus its foreign key parent_tree entry).

Moving (PUT) is the most complicated of the bunch, it consists of two queries.
In this action, a post is selected (title/id) and a (parent_title/parent_id) to specify what parent it is being moved to be under.

First query - select all parent_tree entries where parent id = the id of the post about to be moved, select all parent_tree entries where child is the result of the previous selection, which returns all children of the post to be moved as well as all of their parents. Then, from this selection delete all parents, child pairs where the parent is also a parent of the selected post

(short) delete entries in parent_tree of post and its children where the parent has the post as a child.

Second query - Insert into the children of the post that is being moved, and the post itself, all parents of the new parent (which includes a self reference) the selected post is being appended under.

Here is how i do my queries in mysql for the moving (completely open to suggestions, I am relatively new to mysql and learning).

First query:

DELETE t FROM parent_tree AS t INNER JOIN parent_tree AS c ON t.child=c.child INNER JOIN parent_tree AS p ON t.parent=p.parent WHERE c.parent IN (SELECT id FROM posts WHERE title=?) AND p.child<>p.parent AND p.child IN (SELECT id FROM posts WHERE title=?)

?[0,1] = title_of_post_being_moved

Second query:

INSERT INTO parent_tree(child,parent) (SELECT t.child, p.parent FROM parent_tree AS t INNER JOIN parent_tree AS p WHERE t.parent IN (SELECT id FROM posts WHERE title=?) AND p.child IN (SELECT id FROM posts WHERE title=?) AND t.child<>p.parent)

?[0] = title_of_post_being_moved, ?[1] = title_of_new_parent_post

CKE
  • 1,533
  • 19
  • 18
  • 29
Kotz
  • 89
  • 1
  • 8
  • 1
    Your data structure is unclear, maybe provide some sample data. Moving a post x and all its children in an adjacency list should just change one value (x.parent=newparent.id), as children y of x have y.parent=x.id (and don't care about the parent of x). Are you trying some form of closure table? Anyway, your biggest concern should be how efficiently you can select the tree (sorted, filtered, ...), as that's what happens 99% of the time, while (effort for) moving a post is usually irrelevant. For an overview of different tree implementations see [here](https://stackoverflow.com/q/20215744). – Solarflare Jul 08 '18 at 09:44
  • Thank you, that answers my question. I looked up a closure table and it's what I have created. Looking through the thread you linked and reconsidering that get and post are easily going to be the most common actions on the server, I'll go ahead and move back to simply having a direct parent reference on each child. With this, the delete statement gets a lot less clean since I have to iterate parent by parent or dynamically combine mysql queries(lest i perhaps create a path field and select by using a wildcard), but as you said, it shouldn't matter much since deleting will not be done often. – Kotz Jul 08 '18 at 11:34
  • 1
    I actually didn't want to suggest to change your tree model. Your tree type was just unclear - the 2 columns of "table 2" are an adjacency list and cannot be a closure table (I think), while the description of "move" sounded like a closure table, which confused me. You can use both (or other) trees, but the codelength to "move" isn't relevant for that decision: selecting is. The adjacency list is usually the hardest one to do selects in (e.g. getting, counting or sorting all children), so, again, I didn't suggest to switch; a closure table is probably fine (or check the link for alternatives). – Solarflare Jul 08 '18 at 12:15
  • 1
    And another thing that's not really clear, and where sample data would help a lot to asset it: you may not even need a hierarchy here. You probably at least do not need a tree hierarchy between comments and e.g. boards (as a comment can never become a board), which might make your current implementation overly complicated. You need a hierarchy if you e.g. want to implement something like reddit, where you can comment on specific comments (so if you don't just add it to the end of a list of comments to post x), but if not, you may not need one here either. – Solarflare Jul 08 '18 at 12:27

0 Answers0