2

In my ruby-on-rails app, I have nested comments that can be nested an arbitrary length.

I tried different ways of storing this:

Using self joins:

belongs_to :parent, :class_name => 'Comment', :foreign_key => 'parent_id'
has_many :children, :class_name => 'Comment', :foreign_key => "parent_id"

Using ancestry gem

etc

The problem, though, is that no matter what I use, there will always be an linear number of SQL statements. (1 statement to grab all the root comments, and then 1 statement for each root's children, and then 1 statement for all the children of that, etc)

Is there a more efficient way to accomplish this?

Postgres 9.1, but hopefully backwards compatible solutions are preferred.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Razor Storm
  • 12,167
  • 20
  • 88
  • 148
  • There's a vast difference between "arbitrary" and "infinite" :-) – paxdiablo Jan 06 '12 at 01:56
  • Point noted. Though isn't it rather pedantic? Clearly, there is no possible way to store an infinitely nested structure :P – Razor Storm Jan 06 '12 at 01:58
  • 1
    Pedantry is my speciality. Well, that and anal retentiveness :-) – paxdiablo Jan 06 '12 at 02:08
  • postgres 9.1, but backwards compatible solutions are great too – Razor Storm Jan 06 '12 at 02:13
  • possible duplicate of [What is the most efficient/elegant way to parse a flat table into a tree?](http://stackoverflow.com/questions/192220/what-is-the-most-efficient-elegant-way-to-parse-a-flat-table-into-a-tree) – Bill Karwin Jan 06 '12 at 02:21
  • @RazorStorm could you define what you mean by "efficient" (space or speed)? It would also help to know how much data is stored in each of the nodes. Representing and walking the graph separately from the leaf node data might be your best bet if it's large and frequently traversed. – Nick Jan 06 '12 at 02:31
  • I am hoping for speed efficiency. At the moment, I am grabbing the first layer of comments from db, then grabbing their children, then their children, etc. As I do this, I put this data into a tree. On the view, I do a preorder traversal of the tree to render each comment. – Razor Storm Jan 06 '12 at 02:35
  • Any reason for not using an existing plugin/gem (e.g. [ancestry](https://github.com/stefankroes/ancestry))? – Nick Jan 06 '12 at 03:57
  • I tried with ancestry, (as I stated above), but it tends to create a huge amount of sql statements. I was wondering if it is possible to make this more efficient speedwise – Razor Storm Jan 06 '12 at 06:01

3 Answers3

3

You could stick with your parent_id pointer column and use find_by_sql and a WITH RECURSIVE query and let the database do all the work in one shot. Something like this:

comments = Comment.find_by_sql(%Q{
    with recursive tree(id) as (
        select c.id, c.column1, ...
        from comments c
        where c.id in (#{roots.join(',')})
        union all
        select c.id, c.column1, ...
        from comments c
        join tree on c.parent_id = tree.id
    )
    select id, column1, ...
    from tree
})

where roots would be a Ruby array holding the ids of the root nodes that you're interested in. That will give you all the nodes in the subtrees of interest as Comment instances. I've used queries like this in the past and WITH RECURSIVE was well over twice as fast as your iterative technique even with shallow trees, I'd guess that deeper trees would see even better speed ups.

The parent_id structure you're using is very convenient for most things and meshes quite well with how ActiveRecord wants to work. Also, sticking with your current structure means that you can leave the rest of your application alone.

WITH RECURSIVE is available in PostgreSQL 8.4 and higher.

mu is too short
  • 426,620
  • 70
  • 833
  • 800
  • Doing this seems to present me with a flat list. Do I have to recreate the tree structure (for rendering) algorithmically or is there a shortcut? – Razor Storm Jan 07 '12 at 21:18
  • 1
    @Razor: That will pull out the tree out of the database in one go, then you have to put it back together in Ruby. If your goal is just to render the tree then a materialized path approach (using PostgreSQL arrays for the paths) would work. – mu is too short Jan 07 '12 at 21:38
  • 1
    @muistooshort postgreSQL now has separate datatype to support materialised path named ltree and this is one of the gem which uses it and gives helper functions https://github.com/cfabianski/ltree_hierarchy – skam Jul 05 '17 at 09:11
1

Take a look a awesome_nested_set, I think you will like it.

https://github.com/collectiveidea/awesome_nested_set

0

There are a number of articles that discuss the classical (i.e. basic SQL based) options. For instance, this is a decent summary of the classical options and this article covers using recursive queries with rails.

However, since you are using Postgressql (and if not start) you have the option of using the ltree extension which creates indexes specifically suited for tree structures. I would have used ltree_hierarchy but it turns out that postgres doesn't support dashes in the ltree labels making this incompatible with using uuids for my ids. I settled on the ancestry gem despite it not using the nice features (only other real choice in 2021 is awesome_nested_set which has some documentation about how to "create a text_pattern_ops index for your postgresql column."

Peter Gerdes
  • 2,288
  • 1
  • 20
  • 28