2

I have a table of "posts" and "comments". Posts and comments have a post_id, but comments also have a parent_id that refers to another post in the same table.

I'm trying to find the most efficient way to pull the N most recent posts with their K most recent comments. The ordering I'm looking for is something like:

Post 1, Comment 1 (Parent 1), Comment 2 (Parent 1), .., Post 2, Comment 1 (Parent 2), Comment 2 (Parent 2), .... etc.

EDIT: My system is definitely only going to be a 2 level system (i.e. comments exist under posts, but it doesn't get any 'deeper'). I need high performance on both reads and writes but there will likely be about 5x-10x the number of reads as writes over a given interval. As such, I feel that a multi-table solution (i.e. separate tables for posts and comments) might result in higher than optimal cost for the reads. I've looked at the solution @BillKarwin linked to, which is great for trees or arbitrary depth but given that I know this constraint for my case I feel like there may be room to improve efficiency (maybe not!).

Thanks again to all who have responded, and thanks in advance to those who might.

DJSunny
  • 1,970
  • 3
  • 19
  • 27
  • 1
    possible duplicate of [What is the most efficient/elegant way to parse a flat table into a tree?](http://stackoverflow.com/questions/192220/what-is-the-most-efficient-elegant-way-to-parse-a-flat-table-into-a-tree) – Bill Karwin Sep 16 '11 at 00:47
  • @BillKarwin Thanks for pointing me to that answer. Given that the structure I'm building is only one level of depth, do you still feel that is the most most efficient methodology? The application is very heavy on both reads and writes, so speed for both are essential. I can't help but thinking the extra insert operation can be removed for better performance. Thanks again! – DJSunny Sep 16 '11 at 02:12
  • You need posts and comments to be in separate tables, so you need to do a second insert anyway. – Bill Karwin Sep 16 '11 at 03:03
  • 1
    @BillKarwin I actually don't need to have posts and comments in separate tables. I can generalize everything as a "post", but comments simply have a non null parent_id. In every other way they are exactly the same with the exception of how they are handled by the UI. Given that, is the optimal solution with speed in mind still going to be a two table approach? or do you see an efficient query set for the single table method? Much appreciated. – DJSunny Sep 16 '11 at 03:16

1 Answers1

0

This should give you the first N posts and their comments in the order you wanted.

Select post_id,parent_id from table 
    where coalesce(parent_id, post_id) in (
        select post_id from table 
            where parent_id = '' 
            order by date desc limit N
        )
    order by coalesce(parent_id, post_id), parent_id desc, date desc;

Not sure if it is the most efficient way to do it though.

TJ Shah
  • 435
  • 4
  • 17
  • 1
    What version of MySQL are you using for this query to be valid? As far as I know, in all versions of MySQL 5.x, that query provides the following error: "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'". That is, using 'in' on a subquery that uses a limit is a no-no. – Paragon Sep 16 '11 at 22:28