5

I'm using a table that has implemented a single-linked list (id, parent). This implementation has been working well except recently performance has become unbearable since my lists are getting long and I've been querying nodes individually.

I found a promising blog on how to query this in a single query. http://explainextended.com/2009/03/25/sorting-lists/

SELECT  @r AS _parent,
        @r := (
        SELECT  id
        FROM    t_list
        WHERE   parent = _parent
        ) AS id
FROM    (
        SELECT  @r := 0
        ) vars,
        t_list

Only thing is I'm not MySQL savvy enough to even use it. Questions I have are same as I posted on the blogs comments. how to set which record/node to start from? Like if I wanted to start from id 3 in the example table. And how does it know when it hits the end of the list and should stop? I’ve tried it out and it just runs forever (likely due to improper use related to the former question).

Thanks.

btd
  • 404
  • 3
  • 12

1 Answers1

5

The query works by iterating over the t_list table (the last line). For each row in this table, the sub-query in the SELECT clause re-queries the table, searching for the current row's child (WHERE parent = _parent -- but _parent is an alias for @r). At each iteration, the child's id is assigned to the @r variable.

To add boundaries, this variation should do the trick:

SELECT * FROM (
    SELECT
        @r AS _parent,
        @r := (
            SELECT id
            FROM t_list
            WHERE
                ( @c = 0 AND _parent IS NULL AND parent IS NULL ) -- special case if the first item is the root
                OR (parent = _parent)
        ) AS id,
        @c := @c + 1 AS rank
    FROM (
        SELECT @c := 0, @r := parent FROM t_list WHERE id = @start
    ) AS ini,
    (
        SELECT id FROM t_list LIMIT @limit
    ) AS lim
) AS tmp WHERE id IS NOT NULL;

Replace @start and @limit with the id of the first item, and the maximum number of items to retrieve, respectively. Please test it here.


Modeling such a data structure with a RDBMS is probably a bad idea altogether. Why not just use an "index" column? Getting the list then becomes instant:

SELECT * FROM list ORDER BY index_column ASC;

Maybe your list is meant to change frequently, but queries like this should be fairly fast unless the list grows really large:

-- insert an element at position X 
UPDATE list SET index_column = index_column +1 WHERE index_column > X ORDER BY index_column DESC;
INSERT INTO list VALUE (some_value, X);

-- delete an element at position X 
DELETE FROM list WHERE index_column = X;
UPDATE list SET index_column = index_column -1 WHERE index_column > X ORDER BY index_column ASC;
RandomSeed
  • 29,301
  • 6
  • 52
  • 87
  • I will keep this strategy in mind for future databases I own. It's much simpler. Unfortunately, I don't have control over the database schema so I've got to make this linked-list query work. The lists are generally 1-3k nodes long and do change frequently. – btd Jul 08 '13 at 23:06
  • Glad to help. If you have some time to spare, I am curious to know how this query performs on a large data set but within a short range (eg. 10 items), in comparison with a simple `JOIN` as suggested in [this answer](http://stackoverflow.com/a/675184/1446005) to a similar question. – RandomSeed Jul 09 '13 at 00:29
  • I gave both methods a shot and they returned in less than 0.000 sec for 10 results. – btd Jul 09 '13 at 17:44
  • I think the sample set needs to be more sophisticated (SELECT * FROM t_list WHERE id BETWEEN @lower_limit AND @upper_limit). This implies that your nodes will be relatively close to each other in the table. This is not the case for me. I'm trying to get it so that when there is no record with a next_id of the current id, it stops. – btd Jul 09 '13 at 17:47
  • @btd Yes, on second thought, my solution just doesn't seem to work at all. I was still making the biaised assumption that `id`'s are sequential. Please see my edit with a second attempt. This one should work. This is one of the ugliest things I have ever written, thanks for this opportunity, it was fun! Please let me know if it runs quickly on your data set. – RandomSeed Jul 09 '13 at 18:52
  • It totally worked... in 7 seconds (list of 10 nodes). Thanks for the effort none-the-less.. uglyness is relative to perspective. – btd Jul 09 '13 at 22:55