0

There is a related question here: SO - Mysql Recursive Query

I am trying to determine how to create a recursive query in mysql (version>=8) which can gather all rows which are in some way 'linked' to the original row. The structure is relatively simple:

CREATE TABLE link (
  id INT,
  left_id INT,
  right_id INT
);

CREATE TABLE item (
  id INT,
  content VARCHAR(100)
);

INSERT INTO item (id, content) VALUES (1, 'first linked item');
INSERT INTO item (id, content) VALUES (2, 'unlinked item');
INSERT INTO item (id, content) VALUES (3, 'second linked item');
INSERT INTO item (id, content) VALUES (4, 'third linked item');
INSERT INTO item (id, content) VALUES (5, 'fourth linked item');
INSERT INTO item (id, content) VALUES (6, 'second group item 1');
INSERT INTO item (id, content) VALUES (7, 'second group item 2');

INSERT INTO link (id, left_id, right_id) VALUES (1, 1, 3);
INSERT INTO link (id, left_id, right_id) VALUES (2, 4, 3);
INSERT INTO link (id, left_id, right_id) VALUES (2, 5, 1);
INSERT INTO link (id, left_id, right_id) VALUES (4, 6, 7);

Here we make a number of "item" rows. At a later time we can decide to link one item to another arbitrarily, by adding a row to the "link" table. "left_id" and "right_id" are just the id of one item and another item. It means no different whether a link is done with one item as left or right.

The query should start with a specified item or items, find links with matching left_id or right_id and add items to the results, then for these new rows, check for more links, etc, until there are none left.

For the test data provided above, I would expect:

Calling query with condition "content = 'first linked item'":
Result: 
1 | 'first linked item'
3 | 'second linked item'
4 | 'third linked item'
5 | 'fourth linked item'
Calling query with condition "content = 'second group item 2'":
Result:
6 | 'second group item 1'
7 | 'second group item 2'

What facilities exist in mysql 8+ to invoke such queries efficiently?

Rboreal_Frippery
  • 2,016
  • 1
  • 19
  • 22

1 Answers1

1

You can walk the graph using a recursive CTE. For example:

with
recursive i as (
  select id, content, -1 as source 
  from item where content = 'first linked item'
 union all
  select m.id, m.content, i.id
  from i
  join link l on l.left_id = i.id or l.right_id = i.id
  join item m on m.id <> i.id and m.id <> i.source and m.id = l.left_id
              or m.id <> i.id and m.id <> i.source and m.id = l.right_id
)
select id, content from i;

Then, just change the search condition in line 3, to start from a different item.

This solution will work well, as long as the graph doesn't have cycles.

EDIT: Solution that deals with cycles

As requested, here's the solution that works well with cycles:

with
recursive i as (
  select id, content, concat(',', id, ',') as route 
  from item where content = 'first linked item'
 union all
  select m.id, m.content, concat(i.route, m.id, ',')
  from i
  join link l on l.left_id = i.id or l.right_id = i.id
  join item m on i.route not like concat('%,', m.id, ',%')
    and (m.id = l.left_id or m.id = l.right_id)
)
select distinct id, content from i;

See DB Fiddle where I introduced a cycle.

The Impaler
  • 45,731
  • 9
  • 39
  • 76
  • Appreciate the starting point. However, I tried running this on my example unmodified: https://www.db-fiddle.com/f/3Nzm3x239y4rJxGnnuXxRf/0 -- even with just the first link row, it exceeds recursion depth. – Rboreal_Frippery Mar 30 '20 at 17:05
  • Fixed it. See https://www.db-fiddle.com/f/3Nzm3x239y4rJxGnnuXxRf/0 -- Please note it will work as long as there are no cycles. – The Impaler Mar 30 '20 at 17:21
  • Thank you, I think it is fairly self documenting. However, the no-cycles requirement would likely be a debugging a usability nightmare, as a single bad insert could cause numerous timeouts. I don't believe there would be a way to index for this. Is there any mechanism to build a list of link ids such that no link could be visited twice? – Rboreal_Frippery Mar 30 '20 at 17:50
  • @Rboreal_Frippery See new solution. – The Impaler Mar 30 '20 at 18:35