0

I have this SQLite table:

CREATE TABLE seed (
  code TEXT PRIMARY KEY,
  name TEXT,
  grams_per_seed REAL,
  mother TEXT REFERENCES seed (code) DEFERRABLE INITIALLY DEFERRED,
  notes TEXT,
  mother_notes TEXT
) STRICT;

containing data including this:

sqlite> select code, mother from seed
   ...> where code in ("RP20-3", "RP19-1", "BC18-MG", "BC18-TI");
code     mother
-------  -------
BC18-MG
BC18-TI
RP19-1   BC18-TI
RP20-3   RP19-1

Note the ancestry chain that goes RP20-3RP19-1BC18-TI. There are many of these in the real data.

What I want is a query that maps the seed codes to the length of their ancestry chains. For example:

sqlite> select [... something ...] from [... something else ...];
code     a_ct
-------  ----
BC18-MG     0
BC18-TI     0
RP19-1      1
RP20-3      2

I’ve gotten partway there with this kludge, but I know it’s not right because the maximum chain length is encoded in the join width:

sqlite> select child.code, parent.code, grand.code
   ...> from seed as child, seed as parent, seed as grand
   ...> where     child.code = "RP20-3"
   ...>       and child.mother = parent.code
   ...>       and parent.mother = grand.code;
code    code    code
------  ------  -------
RP20-3  RP19-1  BC18-TI

I’m pretty sure this requires a recursive query, but I haven’t been able to get it to work. I’ve tried several queries designed to return one row per member of the chain that I can count, but I get either no rows or an infinite number of them.

This question is very similar to what I want, but it returns ancestors only for a single item, rather than all items.

I’ve found a number of examples of recursive queries, including parent-child examples. It seems like this should be relatively straightforward, but I’m getting stuck somewhere. I do have a good handle on recursion in procedural languages. What am I missing?

Reid
  • 1,999
  • 3
  • 17
  • 25

2 Answers2

1

It may be useful, examples with recursive queries

with recursive r as (
 select code chainhead,1 chainnum,code,mother from seed s1 
 where not exists(select * from seed s2 where s2.code=s1.mother)
 union all 
 select chainhead ,chainnum+1 n,s.code,s.mother
 from r inner join seed s on s.mother=r.code
 )
select chainhead
  ,min(case when chainnum=1 then code else null end) link0 
  ,min(case when chainnum=2 then code else null end) link2 
  ,min(case when chainnum=3 then code else null end) link3 
  ,min(case when chainnum=4 then code else null end) link4 
from r
group by chainhead;
chainhead link0 link2 link3 link4
BC18-MG BC18-MG (null) (null) (null)
BC18-TI BC18-TI RP19-1 RP20-3 (null)

You may add as many as needed columns "linkN". In all cases column count is fixed.

In next example, accumulate chain in text. Limitation is recursion depth, or max TEXT size, not fixed.

with recursive rs as (
 select code chainhead,1 linknum,code,mother, code path from seed s1 
 where not exists(select * from seed s2 where s2.code=s1.mother)
 union all 
 select chainhead ,linknum+1 n,s.code,s.mother,r.path||','||s.code
 from rs r inner join seed s on s.mother=r.code
 )
select * from rs;
chainhead linknum code mother path
BC18-MG 1 BC18-MG BC18-MG
BC18-TI 1 BC18-TI BC18-TI
BC18-TI 2 RP19-1 BC18-TI BC18-TI,RP19-1
BC18-TI 3 RP20-3 RP19-1 BC18-TI,RP19-1,RP20-3
ValNik
  • 1,075
  • 1
  • 2
  • 6
0

This is the query I ended up with (very similar to @ValNik’s):

with recursive rs as (
    select id, mother, 1 as gen, id as path
        from seed s1 
        where not exists (select 1 from seed s2 where s2.id = s1.mother)
    union all 
    select s.id, s.mother, gen + 1, s.id || ',' || r.path
        from rs r inner join seed s on s.mother = r.id )
select * from rs
order by gen desc, id
Reid
  • 1,999
  • 3
  • 17
  • 25