1

my SQLite database is kinda like this:

startn rel endn
immunodeficiencies includes B-cell deficiencies
B-cell deficiencies includes Bruton agammaglobulinemia
immunodeficiencies includes T-cell deficiencies
T-cell deficiencies includes DiGeorge syndrome

There's a lot more values, but that's the gist.

I'm trying to make a recursive query that gets all the terminal child nodes of immunodeficiencies. Ie, for this example I'd like Bruton agammaglobulinemia, DiGeorge syndrome

Based on another answer, here's what I have so far, but it's not working:

with recursive tc( i )
  as ( select endn from dx_graph where startn = 'immunodeficiencies' and rel = 'includes'
        union endn from dx_graph, tc where startn = tc.i and rel = 'includes'
     )
 select * from tc;

sample data:

CREATE TABLE "dx_graph" (
    "startn"    TEXT,
    "rel"   TEXT,
    "endn"  TEXT,
)

insert into dx_graph values ('immunodeficiencies', 'includes', 'B-cell deficiencies')
insert into dx_graph values ('immunodeficiencies', 'includes', 'T-cell deficiencies')
insert into dx_graph values ('immunodeficiencies', 'includes', 'complement disorders')
insert into dx_graph values ('B-cell deficiencies', 'includes', 'Bruton agammaglobulinemia')
insert into dx_graph values ('T-cell deficiencies', 'includes', 'DiGeorge syndrome')
insert into dx_graph values ('complement disorders', 'includes', 'C1 esterase inhibitor deficiency')
insert into dx_graph values ('complement disorders', 'includes', 'C5-C9 deficiency')

1 Answers1

1

Your recursive query will find all child nodes of the original specified parent; you then need to filter them based on whether they have any child nodes or not. You can do this with a NOT EXISTS clause:

WITH RECURSIVE tc AS (
  SELECT endn
  FROM dx_graph 
  WHERE startn = 'immunodeficiencies' AND rel = 'includes'
  UNION ALL
  SELECT dx.endn
  FROM dx_graph dx
  JOIN tc ON dx.startn = tc.endn AND dx.rel = 'includes'
)
SELECT endn
FROM tc
WHERE NOT EXISTS (
  SELECT *
  FROM dx_graph dx
  WHERE dx.startn = tc.endn AND dx.rel = 'includes'
)

Output (for sample data in question):

endn
Bruton agammaglobulinemia
DiGeorge syndrome
C1 esterase inhibitor deficiency
C5-C9 deficiency

Demo on db-fiddle

The NOT EXISTS clause filters out any nodes from the recursive CTE that have a child node (i.e. the endn value from the CTE matches a startn value in the original table that has a rel='includes'). That way you only get the terminal nodes, as they have no children.

Nick
  • 138,499
  • 22
  • 57
  • 95