SQL Find all transitive matches throughout two columns of matching products

Question

I have a table of product numbers, where the product on the left is an equivalent for the product on the right. Assuming something similar to the transitive property, if product A is a match for B, and B is a match for C, then A is also a match for C. This extends out as far as possible, where if C is also a match for D, then B and A are also matches for D.

I'm trying to find a way that I can snake through both lists and, using a flag, indicate which products group together.

I've tried a few combinations of self-joins and cross-joins but there are too many matches in each group (potentially hundreds) across too many rows of products (about 50,000). I think a recursive CTE may be a solution, but I'm still a bit fuzzy on how they work exactly.

Basically, my data looks like this:

Product1 | Product2
  123    |   124
  124    |   126
  125    |   123
  126    |   125
  127    |   129
  128    |   127
  129    |   130
  130    |   128

In this case,

123 = 124 = 125 = 126

and

127 = 128 = 129 = 130

Therefore, the final output should look like:

Product | Group
  123   |   1
  124   |   1
  125   |   1
  126   |   1
  127   |   2
  128   |   2
  129   |   2
  130   |   2

I realize that SQL may not be the best application to solve this problem, but this is part of an overall business process that all runs in SQL, so I don't have the option to use Python or something that is more flexible and suited for this kind of task. Any help would be very appreciated.

If I'm not mistaken, this question is very similar to the one asked [here](https://stackoverflow.com/q/35254260/243373). There are two answers to this question: one with recursive CTEs and one a bit more convoluted. They both give the correct output, just note that for very large input sets the convoluted answer (mine ;)) will perform better. — TT., Sep 20 '17 at 09:31
That's amazing! The output is exactly what I'm looking for based on the post. I'll need some time to go through it and understand how it's working but it's such a relief to know a solution exists. — happyhippo83, Sep 20 '17 at 11:25

score 0 · Answer 1 · answered Sep 20 '17 at 08:02

0

You'd better construct a undirected graph and DFS it, finding all cycle, each cycle is a group, results are these groups.

answered Sep 20 '17 at 08:02

VbcPoc

21

SQL Find all transitive matches throughout two columns of matching products

1 Answers1