-1

Following is a sample of the records that I have in a table. EntriesPerSet column contains the number of records in the table for that SET_ID. I need to establish equivalence between rows belonging to different SET_IDs. Sets will be equivalent, if and only if they contain an equal number of entries per Set, and every entry has a corresponding entry in the other set (by checking values in K1 & K2). In following case, SET_IDs 1 and 2 are equivalent.

SET_ID  K1  K2  EntriesPerSet
1   a   b   4
1   c   d   4
1   e   f   4
1   g   h   4
2   a   b   4
2   c   d   4
2   e   f   4
2   g   h   4
3   a   b   5
3   c   d   5
3   e   f   5
3   g   h   5
3   i   j   5
4   a   b   3
4   c   d   3
4   e   f   3
5   a   b   4
5   c   d   4
5   e   f   4
5   p   q   4

Please help me with how to do this. Thanks!

Lamak
  • 69,480
  • 12
  • 108
  • 116

1 Answers1

0

If you just want to know which pairs are equivalent, you can just use a common table expression to get all possible combinations, and an INTERSECT to figure out which of them overlap fully;

WITH cte AS (
 SELECT DISTINCT a.SET_ID aid, b.SET_ID bid, a.EntriesPerSet
 FROM mysets a
 JOIN mysets b ON a.EntriesPerSet = b.EntriesPerSet AND a.SET_ID < b.SET_ID
)
SELECT aid, bid FROM cte
WHERE EntriesPerSet = (
 SELECT COUNT(*) FROM (
  SELECT K1,K2 FROM mysets WHERE SET_ID=aid
  INTERSECT
  SELECT K1,K2 FROM mysets WHERE SET_ID=bid
 ) a
)

An SQLfiddle to test with.

Joachim Isaksson
  • 176,943
  • 25
  • 281
  • 294
  • Just a quick question based on the output of this program. If a set has multiple sets that are equivalent, say 1 is equivalent to 2 and 3, i get the following output: 1-2, 1-3, 2-3. Is there a way to eliminate the rows like 2-3 within the common table expression itself? My ultimate goal is to identify unique buckets where i can put these sets (a bucket contains all equivalent sets). – user3525048 Apr 15 '14 at 20:46