Show distinct tuples regardless of column order

Question

Say I have following results

----------------------
|   col1  |   col2   |
----------------------
|    a    |    b     |
|    b    |    a     |
|    c    |    d     |
|    e    |    f     |
----------------------

I would like to get distinct tuple regardless of column order. In other words, (a, b) and (b, a) are considered "same" because changing the order make one same as the other (a, b) == (a, b). So, after executing query should be:

----------------------
|   col1  |   col2   |
----------------------
|    a    |    b     | // or (b, a)
|    c    |    d     |
|    e    |    f     |
----------------------

Can any query expert help me on this? I've been stuck for few hours and wasn't able to solve this.

Below is my detailed scenario I'm working on.

I have the following relations:

Ships(name, country) // ("Lincoln", "USA") = "Ship Lincoln belongs to USA"
Battles(ship, battleName) // ("Lincoln", "WW2") = "Ship Lincoln fought in WW2"

And I need to find: List all pairs of countries that fought each other in battles

I was able to find all pairs by executing below query:

 SELECT DISTINCT c1, c2
 FROM
 (SELECT DISTINCT s1.country as c1, battleName as b1
  FROM Ships as s1, Battles
  WHERE s1.name = ship) as t1
 JOIN
 (SELECT DISTINCT s2.country as c2, battleName as b2
  FROM Ships as s2, Battles
  WHERE s2.name = ship) as t2
 ON (b1 = b2)
 WHERE c1 <> c2

And the result of executing above query is:

---------------------------------
|       c1      |       c2      |
---------------------------------
|       USA     |     Japan     |   // Row_1
|      Japan    |      USA      |   // Row_2
|     Germany   | Great Britain |   // Row_3
| Great Britain |    Germany    |   // Row_4
---------------------------------

But Row_1 and Row_2 are same as well as Row_3 and Row_4.

What I need is to print either one of Row_1 or Row_2 and either Row_3 or Row_4.

Thank you

I have strong feeling that executing JOIN on two sub-query is unnecessary. Is there any other way I can achieve this? — Harry Cho, Jan 31 '14 at 06:49
Can you provide sample data, not the results of your query, for both tables? — peterm, Jan 31 '14 at 07:14
@peterm I provided sample data in google doc (https://docs.google.com/spreadsheet/ccc?key=0ApnAXm2h6nTIdERZSGdfdXJUdngtTUE3MUpIZkpibFE&usp=sharing). After executing the query, result should be what I posted above. — Harry Cho, Jan 31 '14 at 07:27

score 4 · Accepted Answer · answered Jan 31 '14 at 07:30

4

Try it this way

SELECT DISTINCT
       LEAST(s1.country, s2.country) c1,
       GREATEST(s1.country, s2.country) c2
  FROM battles b1 JOIN battles b2
    ON b1.battlename = b2.battlename
   AND b1.ship <> b2.ship JOIN ships s1
    ON b1.ship = s1.name JOIN ships s2
    ON b2.ship = s2.name
HAVING c1 <> c2

Output:

|      C1 |            C2 |
|---------|---------------|
| Germany | Great Britain |
|   Japan |           USA |

Here is SQLFiddle demo

answered Jan 31 '14 at 07:30

peterm

91,357
15
148
157

Interesting approach the the solution – Mike Brant Jan 31 '14 at 07:36
I wasn't the question asked. I just thought the use of LEAST/GREATEST was a novel approach here. – Mike Brant Jan 31 '14 at 19:27
@MikeBrant I know. I'm sorry if it was confusing. My question was meant for OP. And thank you for your opinion :) – peterm Jan 31 '14 at 20:42
@peterm Thanks! It works like a charm!! Now I need to understand your code. Thanks again. – Harry Cho Feb 01 '14 at 03:35
1

@HarryCho You're very welcome. :) It's quite simple actually. The query uses self join on `battles` and two separate join to grab country names from both sides of the self join. Now before returning distinct set it presorts values of both countries across columns by putting lesser value in the left column and greater value in the right column by using handy for such case `LEAST()` and `GREATEST()`. – peterm Feb 01 '14 at 21:06
@peterm what if I have three or more columns? – J3STER May 02 '19 at 23:27
`select count(*), c1, c2 from myTable group by concat(least(c1,c2),greatest(c1,c2))` – Sam Barnum Jul 16 '20 at 18:06

score 0 · Answer 2 · answered Jan 31 '14 at 07:14

Here is how you can do it

Sample data

| COL1 | COL2 |
|------|------|
|    a |    b |
|    b |    a |
|    c |    d |
|    e |    f |

Query

SELECT
  k.*
FROM test k
  LEFT JOIN (SELECT
               t.col1
             FROM test t
               INNER JOIN test r
                 ON (r.col1 = t.col2
                     AND t.col1 = r.col2)
             LIMIT 1) b
    ON b.col1 = k.col1
WHERE b.col1 IS NULL

OUTPUT

| COL1 | COL2 |
|------|------|
|    a |    b |
|    c |    d |
|    e |    f |

SQL Fiddle Demo

this solution might not work for these input rows ('a','b'),('b','a'),('c','d'),('a','d'), where expected output would be ('a','b'),('c','d'),('a','d') — Dipendu Paul, Jan 31 '14 at 07:58

score 0 · Answer 3 · answered Jan 31 '14 at 08:23

That's an interesting question, looks simple but is tricky. I have tried it on SQL Server. Here is my query, assuming input table 'test' contains distinct rows:

| COL1 | COL2 |
|------|------|
|    a |    b |
|    b |    a |
|    c |    d |
|    a |    e |

SELECT t1.col1, t1.col2
FROM test t1
EXCEPT
SELECT t1.col1, t1.col2
FROM test t1
INNER JOIN test t2
ON t1.col1 = t2.col2 AND t1.col2 = t2.col1
AND t1.col1 > t1.col2

Please replace it with analogous MySQL query if it does not work verbatim. Let me know if this worked for you.

You should provide MySQL query since that is the context of the question and MySQL doesn't support `EXCEPT`. — Mike Brant, Jan 31 '14 at 19:26

Show distinct tuples regardless of column order

3 Answers3

SQL Fiddle Demo

Linked