18

Lets say, we have this table :

+------+------+
| COL1 | COL2 |
+------+------+
|   A  |   B  |
+------+------+
|   B  |   A  |
+------+------+
|   C  |   D  |
+------+------+

I want to count the number of times either letter1, letter2 or letter2, letter1 appears in the two columns.

I want the result :

+------+------+------+
| COL1 | COL2 | COL3 |
+------+------+------+
|   A  |   B  |   2  | 
+------+------+------+
|   C  |   D  |   1  |
+------+------+------+

NOTE: It can be either AB or BA doesn't matter.

I tried:

SELECT
COL1,COL1,COUNT(*) AS COL3
FROM
X
GROUP BY COL1,COL2;

But that gets me :

+------+------+------+
| COL1 | COL2 | COL3 |
+------+------+------+
|   A  |   B  |   1  |
+------+------+------+
|   B  |   A  |   1  |
+------+------+------+
|   C  |   D  |   1  |
+------+------+------+
Arulkumar
  • 12,966
  • 14
  • 47
  • 68
Hamid Y.
  • 185
  • 7
  • Something similar is achieved with a "covariance matrix." However the results would still have to be merged - respectively added up. – AnyOneElse Jul 01 '15 at 10:12

4 Answers4

15

You can do this by swapping the columns if you need to:

SELECT Col1, Col2, COUNT(*)
FROM
(
    SELECT
        CASE WHEN Col1 < Col2 THEN Col1 ELSE Col2 END AS Col1,
        CASE WHEN Col1 < Col2 THEN Col2 ELSE Col1 END AS Col2
    FROM T
) t
GROUP BY Col1, Col2

Fiddle

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
3

Another try

SELECT LEAST(col1, col2) col11, GREATEST(col1, col2) col12 , COUNT(1) FROM X
GROUP BY col11, col12

SqlFiddle

Akhil
  • 2,602
  • 23
  • 36
0

UPDATE: : Use @Damien's answer. Another try.

You can try the below code. Fiddle

   SELECT COL1, COL2,  COUNT(*) AS COL3
   FROM (
    SELECT
    LEAST(COL1,COL2) AS COL1,
    GREATEST(COL1,COL2)  AS COL2
    FROM X
     ) AS Temp
    GROUP BY COL1,COL2;
Community
  • 1
  • 1
sudhansu63
  • 6,025
  • 4
  • 39
  • 52
  • 1
    CTE is SQL Server specific. https://msdn.microsoft.com/en-us/library/ms190766.aspx – rurouni88 Jul 01 '15 at 06:52
  • 1
    @rurouni88 - it's not SQL Server specific. It's standard SQL (Oracle and Postgre also support it). It's just the MySQL doesn't follow the standard here. – Damien_The_Unbeliever Jul 01 '15 at 06:55
  • @Damien_The_Unbeliever: Fair enough. At any rate, it's not supported here. Let's not get me off on a rant. I spent the past week looking at different DB platforms different implementation of RAND() with SEED() :P – rurouni88 Jul 01 '15 at 06:58
  • 1
    Your query seems wrong if OP has different data. See this [**fiddle**](http://www.sqlfiddle.com/#!9/b0e7e/3) – Raging Bull Jul 01 '15 at 07:04
  • @RagingBull you are right, i misinterpreted the question – sudhansu63 Jul 01 '15 at 07:15
  • 2
    you need `LEAST()` and `GREATEST()`, not MIN and MAX. – ypercubeᵀᴹ Jul 01 '15 at 08:26
  • 1
    @sudhAnsu63 You may need to give different alias for your fields. Currently your query is not giving desired result. Check your fiddle itself – Akhil Jul 01 '15 at 09:33
  • 1
    `GROUP BY COL1, COL2` prematurely creates duplicate rows (e.g. one for `a, b` and another for `b, a`). – Salman A Jul 01 '15 at 11:39
-1

See http://sqlfiddle.com/#!9/4bd6a/23

Use if statements and concat the 2 columns.

SELECT
  DISTINCT (CONCAT(C1,C2)) AS permutation, COUNT(1)
  FROM (SELECT
    IF(col1<=col2, col1, col2) as C1,
    IF(col2<col1, col1, col2) as C2
  FROM X) AS T
  GROUP BY permutation
;

Further explanation: The if statements merely order the characters by ASCII value, so regardless of 'AB' or 'BA', it will always be represented as 'AB'

rurouni88
  • 1,165
  • 5
  • 10
  • That may work with one character values, but if the real values were 'AA' and 'BB' this would falsely match 'AAB' and 'B'. – abligh Jul 01 '15 at 08:21