0

I'm very noob when it comes to SQL and MySQL. I've discovered self-joining as a way to do comparisons of "x" where column A is the same e.g.

Tab1:

id    colA    x
-----------------
1     q1      0.1
1     q2      0.5
1     q3      0.7
2     q1      0.4
2     q2      0.9
2     q3      1.3
3     q1      0.2
3     q2      0.4
3     q3      0.1    

QUERY:

select a.`colA`, a.`x`, b.`x`, c.`x`
from `tab1` as a
join `tab1` as b on a.`colA` = b.`colA`
join `tab1` as c on a.`colA` = c.`colA`
where a.`id` = 1 and b.`id` = 2 and c.`id` = 3

Gives me:

colA    x      x      x
q1      0.1    0.4    0.2
q2      0.5    0.9    0.4
q3      0.7    1.3    0.1

Brilliant! I also just discovered that MySQL isn't capable of full outer joins, so using left, right or plain old join will give me the MINIMUM rows, i.e. if any colA values are missing for an id then they will be completely missing from the results, indeed if I take away q1 and q2 in any of the ids above I get one row.

q3      0.7    1.3    0.1

I've seen quite a few links on how to use UNION / UNION ALL to emulate a full outer join which I could frankenstein but these don't give me results as a nice comparison table. Is there not an easier way to tell the query during a SELF join to NOT ignore null values? I feel like this should be easy but have been scratching my head for hours and have googled so many answers I don't think I can see the wood because of all the trees.

jamheadart
  • 5,047
  • 4
  • 32
  • 63
  • 1
    Possible duplicate of [MySQL pivot table](https://stackoverflow.com/questions/7674786/mysql-pivot-table) and if it needs to be dynamic: https://stackoverflow.com/questions/12004603/mysql-pivot-row-into-dynamic-number-of-columns – xQbert Sep 13 '17 at 15:05
  • Inner join, left join, full join, union, restrict etc are operators on tables. A query is an expression tree. It is not helpful to think of a query as "being" a variation on an operator, maybe that happens to be at the top of the tree for one of the trees that give what you want. You need to think of how to transform input to output. Here you are not explaining what your query is suposed to return. Your "give me the MINIMUM rows" is not clear, and your description of what you *do* want is not very clear either. (Which seems to be an inner join of three left joins.) Please read & act on [mcve]. – philipxy Sep 16 '17 at 06:07

1 Answers1

3

I'd take a different approach; avoid the full outer join and self joins entirely.

I think what you're after is simulating a pivot in mySQL which can be done with case statements and an aggregate with proper group by.

DEMO: http://rextester.com/NGFNJG9485

SELECT ColA
     , max(Case when ID = 1 then x end) as x1
     , max(Case when ID = 2 then x end) as x2
     , max(Case when ID = 3 then x end) as x3
FROM tab1
GROUP BY colA

Giving us:

+----+------+-----+-----+-----+
|    | ColA | x1  | x2  | x3  |
+----+------+-----+-----+-----+
|  1 | q1   | 0,1 | 0,4 | 0,2 |
|  2 | q2   | 0,5 | 0,9 | 0,4 |
|  3 | q3   | 0,7 | 1,3 | 0,1 |
+----+------+-----+-----+-----+

or if we eliminate the data point for ID 3, colA 'q2':

+----+------+-----+-----+------+
|    | ColA | x1  | x2  |  x3  |
+----+------+-----+-----+------+
|  1 | q1   | 0,1 | 0,4 | 0,2  |
|  2 | q2   | 0,5 | 0,9 | NULL |
|  3 | q3   | 0,7 | 1,3 | 0,1  |
+----+------+-----+-----+------+

This assumes

  • you will add more Q#'s not more IDs. Put another way, we don't need to add more columns as ID will always be 1,2,3. If you do need more columns (Id) simply add additional cases; or use dynamic SQL as mentioned in top level comment 2nd link.
  • ID and colA are unique. If duplicates exist then we would need to discuss what value of x should be presented. An average? sum? what? otherwise, min/max/avg would all work.

This handles the nulls as needed since max of column only having null is null we could also coalesce() the result to show 0 if preferred over null. For example:coalesce(max(Case when ID = 3 then x end),0) as x3

Giving us:

+----+------+-----+-----+-----+
|    | ColA | x1  | x2  | x3  |
+----+------+-----+-----+-----+
|  1 | q1   | 0,1 | 0,4 | 0,2 |
|  2 | q2   | 0,5 | 0,9 | 0,0 |
|  3 | q3   | 0,7 | 1,3 | 0,1 |
+----+------+-----+-----+-----+
xQbert
  • 34,733
  • 2
  • 41
  • 62
  • This all seems amazing. I think I was asking the wrong question really, seems I need to be emulating a pivot table. I don't have the time to go over properly just this second but I wanted to say thanks and initial thoughts are that unfortunately the ID count will always be rising so I'm wondering if I can create a stored procedure to handle the ID cases... other than that I can't wait to try it out. – jamheadart Sep 13 '17 at 15:20
  • Bluefeet is the Dynamic Pivot King. A great dynamic example (2nd link in top comments) As for asking the wrong question: that's something we always look out for it's called the [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). Instead of solving the symptom your asking about; we want to address the root cause. You were going down the wrong path in my opinion making the solution harder and harder; when what you really needed (i thought) was a pivot in mySQL (I didn't know if it was dynamic or static till your last comment) – xQbert Sep 13 '17 at 15:47