0

I have a large, complex query that combines multiple queries into a single output by using UNION.

It's essentially this:

SELECT *
FROM (SELECT query1.field1,
             subquery.field1,
             GROUP_CONCAT(DISTINCT subquery.field1 SEPARATOR ', ') AS 'Column3',
             NULL AS 'Column4'
      FROM (SELECT table1.field1, table1.field2, subquery1.field1
            FROM table1
            INNER JOIN (SELECT table2.field1
                        FROM table2) AS subquery1 ON table1.field1 = subquery.field1)
      GROUP BY subquery.field1) AS query1
UNION 
SELECT *
FROM (SELECT query1.field1,
             subquery.field1,
             GROUP_CONCAT(DISTINCT subquery.field1 SEPARATOR ', ') AS 'Column3',
             NULL AS 'Column4'
      FROM (SELECT table1.field1, table1.field2, subquery1.field1
            FROM table1
            INNER JOIN (SELECT table2.field1
                        FROM table2) AS subquery1 ON table1.field1 = subquery.field1)
      GROUP BY subquery.field1) AS query2
ORDER BY field1

This is very very overly simplified (there are upwards of 15 subqueries in the whole thing). It IS returning the results set I want. If I split out each group and run the code for just "query1" or "query2", I get the correct results. The problem I'm running into is when I UNION them into a single results set.

Essentially, I want a set returned for Query1 and a set returned for Query2. But I want these grouped by person (the identifier is field1), so that each row belongs to only a single person and displays information in the columns for both query 1 AND query 2.

What I'm getting right now is the list of people, and if they meet both criteria, they get 2 lines. One for appearing in query1 and one for appearing in query2. Some people will appear in both, some people will only appear in one or the other.

I want to see:

column 1 | column 2 | column 3 | column 4
field 1  | field 2  | field 3  | field 4

What I'm getting instead is:

column 1 | column 2 | column 3 | column 4
field 1  | field 2  | field 3  | NULL
field 1  | field 2  | NULL     | field 4

I can ORDER BY field1 so that it places these two lines together, but if I GROUP BY field1, I get a syntax error for unknown column.

I don't understand how it's a known column for ORDER BY but unknown for GROUP BY.

Ankit Bajpai
  • 13,128
  • 4
  • 25
  • 40
Mandi Kaye
  • 31
  • 3
  • At first seight i would have to say that your query is much more complex then it has to be.. Well GROUP_CONCAT can be used but i think you will have to use nested SUBSTRING_INDEX to parse out the comma separated string.. But most of the time pivotting is easier with SUM()/MAX()/MIN() functions. – Raymond Nijland Aug 26 '19 at 17:15
  • 1
    Take a look into [MySQL pivot table](https://stackoverflow.com/questions/7674786/mysql-pivot-table) which is what you seams to want here... if that does not help see [Why should I provide a Minimal Reproducible Example for a very simple SQL query?](https://meta.stackoverflow.com/questions/333952/why-should-i-provide-a-minimal-reproducible-example-for-a-very-simple-sql-query) – Raymond Nijland Aug 26 '19 at 17:19

0 Answers0