Imagine the following simple table (T) where ID is the primary key:
ID | Column1 | Column2 |
----|---------+----------|
1 | A | X |
2 | A | Y |
Then you write the following query
SELECT ID, Column1, Column2
FROM T
GROUP BY Column1;
This breaks the SQL Standard, and if it were to run without errors (which it would in MySQL), the result:
ID | Column1 | Column2 |
----|---------+----------|
1 | A | X |
Is no more or less correct than
ID | Column1 | Column2 |
----|---------+----------|
2 | A | Y |
So what you are saying is give me one row for each distinct value of Column1
, which both results sets satisfy, so how do you know which one you will get? Well you don't.
For simplicity sake (and the way it is implemented in SQL Server) we state the rule that if an column is not contained in an aggregate function, it must be in the GROUP BY clause for it to appear in the select list. This is not strictly true, the SQL-Standard does allow columns in the select list not contained in the GROUP BY or an aggregate function, however these columns must be functionally dependent on a column in the GROUP BY. From the SQL-2003-Standard (5WD-02-Foundation-2003-09 - page 346) - http://www.wiscorp.com/sql_2003_standard.zip
15) If T is a grouped table, then let G be the set of grouping columns of T. In each contained
in , each column reference that references a column of T shall reference some column C that
is functionally dependent on G or shall be contained in an aggregated argument of a
whose aggregation query is QS.
For example, ID in the sample table is the PRIMARY KEY, so we know it is unique in the table, so the following query conforms to the SQL standard and would run in MySQL and fail in many DBMS currently (At the time of writing Postgresql is the closest DBMS I know of to correctly implementing the standard - Example here):
SELECT ID, Column1, Column2
FROM T
GROUP BY ID;
Since ID is unique for each row, there can only be one value of Column1
for each ID, one value of Column2
there is no ambiguity about
what to return for each row. As far as I know, Postgresql is the only DBMS that has gone anyway to implementing this.
In order for your query to work you would need to add some columns to the GROUP BY
:
GROUP BY wo.workorderid, wo.CREATEDTIME, aau.FIRST_NAME, aac.EMAILID, cd.CATEGORYNAME
However, I think you can remove the issue of duplicates by removing workorder_recipients
from your FROM
, you don't appear to use this anywhere. Removing this reference should remove the need for GROUP BY
SELECT
[Request ID] = wo.WORKORDERID,
[Created on] = wo.CREATEDTIME,
[Requester] = aau.FIRST_NAME,
[From] = aac.EMAILID,
[To] = STUFF((SELECT ', ' + Recipient_email
FROM workorder_recipients wor2
WHERE wor2.Workorderid = wo.Workorderid
AND wor2.To_cc_bcc='To'
FOR XML PATH('')), 1, 2, ''),
[CC] = STUFF((SELECT ', ' + Recipient_email
FROM workorder_recipients wor2
WHERE wor2.Workorderid = wo.Workorderid
AND wor2.To_cc_bcc='CC'
FOR XML PATH('')), 1, 2, ''),
[Category] = cd.CATEGORYNAME
FROM workorder wo
LEFT JOIN ModeDefinition AS mdd
ON wo.MODEID = mdd.MODEID
LEFT JOIN SDUser AS sdu
ON wo.REQUESTERID = sdu.USERID
LEFT JOIN AaaUser AS aau
ON sdu.USERID = aau.USER_ID
LEFT JOIN SDUser AS crd
ON wo.CREATEDBYID = crd.USERID
LEFT JOIN AaaUser AS cri
ON crd.USERID = cri.USER_ID
LEFT JOIN AaaUserContactInfo AS aauc
ON aau.USER_ID = aauc.USER_ID
LEFT JOIN AaaContactInfo AS aac
ON aauc.CONTACTINFO_ID = aac.CONTACTINFO_ID
LEFT JOIN WorkOrderStates AS wos
ON wo.WORKORDERID = wos.WORKORDERID
LEFT JOIN CategoryDefinition AS cd
ON wos.CATEGORYID = cd.CATEGORYID
WHERE
mdd.MODENAME = 'E-Mail'
AND cd.CATEGORYNAME in ('Agent Operational Technology (EMEA/UK/IE)','Client Technology')
AND wo.IS_CATALOG_TEMPLATE='0'
AND wo.CREATEDTIME >= 1416783600000
AND wo.CREATEDTIME <= 1417388399000
AND wo.ISPARENT='1';