Group By and Order By for SQL query

Question

I'm trying to generate a report that shows how many Expense Reports (ERs) and the total of those reports for each employee for FY18. The result set I have now has several duplicate UIN numbers--which is the identifying ID number for employees. There should only be one record for each UIN. I was thinking it had to do with the Group By and Order By clauses, but not sure. Can someone point out the problem area? Here's the SQL:

select x.er_employee_number "UIN", 
       x.er_user_full_name, 
       x.er_user_location, 
       count(X.ER_DOCUMENT_ID) "Number of ERs", 
       sum(X.ER_TOTAL_EXPENSE) "ER Total"
from INFOR.ALER_EXPENSE_EXPORT_HEADER@TEMPROD x
where X.EXPORT_HDR_STATUS like '4' 
  and X.ER_PURPOSE like 'Employee%'
  and NEW_TIME(X.EXPORT_HDR_STATUS_DATETIME, 'GMT', 'CST') > to_date('1-Jul-2017')
  and NEW_TIME(X.EXPORT_HDR_STATUS_DATETIME, 'GMT', 'CST') < to_date('30-Jun-2018')
group by x.er_employee_number,
         x.er_user_full_name,
         x.er_user_location,
         X.ER_DOCUMENT_ID,
         X.ER_TOTAL_EXPENSE;

I believe your problem is with your `group by` statement. You should only `group by` fields not including in aggregate functions. So in your case, `group by x.er_employee_number, x.er_user_full_name, x.er_user_location` should work if those all return distinct values. — sgeddes, Aug 14 '18 at 18:27
Show us db schema, sample data, current and expected output. Please read [**How-to-Ask**](http://stackoverflow.com/help/how-to-ask) And here is a great place to [**START**](http://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/) to learn how improve your question quality and get better answers. [**How to create a Minimal, Complete, and Verifiable example**](http://stackoverflow.com/help/mcve) Try create a sample in http://rextester.com — Juan Carlos Oropeza, Aug 14 '18 at 18:37

score 1 · Answer 1 · answered Aug 14 '18 at 19:15

It seems you know that when a GROUP BY clause is present in a query, any column in the select list must be in either:

A - an aggregate function
B - the GROUP BY list

You may not fully understand the behaviour of the GROUP BY clause when multiple columns are specified. This answer explains it pretty well. Essentially, the query is creating groups wherein each record contains the same x.er_employee_number, x.er_user_full_name, x.er_user_location, X.ER_DOCUMENT_ID, AND X.ER_TOTAL_EXPENSE. This is likely what is causing your duplicates.

As is, you can remove X.ER_DOCUMENT_ID and X.ER_TOTAL_EXPENSE from the GROUP BY list, as they are already used in aggregate functions.

Perhaps adding a function like MAX to both x.er_user_full_name and x.er_user_location, and leaving only x.er_employee_number in the GROUP BY list will get you the results you need. Note that the MAX function will effectively do nothing, so long as both the x.er_user_full_name and x.er_user_location are unique and not null per each x.er_employee_number See below:

select x.er_employee_number "UIN", 
   MAX(x.er_user_full_name), 
   MAX(x.er_user_location), 
   count(X.ER_DOCUMENT_ID) "Number of ERs", 
   sum(X.ER_TOTAL_EXPENSE) "ER Total"
from INFOR.ALER_EXPENSE_EXPORT_HEADER@TEMPROD x
where X.EXPORT_HDR_STATUS like '4' 
  and X.ER_PURPOSE like 'Employee%'
  and NEW_TIME(X.EXPORT_HDR_STATUS_DATETIME, 'GMT', 'CST') > to_date('1-Jul-2017')
  and NEW_TIME(X.EXPORT_HDR_STATUS_DATETIME, 'GMT', 'CST') < to_date('30-Jun-2018')
group by x.er_employee_number

Group By and Order By for SQL query

1 Answers1