1

Can anyone explain Oracle's limitations as far as why the following statement works in MySQL, but receives a "not a GROUP BY expression" in Oracle?

  SELECT order1.user_id, 
         order1.order_datetime, 
         SUM(order2.order_total)
    FROM order_table order1 
    JOIN order_table order2 ON order1.user_id = order2.user_id
GROUP BY order1.user_id

Is it because Oracle doesn't know how to handle the order_datetime column? Can't it just return the column result from whichever row it receives from the GROUP BY order1.user_id row, as it does in MySQL?

EDIT:

I understand that all columns should be in the group by, however I'm trying to understand why Oracle doesn't return a similar result as MySQL does (whereas MySQL doesn't require each GROUP BY, and Oracle does).

Mike
  • 2,334
  • 1
  • 23
  • 27
  • 3
    Because MySQL doesn't like kittens. – Kermit Feb 27 '13 at 18:22
  • 4
    Because MySQL thinks it's ok to just pick any arbitrary `order1.order_datetime`. For what it's worth that wonky syntax is not legal in other serious database platforms either. Just because MySQL is lax about syntax does not make it right. – Aaron Bertrand Feb 27 '13 at 18:25

3 Answers3

12

Oracle is actually performing the correct behavior. When you are using a GROUP BY the items in the select list must appear in the GROUP BY or in an aggregate function.

SELECT order1.user_id, 
         order1.order_datetime, 
         SUM(order2.order_total)
    FROM order_table order1 
    JOIN order_table order2 ON order1.user_id = order2.user_id
GROUP BY order1.user_id, order1.order_datetime

MySQL uses a EXTENSION TO GROUP BY which allows the behavior of not enforcing the FULL GROUP BY. Using this in MySQL does not guarantee what the value of the order1.order_datetime will be, MySQL just selects one value and the result can be unexpected.

You either need to use a GROUP BY or aggregate on all items in the SELECT list (similar to above) or you have to rewrite the query. You can use any of the following:

SELECT order1.user_id, 
         min(order1.order_datetime) order_datetime, 
         SUM(order2.order_total)
    FROM order_table order1 
    JOIN order_table order2 ON order1.user_id = order2.user_id
GROUP BY order1.user_id

Which applies an aggregate to the order_datetime, then you do not have to group by the date.

You can use sum() over():

SELECT order1.user_id, 
         order1.order_datetime, 
         SUM(order2.order_total) over(partition by order1.user_id) order_total
FROM order_table order1 
JOIN order_table order2 ON order1.user_id = order2.user_id

Or this can be rewritten using a subquery.

SELECT order1.user_id, 
     order1.order_datetime, 
     order2.order_total
FROM order_table order1 
JOIN
(
    select SUM(order_total) order_total, user_id
    from order_table 
    group by user_id 
) order2
    ON order1.user_id = order2.user_id
Taryn
  • 242,637
  • 56
  • 362
  • 405
  • bluefeet - Thank you for the detailed response. Is it still possible to use a group by to only retrieve 1 record per user, with the order_total from order2? – Mike Feb 27 '13 at 18:19
  • @Mike the final query that I provided gives you the total orders for each user. If you do not want to use a subquery and you do not want to `group by order_datetime`, then you can use an aggregate function around that column. – Taryn Feb 27 '13 at 18:22
  • 6
    @Mike, You can use an aggregate function, for example, MIN(order1.order_datetime) to indicate that you want the first date time. Unfortunately, it is not correct to ask about groups of items and not to specify an aggregate which determines how to select the group attributes. If you had groups of men and women, it would not be correct to ask what was the age of the group of men and the age of the group of women, you might want the average, the min or the max, or the total, but the "age" of the group is meaningless. – Cade Roux Feb 27 '13 at 18:23
  • your first query here may generate diferent results form the expected, because `group by order1.user_id, order1.order_datetime` for some dataset generates diferent groups that `group by order1.user_id` are you sure that there aren't two diferents `order_datetime` for a given `user_id` ??? – Saic Siquot Feb 27 '13 at 18:32
  • @LuisSiquot My first query was showing how to resolve the error. If they have multiple `order_datetime` then yes the result will be skewed. That is why you would have to look at either using one of the alternate versions or applying an aggregate `min/max` to the datetime column. – Taryn Feb 27 '13 at 18:35
  • anwsers that "resolve the error" asumes that they "respect the problem" OP is migrating from one RDBMS to other, and this may introduce an error. – Saic Siquot Feb 27 '13 at 18:39
  • 3
    @LuisSiquot it's also up to the OP (who accepted the answer, by the way) to *test* the query and make sure it returns the expected results. Not sure why you're berating bluefeet about this. – Aaron Bertrand Feb 27 '13 at 18:42
  • 2
    @LuisSiquot I think its pretty clear that the first query "solves the error" and the rest of the versions should be used instead. So, what could be the problem with an answer that is as complete as this one? – Lamak Feb 27 '13 at 18:44
  • 3
    @LuisSiquot I don't see any mention of migration in the question. While that is plausible, we don't know. I only see ***"Can anyone explain Oracle's limitations as far ..."***. The answer explains very well that is not a limitation of Oracle but a MySQL extension that misbehaves when not used properly. So, why should anyone suggest a query that copies a misbehaving query (a query that probably returns wrong or in the best case semi-random results)? – ypercubeᵀᴹ Feb 27 '13 at 18:48
  • 3
    @LuisSiquot -- I am not migrating. I am simply trying to understand the differences (or in your words, "respect the problem"), as I work in an environment that uses both databases. blufeet not only "resolved the error", he also explained the difference between MySQL and Oracle's GROUP BY. – Mike Feb 27 '13 at 18:49
  • 1
    @bluefeet My apologies. SHE knows her stuff. Thanks again! (And thanks for the correction Aaron) – Mike Feb 27 '13 at 18:56
  • 1
    I wanted a clear answer because I have seen may times lazy developers adding **wrong/useless** fields to group by clauses just becouse they think "nothing happens, and problem is solved" my excuses and my up vote too. – Saic Siquot Feb 27 '13 at 19:10
0

In oracle it is required to specify all columns in Select in group by, if you want to group by only one column and not all try this

SELECT order1.user_id, 
         order1.order_datetime, 
SUM(order2.order_total) OVER (PARTITION BY order1.user_id) order_total
    FROM order_table order1 
    JOIN order_table order2 ON order1.user_id = order2.user_id
rs.
  • 26,707
  • 12
  • 68
  • 90
0

See my related question here
Mysql asumes "any" value for non grouping funcions/fields while Oracle enforces the need of a grouping function or a group by field

From Mysql manual:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.

Community
  • 1
  • 1
Saic Siquot
  • 6,513
  • 5
  • 34
  • 56