0

I recently came across in my inherited program a long SQL query that joined 8 tables and 3 views. Using EXPLAIN, I saw it was 7 Unique Key lookups and 4 non-unique key lookups. On average it took 18 seconds to fetch 350 rows (there's a number of reasons for that, each of those 3 views is made up of other views for one) Then, I noticed a GROUP BY tableone.id without any aggregate. I deleted that and now it runs in milliseconds.

Now the confusing part is that I then looked up why MySQL allows a GROUP BY statement without an aggregate function and learned it's actually an optimizing technique (Why does MySQL allow "group by" queries WITHOUT aggregate functions?).

This was obviously NOT the case in my situation. So why is that? When is a dangling GROUP BY a hindrance and not an optimizer?

GMB
  • 216,147
  • 25
  • 84
  • 135
bjk116
  • 531
  • 3
  • 16

2 Answers2

1

The GROUP BY clause, even without an actual aggregate function being used, implies additional processing for the RDBMS, in order to check if some records need to be aggregated. Thus the boost that you are seeing when removing an unnecessary GROUP BY.

The link you shared explains that this somehow loose behavior from MySQL might have been designed as a way to shorten the syntax of aggregate queries where grouping by one field would imply other fields are also being grouped, and possibly as an optimization as well. Anyway this does not properly fit your use case, where you don’t actually need aggregation.

GMB
  • 216,147
  • 25
  • 84
  • 135
0

The use of group by without aggregation function is not more allowed starting from mysql 5.6 (for obvious reasons.

For the versions previos then 5.7 the group by clause work extracting a (causal ) value for all the column not in aggregated function .. this others then an unpredictable result for these columns .. produce the need of work forscan all the rows and extract the result with a degradation of the performance .

ScaisEdge
  • 131,976
  • 10
  • 91
  • 107
  • I forgot to mention, yes I am using MySQL 5.6. In other queries in this program I saw the same thing and deleted the GROUP BY thinking it would speed it up like this one, instead they now run slower. Just curious as to why. I'm not sure I completely understand what you are saying there. – bjk116 Jan 04 '19 at 20:30
  • for each group by the db engine must produce a temp table for store the (not) aggregated result and then after the ful scan for the group by .. use the rsult for the other part of the query needs .. – ScaisEdge Jan 04 '19 at 20:32
  • Um, I think it's the other way around – Strawberry Jan 05 '19 at 08:06