0

We have two mysql sql RDS servers and we are facing an issue which i am not able to resolve. The query that we are using is a simple selection of max date row from the data set. This is working fine on the first server, where as this is not working correctly on the other server even on the same data set.

The query we are using is below:

SELECT `name`,last_update_date FROM 
(SELECT * FROM `table1` ORDER BY last_update_date DESC) X  GROUP BY `name`;

I am not sure why this is happening. I checked for the global variables on both the servers and found the below variables which are not present in the new server:

binlogging_impossible_mode
innodb_additional_mem_pool_size
innodb_mirrored_log_groups
innodb_use_sys_malloc
simplified_binlog_gtid_recovery
sql_log_bin
storage_engine
thread_concurrency
timed_mutexes

Any help is appreciated.

user3454116
  • 184
  • 1
  • 12
  • You need to define "working correctly" as that query seems needlessly overcomplicated. – Uueerdo Dec 22 '16 at 19:14
  • You're misusing the notorious MySQL nonstandard extension to GROUP BY. https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html That extension *just loves* to give you unpredictable results. – O. Jones Dec 22 '16 at 19:16
  • Instead of comparing variables you should probably compare execution plans generated by EXPLAIN on both servers. http://dev.mysql.com/doc/refman/5.7/en/explain.html – Sergey Kovalev Dec 22 '16 at 19:30
  • @O.Jones I read the doc and it does provide a good explanation to the logic why it gives unpredictable results. So is this a bug or a notorious non standard extension. – user3454116 Dec 22 '16 at 19:38
  • @SergeyKovalev I checked the explain plan and on the server which gives the correct result I get 'Primary and Derived' select type, where as on the server where my result in not correct it just gives 'Simple' select type. This is where i want to understand the reason because everything is same. – user3454116 Dec 22 '16 at 19:41
  • @user3454116 it is not a bug, it just how the non-standard feature behaves (less than predictably); which is why many RDBMSes do not support that kind of feature. I believe the intent, which is still a good one in my opinion, was to support situations such as `SELECT a.*, SUM(b.somefield) FROM a JOIN b ON a.pkid = b.a_pkid GROUP BY a.pkid` where you are expected to only have one value for any ungrouped, non-aggregated field in the SELECT list for any "group"... any other uses are generally abuses. – Uueerdo Dec 22 '16 at 20:54
  • @Uueerdo thanks for the explanation of the working way. Do you have any inputs as to why the same query showed different select type in the explain plan. – user3454116 Dec 22 '16 at 22:36
  • My simplest guess would be the servers are different versions of MySQL, and different versions may interpret and optimize queries differently; especially in cases where the queries have behavior that is not well defined (such as queries that use features that do not guarantee consistent results). – Uueerdo Dec 22 '16 at 23:24
  • @Uueerdo I checked and your guess is bang on. One of the servers is 5.6.23 while the other is 5.7.16, which may the reason for the inconsistent behavior. – user3454116 Dec 22 '16 at 23:54

1 Answers1

4

This is not the correct way to get the max date for each name, and there's no reason to expect consistent results from it. The correct way is:

select name, MAX(last_update_date) as last_update_date
FROM table1
GROUP BY name

The fact that it seems to be working on one server is just coincidence, and it might not last.

If you want to get the entire row that contains the max date, see SQL Select only rows with Max Value on a Column

Community
  • 1
  • 1
Barmar
  • 741,623
  • 53
  • 500
  • 612