Exclude duplicates in mysql 8.0

Question

before upgrading to MySQL 8.0 I used to use GROUP BY to do not select duplicates but now I'm getting:

1055 - Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'database.orders.ID' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by, Time: 0.028000s

I know that I could change mysql mode as follows:

mysql -u root -p
mysql > SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));

But I would like to keep the current mode but fix the problem.

This MySQL query Works:

SELECT Country FROM orders GROUP BY Country


+---------+
| Country |
+---------+
|  USA    |
|  UK     | 
+---------+

But when I add another MySQL column such as ID:

mysql> SELECT Country,ID FROM orders GROUP BY Country;

The error appears, as someone suggest before I have to add all select columns to GROUP BY so the query becomes

SELECT Country,ID FROM orders GROUP BY Country, ID

But that doesn't remove the duplicates and it shows all the countries.

Expected Result is:

mysql> SELECT Country,ID,userID FROM orders GROUP BY Country;

+------+---------+------------+
| ID   | Country | UserID     |
+------+---------+------------+
|   51 | USA     | 99         |
|   61 | UK      | 12         |
+------+---------+------------+

Note: columns names are just for simplicity and in my case each "Country" has same "UserID"

You need to tell us _which_ `ID` value you want to show for each country, because most likely a given country would have more than one unique `ID` value. This conundrum is the source of the error in MySQL. As to why you are seeing this suddenly, it is because the new version of MySQL you are using is running in a different server mode. — Tim Biegeleisen, Mar 11 '19 at 13:06
for remove duplicates you should use distinct and not group by — ScaisEdge, Mar 11 '19 at 13:10
Aha, thanks for declaring that, Now I can understand what the problem is, for me it doesn't really matter. how can I tell mysql to give me the last matched row. — Jennifer, Mar 11 '19 at 13:12
@scaisEdge I tried with distinct but adding another column doesn't remove the duplicates since the country and the ID in a row are unique. — Jennifer, Mar 11 '19 at 13:14
the duplicate .. in sql is evaluated for rows .. (for each values in a row).. show waht do you mean for duplicated? update your question add a proper data sample and the expected result — ScaisEdge, Mar 11 '19 at 13:16
@forpas it's just an example, in my case the `Country` is a transaction hash and the ID is a userID. there are some other columns that I don't need in this query so I need to get only non-duplicates. — Jennifer, Mar 11 '19 at 13:19
@Jennifer So for each country there are multiple userids, right? You want as a result for each country 1 userid?!!! Which userid? — forpas, Mar 11 '19 at 13:27
No, it's just an example, for each country it's the same userID — Jennifer, Mar 11 '19 at 13:31
Just select `min(ID)` or some other aggregation: problem solved. — trincot, Mar 11 '19 at 13:32
@trincot can you explain more, I don't want just to solve the problem without understanding. why using min(ID) solves the problem? — Jennifer, Mar 11 '19 at 13:33
The error message explains it all. Your `id` is not in the `group by` clause, so either it must be functionally dependent (which it isn't according to your db schema) or it must be an aggregation. So go for the aggregation. — trincot, Mar 11 '19 at 13:36
@trincot You are right sir, please post that as an answer since it solves my problem and I do understand what's the problem. Thanks for help guys — Jennifer, Mar 11 '19 at 13:45

score 3 · Accepted Answer · answered Mar 11 '19 at 13:44

You need to modify your SELECT clause so it uses: SELECT Country,ANY_VALUE(ID)

A more detailed information on why this happens is present on the MySQL guide.

To simplify the reason, you are groupping only by Column A, but you also want to SELECT Column B values. During the groupping process, MySQL will collapse all duplicated rows until the unique Column A values remain. However, for a single Column A value, there might be multiple different Column B values. MySQL doesn't know their priority and in order to collapse these multiple values into a single, scalar value, you need to tell it which value from the multiple Column B values you want to select.

Either the MIN() one, the MAX() one or ANY_VALUE() if you are sure that Column B values are duplicated anyway and any one of them will work. In prior versions of MySQL, the engine would silently ignore such discrepancies and apply the ANY_VALUE() logic by default.

Thanks for your answer, ANY_VALUE what I was looking for, since the ID and other columns aren't important in my case. — Jennifer, Mar 11 '19 at 13:49

Exclude duplicates in mysql 8.0

1 Answers1