0

I am trying to do a simple select query that will filter out duplicate data just like what was possible in mysql versions below 5.6.

I have done a lot of research online and i have tried everything that i have found but i still don't seem to get the gist.

CREATE TABLE Test (
  id int(11) unsigned NOT NULL AUTO_INCREMENT,
  Comment text,
  commentOnId int(11) DEFAULT NULL,
  Time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

insert into test (comment, commentonid) 
VALUES
('My first comment', null),
    ('reply 1',1),
    ('reply 2',1);`

select id, comment, commentonid from Test group by commentonid;
select id, comment, distinct commentonid from Test group by commentonid;
select id, comment, distinct commentonid from Test;

select id, comment, ANY_VALUE(commentonid) as unique from Test group by unique;
select id, comment, distinct ANY_VALUE(commentonid) from Test;`

`select id, comment, MIN(commentonid) from Test;

Every single one of the select queries fail or at the very least don't get the desired result.

I want to get a unique commentOn field as there are two with the value of 1. What is the correct select statement to use in order to get unique data in mysql 8.0?

Barmar
  • 741,623
  • 53
  • 500
  • 612
Jevon
  • 295
  • 2
  • 13
  • Are you saying that these queries work in 5.6 but not in 8.0? What errors do you get? What have you tried to resolve the problem so far? Which aspects of the documentation back up your solution? – Lightness Races in Orbit Oct 09 '19 at 23:06

2 Answers2

0

You're using the aggregation function on the wrong column. You don't use ANY_VALUE() on the column you're grouping by, you use it on all the other columns that aren't uniquely determined by that column.

SELECT ANY_VALUE(id) AS id, ANY_VALUE(comment) AS comment, commentonid
FROM Test
GROUP BY commentonid

Note that this doesn't guarantee that id and comment will come from the same row. You could get

1  reply 2  1

as the result. If you want to get consistent a consistent row, you normally should specify some criteria for which row you want in the group, such as the highest id. See SQL select only rows with max value on a column for how to write such queries.

Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Barmar i have one more quick question. How would i do a double group by in mysql 8. If i added another column to the insert and name it PosterId or SharedCommentid, How would i ensure that both commentonid and (PosterId or SharedCommentid) have unique data? – Jevon Oct 15 '19 at 11:05
  • `GROUP BY commentonid, posterid` just like you always have – Barmar Oct 15 '19 at 16:21
0

Thank you Barmar for your answer. I like how you chose the simplest and easiest to understand way of answering the question and then referred to a link that will give a more specific answer to the question. Using the link you gave me i was able to get an even more elaborate understanding of getting distinct data. to add on to your answer as you noted that your example doesn't guarantee that the different columns will come from the same row.

SELECT id AS aid, comment AS acomment, commentonid as acommentonid
FROM Test as a inner join (SELECT Max(id) AS bid
FROM Test
GROUP BY commentonid) as b ON a.id = b.bid;

An inner join is best useful for a specific criteria as it enables you to do a whole bunch of different functions inside. It will also ensure that the columns will come from the same row as a different row can simply be changed by changing the id as bid in the join.

Jevon
  • 295
  • 2
  • 13