Select all but ignore duplicates

Question

I have a mysql table in which one field will hold duplicates. I am trying to select all but ignore all rows where a duplicate exists in this field.

So if for example I have 10 rows in total, and 3 of them have duplicates I like to return 8 rows. The 7 that were unique and 1 of the 3 duplicates.

I have tried distinct and group by without success. They ignore all 3 duplicates.

Here's what I tried:

SELECT *
FROM directory_listings
GROUP BY url
WHERE status = 'approved'
ORDER BY site_name ASC
LIMIT $start, $per_page

and

SELECT * DISTINCT url
FROM directory_listings
WHERE status = 'approved'
ORDER BY site_name ASC
LIMIT $start, $per_page

I need to return one of the duplicates for completeness. If I have records a, b, b, c, d then I want to return a, b, c, d — Ruf1, Mar 19 '14 at 15:22
@Ruf1 Are all columns the same value for the duplicates, or is it just a duplicate in a certain column? — Alex W, Mar 19 '14 at 15:28
@AlexW - it's just the column 'url' where there could be a duplicate — Ruf1, Mar 19 '14 at 15:30

score 7 · Accepted Answer · answered Mar 19 '14 at 15:44

7

@AlexW - it's just the column 'url' where there could be a duplicate – Ruf1 9 mins ago

Then your first query will work if you correct the syntax - GROUP BY must follow WHERE (per the docs):

SELECT *
FROM directory_listings
WHERE status = 'approved'
GROUP BY url
ORDER BY site_name ASC

Here's an example of a working query in SQL Fiddle.

answered Mar 19 '14 at 15:44

Air

8,274
2
53
88

i also need a similar query to ascertain the number of unique rows. would this be correct or do I use group: "SELECT COUNT(DISTINCT url) FROM directory_listings WHERE status = 'approved'" – Ruf1 Mar 19 '14 at 17:48
That should work fine; see also http://stackoverflow.com/q/4131937/2359271. If you used `GROUP BY` you would be getting separate counts for each group (which can be useful to find duplicates, as in [this example](http://sqlfiddle.com/#!2/eac2f8/10)!) – Air Mar 19 '14 at 18:09

score 1 · Answer 2 · answered Mar 19 '14 at 15:15

1

Your syntax for SELECT DISTINCT is wrong:

http://dev.mysql.com/doc/refman/5.6/en/select.html

Also, the only reason GROUP BY wouldn't work to eliminate duplicates is if the WHERE clause is disqualifying some of the rows (i.e. they are not duplicates in terms of both status and url).

answered Mar 19 '14 at 15:15

Alex W

37,233
13
109
109

+1 for a good point about the `WHERE` clause. IMO this answer would be much improved by giving the correct syntax explicitly instead of only linking the docs. I know when I was starting out with SQL, I found the syntax descriptions in the docs very confusing; and external links can be unstable. – Air Mar 19 '14 at 17:22

Select all but ignore duplicates

2 Answers2