1

hello friends i am working on a query to get a list of registry, but i need get duplicate entries, and preserve only that have more length

Example table:

ID     = INT Auto increment
IDref  = INT not null
UserId = INT ZeroFill

+--------+---------+--------+---------+
| ID     | IDref   | UserId |         |
+--------+---------+--------+---------+
| 1      | 1       | 00001  |         |
| 3      | 1       | 001    |         |
| 4      | 1       | 002    |         |
| 5      | 2       | 00001  |         |
| 6      | 2       | 001    |         |
| 7      | 2       | 002    |         |
| 8      | 3       | 00001  |         |
| 9      | 4       | 002    |         |
+--------+---------+--------+---------+

The Query

SELECT ID, IDref, UserId 
FROM Table 
WHERE UserId = '00001' OR '002' 
GROUP BY IDref 
HAVING count(IDref) > 0 
ORDER BY CHAR_LENGTH(UserId ) DESC 
LIMIT 1

Update I am Geting

+--------+---------+--------+---------+
| ID     | IDref   | UserId |         |
+--------+---------+--------+---------+
| 4      | 1       | 002    |         |
+--------+---------+--------+---------+

i required this output:

+--------+---------+--------+---------+
| ID     | IDref   | UserId |         |
+--------+---------+--------+---------+
| 1      | 1       | 00001  |         |
| 5      | 2       | 00001  |         |
| 8      | 3       | 00001  |         |
| 9      | 4       | 002    |         |//count(IDref) > 0// one registry find.
+--------+---------+--------+---------+

But not work my Query. someone can help me plz.

Last Update with working code Thank Barmar:

SELECT t1.ID, t1.IDref, t1.UserId
FROM Table1 AS t1
JOIN (SELECT IDref, MAX(LENGTH(UserId)) AS maxlength
      FROM Table1
      WHERE UserID IN ('00001', '002')
      GROUP BY IDref) AS t2
ON t1.IDref = t2.IDref AND LENGTH(t1.UserId) = t2.maxlength
WHERE t1.UserId IN ('00001', '002')

ok , but may you see the performance of this query in a table with 20,000 records will be faster?

1 Answers1

0
WHERE UserId = '00001' OR '002'

is not the correct way to test if a column matches either of these values. The correct way is

WHERE UserId IN ('00001', '002')

Your code is equivalent to

WHERE (UserId = '00001') OR '002'

which is always true because '002' is not false.

You also should remove LIMIT 1. Otherwise, you only get one of the IDRef that has duplicates.

ORDER BY LENGTH(UserId) doesn't make the grouped value select the row with the longest length -- ordering is done after grouping, and grouping just selects one of the values arbitrarily. For that, you need to use one of the techniques in SQL Select only rows with Max Value on a Column

SELECT t1.ID, t1.IDref, t1.UserId
FROM Table1 AS t1
JOIN (SELECT IDref, MAX(LENGTH(UserId)) AS maxlength
      FROM Table1
      WHERE UserID IN ('00001', '002')
      GROUP BY IDref) AS t2
ON t1.IDref = t2.IDref AND LENGTH(t1.UserId) = t2.maxlength
WHERE t1.UserId IN ('00001', '002')

There's no need for HAVING COUNT(IDRef) > 0, because count is always more than 0.

DEMO

Community
  • 1
  • 1
Barmar
  • 741,623
  • 53
  • 500
  • 612
  • if of course you're right , the problem is with the "Where IN" and the filter of duplicates and the length characters . becouse it continue showing, because the result is still showing me the less string and not the larger –  Apr 29 '16 at 21:54
  • Updated the answer to show how to get the longest user ID. – Barmar Apr 29 '16 at 22:01
  • i get this error: Unknown column 't2.IDref' in 'on clause' –  Apr 29 '16 at 22:12
  • I couldn't reproduce that error, but I did have some other problems that I've fixed. See the new code. – Barmar Apr 29 '16 at 22:20
  • this work but ok , but may you see the performance of this query in a table with 20,000 records will be faster? –  Apr 29 '16 at 22:36
  • Unfortunately, `LENGTH()` can't use an index, so this query will be slow. I don't think there's any way to do faster, though. – Barmar Apr 29 '16 at 22:39