5

I'm querying on this table:

SKU
aaaa
bbbb
bbbb
NULL

Here's the query:

select *
from TEST as N                                      
where  N.SKU NOT IN (select SKU
                     from TEST
                     group by SKU
                     having count(*)>1);

I expect the query returns 'aaaa', however, it returns nothing.

The reason I expect that is because the subquery below only returns 'bbbb':

select SKU
from TEST
group by SKU
having count(*)>1   

Therefore, 'aaaa' NOT IN the subquery result.

To show the bug please copy and paste these statements in your MySQL IDE to create schema:

drop table if exists TEST;
create table TEST(
SKU varchar(255)
);
insert into TEST values('aaaa'),('bbbb'),('bbbb'),(NULL);
lc.
  • 113,939
  • 20
  • 158
  • 187
Shawn
  • 61
  • 6
  • 1
    Reproduced in SQL Fiddle: http://sqlfiddle.com/#!9/769ffa/8 – lc. Feb 27 '18 at 03:42
  • And for the record SQL Server behaves as one might expect (unless I'm missing something too): http://sqlfiddle.com/#!18/0e50d/1 – lc. Feb 27 '18 at 03:43
  • 1
    Ok and so apparently it's something with the NULL in the table. Removing the NULL returns the 'aaaa' row: http://sqlfiddle.com/#!9/93efbf/1 – lc. Feb 27 '18 at 03:44
  • 2
    https://stackoverflow.com/a/129151/491243 – John Woo Feb 27 '18 at 03:46
  • @lc. thanks for the effort of editing and reproducing in different sql flavors. I was just wondering, since NULL has been discarded by attaching having clause. why does it still affect the final result. very confusing. – Shawn Feb 27 '18 at 03:51
  • 1
    @John Woo subquery doesn't return NULL, so NOT IN is not able to compare against NULL in this case. – Shawn Feb 27 '18 at 03:54
  • 1
    @Shawn No problem and I honestly don't know. Replacing the subquery with a list of string constants `NOT IN ('bbbb')` *does* work, so it's something the optimizer is probably doing internally. – lc. Feb 27 '18 at 04:02
  • @lc. True. Maybe the optimizer somehow pulls the having clause out of subquery, that way it returns nothing for sure, but I'm just guessing. I'll keep an eye on this post, see if anyone else will make a follow-up. – Shawn Feb 27 '18 at 04:08
  • 1
    Maybe you could ask https://dba.stackexchange.com/ ; that's an interesting problem. – Déjà vu Feb 27 '18 at 09:26
  • 1
    @Ring Ø good suggestion. https://dba.stackexchange.com/questions/198945/mysql-bug-not-in-subquery-using-group-by-having-returns-nothing – Shawn Feb 27 '18 at 18:38

1 Answers1

0

This works OK, if subquery returns only one row:

SELECT * 
FROM TEST as N 
WHERE N.SKU NOT IN (
  CAST((select SKU from TEST group by SKU having count(*)>1) AS CHAR)
)

However, if it returns more then 1 row, error will be triggered. Another solution (if the subquery returns more than 1 row):

SELECT * 
FROM TEST as N 
WHERE N.SKU IN (
  (select SKU from TEST group by SKU having count(*) <= 1)
)
Boolean_Type
  • 1,146
  • 3
  • 13
  • 40