2

Here is my pseudo query:

SELECT col1, col2 FROM table WHERE number IN(number1,number2,number3);

EXAMPLE

SELECT name, description FROM products WHERE 5 IN(category_id);

NOTE: Let's assume that one of the rows in the table has category_id = 2,5,7

THE PROBLEM

my query works if the row has "5" at the beginning of the comma separated set like this: 5,2,7 BUT same query does not return any result if the row starts with anything other than 5.

I tried various scenario, it always appears to me that when mysql encounters the comma, it is no longer checking for matches further into the comma separated numbers.

I checked everything that made sense such as properly formatted comma separated strings and collation. It is puzzling me at this point. Any solutions?

Please note I used FIND_IN_SET() and that worked flawlessly. But still, I'm not satisfied with ditching IN() without knowing what is the real issue. Why does it stop at the first encounter f a comma.

user2864740
  • 60,010
  • 15
  • 145
  • 220

3 Answers3

3

IN works on a set of values, not a single comma-separated string value. Each separate value supplied to IN is one element in the set.

MySQL offers a non-standard FIND_IN_SET(str, strlist) function which can be used here, although proper relational database design would be to normalize the field.

Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings. A string list is a string composed of substrings separated by “,” characters.

Example:

WHERE FIND_IN_SET('5', category_id)

One issue, besides breaking normalization and throwing referential integrity out the window, is that FIND_IN_SET cannot use indices and is thus not scalable when used as a high-cardinality selector.

See also FIND_IN_SET() vs IN()

Community
  • 1
  • 1
user2864740
  • 60,010
  • 15
  • 145
  • 220
3

Why does it work on the first value only?

In order to compare a string value to a numeric column, mysql converts the single string value to a single number number. When mysql converts a string to a number, it discards everything from the first non-numeric character onwards, eg:

  • "1234,5678" --> 1234
  • "abc5678" --> 0

That's why only the first number is works.

If you change your string value to a list of string values, it would work:

 SELECT name, description FROM products WHERE 5 IN ('2', '5', '7');
Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • 1
    +1. My answer entirely missed the issue of strings being evaluated in a numeric context. The examples in my answer were all based on VARCHAR comparisons. (I still wish there was a sql_mode I could set to have MySQL generate warnings about implicit datatype conversions.) – spencer7593 Sep 29 '14 at 02:37
2

EDIT The expression examples in this answer are based entirely on a string context, that is, comparison of character datatypes. These examples do NOT take into account implicit datatype conversions, and the quirky MySQL semantics of string to numeric conversion, e.g. '2,5,7'+0 evaluates to integer value 2. See the excellent answer from Bohemian for a demonstration of that behavior.)


The row in the table has category value '2,5,7'. That's a string value.

The expression:

'5' IN ('2,5,7') 

Is equivalent to

'5' = '2,5,7'

The commas within the string value are not seen as SQL text, those are characters in the string.

To get the result you are looking for with IN, you'd need an expression like this:

'5' IN ('2','5','7')

That is three separate values, separated by commas that are part of the SQL text. That's equivalent to:

( '5' = '2'  OR '5' = '5' OR '5' = '7' )

To answer the question you asked:

Q: Why does it stop at the first encounter of a comma?

A: It doesn't stop at the first comma. It compares the whole string, as a single string. You'd get the same result with this expression.

'5' IN ('5,2,7') 

That will return FALSE, because it's equivalent to the expression '5' = '5,2,7'`, and the two strings being compared are not equal.

(EDIT: Th example above is based on string comparison. In a numeric context, the string '5,2,7' would evaluate to a numeric value of 5.

In that case, it's still not the IN that's stopping at the first comma, it's the implicit conversion from string to numeric that's "stopping at the first comma". (It's not just the comma, it's any character that's encountered where the string can no longer be converted into a numeric value, and that could be a paren, a '#', a 'b' or whatever.)

Bottom Line: The IN comparison operator doesn't give a rat's @ss about the comma characters within the string. Those are just characters within the string. The commas within a string value are not interpreted as part of the SQL text.

spencer7593
  • 106,611
  • 15
  • 112
  • 140