1

Anyone can explain behind this logic in MySQL IN clause and help me understand this issue

I have a user table and this table users are belongs to one or many groups. The group table primary key reference is updated in users table by comma(,) separated values as follows

Query 1. SELECT * FROM user;
+---------+-----------+-------------------------+-----------+
| user_id | user_name | user_email              | group_id  |
+---------+-----------+-------------------------+-----------+
|       1 | suresh    | xxxx@yyyyyyyyyy.com     | 22        |
|       2 | sundar    | s7sundera@gmail.com     | 2         |
|       3 | tester    | xxxxxxxx@yyyyyyyyyy.com | 1,2,3,4   |
|       4 | gail      | zzzzzz@gmail.com        | 1,2,3,4,5 |
+---------+-----------+-------------------------+-----------+

If I use IN clause and group id value as 2 in MySQL I got only one result

Query 2. SELECT * FROM user WHERE group_id IN(2)
+---------+-----------+---------------------+----------+
| user_id | user_name | user_email          | group_id |
+---------+-----------+---------------------+----------+
|       2 | sundar    | s7sundera@gmail.com | 2        |
+---------+-----------+---------------------+----------+

If I use IN clause and group id value as (1,2) in MySQL I got three results

Query 3. SELECT * FROM user WHERE group_id IN(1,2)
+---------+-----------+-------------------------+-----------+
| user_id | user_name | user_email              | group_id  |
+---------+-----------+-------------------------+-----------+
|       2 | sundar    | s7sundera@gmail.com     | 2         |
|       3 | tester    | xxxxxxxx@yyyyyyyyyy.com | 1,2,3,4   |
|       4 | gail      | zzzzzz@gmail.com        | 1,2,3,4,5 |
+---------+-----------+-------------------------+-----------+

I want to get group id 2 users like following output but it is not working as expected

If I use this query I need to get query 3 results is it possible?

SELECT * FROM user WHERE group_id IN(2)
gen_Eric
  • 223,194
  • 41
  • 299
  • 337
Sundar
  • 4,580
  • 6
  • 35
  • 61

4 Answers4

3

This is too long to be a comment, but you need to reconsider your current table design. You should not be storing the group_id values as a comma separated list.

Your tables should be structured similar to the following:

create table user
(
    user_id int,  PK
    user_name varchar(50),
    user_email varchar(100)
);

create table groups
(
    group_id int, PK
    group_name varchar(10)
);

create table user_group
(
    user_id int,
    group_id int
);

The user_group table will have a Primary Key of both the user_id and the group_id so you cannot get duplicates and then these columns should be foreign keys to the respective tables. This table will allow you to have multiple groups for each user_id.

Then when you query your tables, the query will be:

select u.user_id, 
  u.user_name,
  u.user_email,
  g.group_id
from user u
inner join user_group ug 
  on u.user_id = ug.user_id
inner join groups g
  on ug.group_id = g.group_id

See SQL Fiddle with Demo.

If you needed to for display purposes show the group_id values in a comma separated list, you can use GROUP_CONCAT():

select u.user_id, 
  u.user_name,
  u.user_email,
  group_concat(g.group_id order by g.group_id) group_id
from user u
inner join user_group ug 
  on u.user_id = ug.user_id
inner join groups g
  on ug.group_id = g.group_id
group by u.user_id, u.user_name, u.user_email

See SQL Fiddle with Demo

If you redesign your tables, then when you search it becomes much easier:

select u.user_id, 
  u.user_name,
  u.user_email,
  g.group_id
from user u
inner join user_group ug 
  on u.user_id = ug.user_id
inner join groups g
  on ug.group_id = g.group_id
where g.group_id in (1, 2)

See SQL Fiddle with Demo

Taryn
  • 242,637
  • 56
  • 362
  • 405
  • This will sounds me clear to understanding the logic. But is it possible to achieve my expectation using any mysql inbuild functions. – Sundar Mar 14 '13 at 15:07
  • 1
    @Sundar You would have to use `FIND_IN_SET` but if possible, you are better off redesigning your tables. – Taryn Mar 14 '13 at 15:09
  • +1 because this is the correct way to store multiple values for a single column! :-) – gen_Eric Mar 14 '13 at 15:21
  • 1
    @RocketHazmat That's why I said in the beginning it was too long for a comment. I only hope they can change the design otherwise it will just lead to more headaches for them. :) – Taryn Mar 14 '13 at 15:23
  • @RocketHazmat this table structure is more comfortable for me in query performance if i normalize this structure the i have to join mapping table and have to do group by option. I need this comma(,) separated values in many places in project so each time i have to generate this comma separate values using GROUP_CONCAT. So i keep this comma separated as it is and generate one more mapping table it's too good for me. – Sundar Mar 15 '13 at 05:07
  • 1
    @Sundar: Whatever works for you, just know that searching a comma separated list in MySQL is *very* inefficient. – gen_Eric Mar 15 '13 at 13:30
2

When passing 1,2 to the IN operator, you're asking for 1 and 2; this is why it will return all three results. If you have a column with comma separated values, you're violating normal form; as each column should not contain more than one value. If you want to find a single value in a multi-valued comma separated column, then you can use FIND_IN_SET.

A normalized schema would look like:

+---------+-----------+-------------------------+
| user_id | user_name | user_email              |
+---------+-----------+-------------------------+
|       2 | sundar    | s7sundera@gmail.com     |
|       3 | tester    | xxxxxxxx@yyyyyyyyyy.com |
|       4 | gail      | zzzzzz@gmail.com        |
+---------+-----------+-------------------------+

+---------+-----------+
| user_id | group_id  |
+---------+-----------+
|       2 | 2         |
|       3 | 1         |
|       3 | 2         |
|       3 | 3         |
|       3 | 4         |
|       4 | 1         |
|       4 | 2         |
|       4 | 3         |
|       4 | 4         |
|       4 | 5         |
+---------+-----------+

+----------+
| group_id |
+----------+
|        1 |
|        2 |
|        3 |
|        4 |
+----------+
Kermit
  • 33,827
  • 13
  • 85
  • 121
  • Thanks for your reply. Sometimes I need to search two groups in this case FIND_IN_SET won't work like " SELECT * FROM user WHERE group_id FIND_IN_SET('2,3', group_id);" It is yielding empty result set – Sundar Mar 14 '13 at 14:59
  • 1
    @Sundar I would recommend that you read [this answer](http://stackoverflow.com/questions/4155873/find-in-set-vs-in) – Kermit Mar 14 '13 at 15:06
  • 1
    @Sundar `FIND_IN_SET` returns you a boolean, you don't do `group_id FIND_IN_SET()`. To search for multiple values, you need OR. `WHERE FIND_IN_SET(2, group_id) OR FIND_IN_SET(3, group_id);` Though storing comma separated lists in MySQL is a bad idea, I suggest using the "link table" instead. – gen_Eric Mar 14 '13 at 15:07
  • 1
    @AarolamaBluenk: `IN(1,2)` is returning three rows because each row is being converted to an `INT`, which means each row is becoming the first value in the list. Those three rows are the ones that start with `1` OR `2`. – gen_Eric Mar 14 '13 at 15:12
  • @RocketHazmat I like your explanation much better. – Kermit Mar 14 '13 at 15:16
  • @AarolamaBluenk: I only know that because of a question I asked (which you linked to ^_^) http://stackoverflow.com/questions/4155873/find-in-set-vs-in – gen_Eric Mar 14 '13 at 15:17
  • @RocketHazmat Let me +1 that too! – Kermit Mar 14 '13 at 15:20
2

MySQL doesn't treat comma separated lists as anything more than just a string. When you do WHERE group_id IN(2), it converts group_id to an INT, so it can compare it with 2.

When casting to an INT, MySQL stops at the first non-number character.

For example, '1,2,3,4,5' IN (2) becomes 1 IN (2). Which is FALSE.

You can try to use FIND_IN_SET to do what you want, but it's not very efficient (because it can't use indexes; it need to read every single row to see if it matches).

WHERE FIND_IN_SET(2, group_id)

To search for multiple rows, use OR.

WHERE FIND_IN_SET(1, group_id) OR FIND_IN_SET(2, group_id)

The correct way to do this, is to create a "link table" that contains one (or more) rows for each user, showing what group(s) they are in.

gen_Eric
  • 223,194
  • 41
  • 299
  • 337
  • @ChrisTravers: I only know this because I had a *similar* issue: http://stackoverflow.com/q/4155873/206403 – gen_Eric Mar 14 '13 at 15:13
1

EXPLANATION

What is the logic of the query SELECT * FROM user WHERE group_id IN(1,2); ?

  • You gave a list of numbers (1,2)
  • The groud_id was being compare numerically
  • Anything that numerically matched 1 or 2 up to the first comma came up as a result

SUGGESTION

What I am about to present to you may seem rather unorthodox but please follow me...

Here is the query that will get every row that has both 1 and 2 in group_ids:

SELECT user.* FROM
(SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
FROM user) U WHERE LOCATE(',2,',group_ids)) U1
INNER JOIN
(SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
FROM user) U WHERE LOCATE(',4,',group_ids)) U2
ON U1.id = U2.id
INNER JOIN user ON user.id = U2.id;

Here is the code create our sample data

DROP DATABASE IF EXISTS sundar;
CREATE DATABASE sundar;
use sundar
CREATE TABLE user
(
    id int not null auto_increment,
    user_name VARCHAR(30),
    user_email VARCHAR(70),
    group_id VARCHAR(128),
    PRIMARY KEY (id)
);
INSERT INTO user (user_name,user_email,group_id) VALUES
('suresh' , 'xxxx@yyyyyyyyyy.com'     ,'22'),
('sundar' , 's7sundera@gmail.com'     ,'2'),
('tester' , 'xxxxxxxx@yyyyyyyyyy.com' ,'1,2,3,4'),
('gail'   , 'zzzzzz@gmail.com'        ,'1,2,3,4,5');
SELECT * FROM user;

Let's create your sample

mysql> DROP DATABASE IF EXISTS sundar;
Query OK, 1 row affected (0.00 sec)

mysql> CREATE DATABASE sundar;
Query OK, 1 row affected (0.00 sec)

mysql> use sundar
Database changed
mysql> CREATE TABLE user
    -> (
    ->     id int not null auto_increment,
    ->     user_name VARCHAR(30),
    ->     user_email VARCHAR(70),
    ->     group_id VARCHAR(128),
    ->     PRIMARY KEY (id)
    -> );
Query OK, 0 rows affected (0.04 sec)

mysql> INSERT INTO user (user_name,user_email,group_id) VALUES
    -> ('suresh' , 'xxxx@yyyyyyyyyy.com'     ,'22'),
    -> ('sundar' , 's7sundera@gmail.com'     ,'2'),
    -> ('tester' , 'xxxxxxxx@yyyyyyyyyy.com' ,'1,2,3,4'),
    -> ('gail'   , 'zzzzzz@gmail.com'        ,'1,2,3,4,5');
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql>

and here is what it looks like

mysql> SELECT * FROM user;
+----+-----------+-------------------------+-----------+
| id | user_name | user_email              | group_id  |
+----+-----------+-------------------------+-----------+
|  1 | suresh    | xxxx@yyyyyyyyyy.com     | 22        |
|  2 | sundar    | s7sundera@gmail.com     | 2         |
|  3 | tester    | xxxxxxxx@yyyyyyyyyy.com | 1,2,3,4   |
|  4 | gail      | zzzzzz@gmail.com        | 1,2,3,4,5 |
+----+-----------+-------------------------+-----------+
4 rows in set (0.00 sec)

mysql>

Again, here is the messy query that will get what you want:

SELECT user.* FROM
(SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
FROM user) U WHERE LOCATE(',1,',group_ids)) U1
INNER JOIN
(SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
FROM user) U WHERE LOCATE(',2,',group_ids)) U2
ON U1.id = U2.id
INNER JOIN user ON user.id = U2.id;

Here it is executed:

mysql> SELECT user.* FROM
    -> (SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
    -> FROM user) U WHERE LOCATE(',1,',group_ids)) U1
    -> INNER JOIN
    -> (SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
    -> FROM user) U WHERE LOCATE(',2,',group_ids)) U2
    -> ON U1.id = U2.id
    -> INNER JOIN user ON user.id = U2.id;
+----+-----------+-------------------------+-----------+
| id | user_name | user_email              | group_id  |
+----+-----------+-------------------------+-----------+
|  3 | tester    | xxxxxxxx@yyyyyyyyyy.com | 1,2,3,4   |
|  4 | gail      | zzzzzz@gmail.com        | 1,2,3,4,5 |
+----+-----------+-------------------------+-----------+
2 rows in set (0.00 sec)

mysql>

OK, how about looking for (2,4) ?

mysql> SELECT user.* FROM
    -> (SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
    -> FROM user) U WHERE LOCATE(',2,',group_ids)) U1
    -> INNER JOIN
    -> (SELECT * FROM (SELECT id,CONCAT(',',group_id ,',') group_ids
    -> FROM user) U WHERE LOCATE(',4,',group_ids)) U2
    -> ON U1.id = U2.id
    -> INNER JOIN user ON user.id = U2.id;
+----+-----------+-------------------------+-----------+
| id | user_name | user_email              | group_id  |
+----+-----------+-------------------------+-----------+
|  3 | tester    | xxxxxxxx@yyyyyyyyyy.com | 1,2,3,4   |
|  4 | gail      | zzzzzz@gmail.com        | 1,2,3,4,5 |
+----+-----------+-------------------------+-----------+
2 rows in set (0.00 sec)

mysql>

Looks like it works.

Give it a Try !!!

Community
  • 1
  • 1
RolandoMySQLDBA
  • 43,883
  • 16
  • 91
  • 132