Split column by comma then show all values that appear in all rows?

Question

Table:

Input would be a list of item_id like 2,3,4 and output for that would be all the store_id that are in found for ALL items... so: 102,104

Another example input: 1,2 and output: 102

Does MySQL offer such operations?

First I'd need to do a select of the rows with the input. Then I could potentially put all store_ids from the select into 1 variable and find any store_ids that appear n times (n being the count of the input)?

Why my question is different

I'm not asking whether or not a delimited list is good or bad or anything related to an opinion of that. I'm asking, how do I work with it to get my result?

It is really unclear what you are asking, but it seems like a bad database design — nacho, Jun 02 '19 at 15:55
@User Since you are aware of https://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad, why do you not change the table design as suggested? — Progman, Jun 02 '19 at 17:01
You say you're not asking about whether a delimited list is good or not...but it really is not good, and if the advice is to change your design then that's what you should do. Then it'll be much easier for people to help you using established techniques for working with correctly designed databases, rather than workarounds for bad designs. You've only got this problem because the design is flawed in the first place. — ADyson, Jun 02 '19 at 17:33
It's equivalent to saying "I'm not asking whether it's good to use French in a Spanish shop; I'm asking how do I use French in a Spanish shop in order to purchase the things I want." There's nothing wrong with French, and there's nothing wrong with Spanish; but for best results, don't use French in a Spanish shop. — Strawberry, Jun 02 '19 at 23:41
@Progman because other software uses this table and that's not my task. my task is design a new software with the existing database structure. — User, Jun 02 '19 at 23:42
@Strawberry point is they are not the same, aka not a duplicate question. — User, Jun 02 '19 at 23:44

Bill Karwin · Answer 1 · 2019-06-02T20:58:59.187

You're basically asking to find the intersection of N comma-separated lists, from the set of lists found by looking up N item_id's.

The way you're storing comma-separated lists really does make this harder.

To solve this puzzle, I can think of a way:

Create a table with every store_id.

mysql> create table stores (store_id int primary key);

mysql> insert into stores values (101), (102), (103), (104);

Now you can transform the set of stores in your comma-separated list into a number of rows, based on matching the CSV to the individual values in the stores table.

mysql> select * from mytable as t join stores as s on find_in_set(s.store_id, t.store_id)
  where item_id in (2,3,4);
+---------+-----------------+----------+
| item_id | store_id        | store_id |
+---------+-----------------+----------+
|       3 | 101,102,103,104 |      101 |
|       2 | 102,104         |      102 |
|       3 | 101,102,103,104 |      102 |
|       4 | 102,103,104     |      102 |
|       3 | 101,102,103,104 |      103 |
|       4 | 102,103,104     |      103 |
|       2 | 102,104         |      104 |
|       3 | 101,102,103,104 |      104 |
|       4 | 102,103,104     |      104 |
+---------+-----------------+----------+

Then group these rows by store_id. Those groups that have a count of N, given that you're searching for N item_id's, are the ones that meet the criteria.

mysql> select s.store_id, count(*) 
  from mytable as t join stores as s on find_in_set(s.store_id, t.store_id) 
  where item_id in (2,3,4) 
  group by s.store_id;
+----------+----------+
| store_id | count(*) |
+----------+----------+
|      101 |        1 |
|      102 |        3 |
|      103 |        2 |
|      104 |        3 |
+----------+----------+

You can filter the groups down to the ones you want by using a HAVING clause.

mysql> select s.store_id, count(*) 
  from mytable as t join stores as s on find_in_set(s.store_id, t.store_id)
  where item_id in (2,3,4) 
  group by s.store_id 
  having count(*) = 3;
+----------+----------+
| store_id | count(*) |
+----------+----------+
|      102 |        3 |
|      104 |        3 |
+----------+----------+

The problem with this is not only does it require you to have another table that lists all your store_ids, but the FIND_IN_SET() function is inefficient and can't be optimized with an index. It's bound to do a cross-join, which means for X items and Y stores, it'll do a table-scan of X items, and for each item, it'll do a table-scan of Y stores. As you get more stores and items, the performance will get worse in proportion to X * Y.

This is why it's often a bad idea to store comma-separated lists of id's, when you really need to do queries that compare to individual id's within those strings.

What you should have is three tables:

one table for items
one table for stores
one table that enumerates every pair of item & store

Example:

mysql> create table store_has_item (item_id int, store_id int, primary key (item_id, store_id));

Fill this with one row per pair of item & store:

mysql> insert into store_has_item
    -> select t.item_id, s.store_id from mytable as t join stores as s on find_in_set(s.store_id, t.store_id);
Query OK, 15 rows affected (0.02 sec)
Records: 15  Duplicates: 0  Warnings: 0

Now you can query just this single table for N items, group by store_id, and return those groups with a count of N:

mysql> select store_id from store_has_item where item_id in (2,3,4)
  group by store_id having count(*) = 3;
+----------+
| store_id |
+----------+
|      102 |
|      104 |
+----------+

Toxi · Answer 2 · 2019-06-03T08:26:34.987

Does MySQL offer such operations?

Of course, if i understand right, you want only select all possible store id get by every item_id, this u can realize by an easy statement:

i´ve not found your database layout so you have to handle it to your needs:

SELECT 
    a.id, /* These id´s are your item_id´s */
    GROUP_CONCAT(b.store_id) as store_ids /* These line puts your store_ids into one output */
FROM item_id a
JOIN store_id b ON b.item_id = a.id 
GROUP BY item_id

This statement returns something like this:

+-----------+----------------------+
| item_id | store_ids |
+-----------+----------------------+
| 1 | 120,210,330 |
| 2 | 90,10,30 |
| 3 | 70,120,90,330 |
+-----------+----------------------+

Greetz Toxi

Split column by comma then show all values that appear in all rows?

2 Answers2