
I have used FIND_IN_SET multiple times before but this case is a bit different.

Earlier I was searching for a single value in the table, like:

SELECT * FROM tbl_name where find_in_set('1212121212', sku)

But now I have a list of SKUs that I want to search for in the table, e.g.

'3698520147','088586004490','868332000057','081308003405','088394000028','089541300893','0732511000148','009191711092','752830528161'

I have two columns in the table: SKU (e.g. 081308003405) and SKU Variation. In the SKU column I save a single value, but in the variation column I save values in comma-separated format, like 081308003405,088394000028,089541300893.

SELECT * FROM tbl_name 
WHERE 1 
AND upc IN ('3698520147','088586004490','868332000057','081308003405','088394000028',
'089541300893','0732511000148','009191711092','752830528161')

I am using the IN clause to search by UPC value. Now I want to search the variation column as well. My concern is how to search the variation column using the SKU list.

For now, I check each UPC variation in a loop, which takes too much time. Below is the query:

SELECT id FROM products 
WHERE FIND_IN_SET('88076164444', upc_variation) > 0
FanoFN
jayant rawat
  • Well, I have found a way to do this in one query. I am using an `OR` clause to check all the data in a single query, and it works faster than the for loop – jayant rawat Feb 21 '20 at 05:21
  • Add your tables in SQL fiddle http://sqlfiddle.com/ and also post your solution in answer, it would be helpful for others. – er.irfankhan11 Feb 21 '20 at 11:40
  • Is `sku` a single value? Or is it a commalist? If it is potentially a list, please make the column name plural (eg, `skus` or `sku_list`). Else flip the arguments in the first example. – Rick James Feb 21 '20 at 18:48

6 Answers


First of all, consider storing the data in a normalized way. Here is a good read: Is storing a delimited list in a database column really that bad?

Now, assuming the following schema and data:

create table products (
  id int auto_increment,
  upc varchar(50),
  upc_variation text,
  primary key (id),
  index (upc)
);
insert into products (upc, upc_variation) values
  ('01234', '01234,12345,23456'),
  ('56789', '45678,34567'),
  ('056789', '045678,034567');

We want to find products with variations '12345' and '34567'. The expected result is the 1st and the 2nd rows.

Normalized schema - many-to-many relation

Instead of storing the values in a comma separated list, create a new table, which maps product IDs with variations:

create table products_upc_variations (
  product_id int,
  upc_variation varchar(50),
  primary key (product_id, upc_variation),
  index  (upc_variation, product_id)
);
insert into products_upc_variations (product_id, upc_variation) values 
  (1, '01234'),
  (1, '12345'),
  (1, '23456'),
  (2, '45678'),
  (2, '34567'),
  (3, '045678'),
  (3, '034567');

The select query would be:

select distinct p.*
from products p
join products_upc_variations v on v.product_id = p.id
where v.upc_variation in ('12345', '34567');

As you see - With a normalized schema the problem can be solved with a quite basic query. And we can effectively use indices.
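As a quick sanity check, the normalized approach works the same way in any SQL engine; here is a minimal sketch (not MySQL) using Python's built-in sqlite3 module, with table and column names mirroring the answer's schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table products (id integer primary key, upc text, upc_variation text);
create table products_upc_variations (
  product_id int,
  upc_variation text,
  primary key (product_id, upc_variation)
);
""")
con.executemany("insert into products (id, upc, upc_variation) values (?, ?, ?)", [
    (1, '01234', '01234,12345,23456'),
    (2, '56789', '45678,34567'),
    (3, '056789', '045678,034567'),
])
con.executemany("insert into products_upc_variations values (?, ?)", [
    (1, '01234'), (1, '12345'), (1, '23456'),
    (2, '45678'), (2, '34567'),
    (3, '045678'), (3, '034567'),
])

# Same join as the answer's query; DISTINCT collapses products
# that match more than one variation
rows = con.execute("""
    select distinct p.id
    from products p
    join products_upc_variations v on v.product_id = p.id
    where v.upc_variation in ('12345', '34567')
    order by p.id
""").fetchall()
print(rows)  # [(1,), (2,)] -- the 1st and 2nd products, as expected
```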

"Exploiting" a FULLTEXT INDEX

With a FULLTEXT INDEX on (upc_variation) you can use:

select p.*
from products p
where match (upc_variation) against ('12345 34567');

This looks quite "pretty" and is probably efficient. But even though it works for this example, I wouldn't feel comfortable with this solution, because I can't say exactly when it doesn't work.

Using JSON_OVERLAPS()

Since MySQL 8.0.17 you can use JSON_OVERLAPS(). You should either store the values as a JSON array, or convert the list to JSON "on the fly":

select p.*
from products p
where json_overlaps(
  '["12345","34567"]',
  concat('["', replace(upc_variation, ',', '","'), '"]')
);

No index can be used for this - but neither can FIND_IN_SET() use one.
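The concat/replace trick simply wraps the comma list into a JSON array literal, which JSON_OVERLAPS() can then compare. A small Python sketch (not MySQL) of the same transformation and overlap test, purely for illustration:

```python
import json

def csv_to_json_array(csv: str) -> str:
    # '01234,12345,23456' -> '["01234","12345","23456"]'
    # (same as MySQL's concat('["', replace(csv, ',', '","'), '"]'))
    return '["' + csv.replace(',', '","') + '"]'

def overlaps(json_a: str, json_b: str) -> bool:
    # Same idea as JSON_OVERLAPS: true if the two arrays share any element
    return bool(set(json.loads(json_a)) & set(json.loads(json_b)))

wanted = '["12345","34567"]'
ov1 = overlaps(wanted, csv_to_json_array('01234,12345,23456'))
ov2 = overlaps(wanted, csv_to_json_array('045678,034567'))
print(ov1)  # True: '12345' is in the list
print(ov2)  # False: no shared element
```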

Using JSON_TABLE()

Since MySQL 8.0.4 you can use JSON_TABLE() to generate a normalized representation of the data "on the fly". Here again you would either store the data in a JSON array, or convert the list to JSON in the query:

select distinct p.*
from products p
join json_table(
  concat('["', replace(p.upc_variation, ',', '","'), '"]'),
  '$[*]' columns (upcv text path '$')
) v
where v.upcv in ('12345', '34567');

No index can be used here. And this is probably the slowest solution of all presented in this answer.

RLIKE / REGEXP

You can also use a regular expression:

select p.*
from products p
where p.upc_variation rlike '(^|,)(12345|34567)(,|$)'
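The anchoring in that pattern matters: (^|,) and (,|$) ensure only whole list elements match, so '12345' does not match inside a longer code like '0123456'. A quick Python check of the same pattern (Python's re and MySQL's RLIKE differ in places, but nothing in this pattern is affected):

```python
import re

pattern = re.compile(r'(^|,)(12345|34567)(,|$)')

m1 = bool(pattern.search('01234,12345,23456'))  # whole element '12345' present
m2 = bool(pattern.search('45678,34567'))        # whole element '34567' present
m3 = bool(pattern.search('0123456,234567'))     # only substrings; anchors reject it
print(m1, m2, m3)  # True True False
```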

See demo of all queries on dbfiddle.uk

Paul Spiegel

You can try the example below:

SELECT * FROM TABLENAME 
WHERE 1 AND ( FIND_IN_SET('3698520147', SKU) 
        OR UPC IN ('3698520147') )
Arun Palanisamy
  • I have already found a solution, as I mentioned below the question. Thanks for the answer anyway – jayant rawat Feb 21 '20 at 05:57
  • The syntax works, but performance is poor -- no index can be useful in your example -- it involves two different columns. Hence, it will involve a full table scan. – Rick James Feb 21 '20 at 18:45

I have a solution for you; consider the following:

1: Create a temporary table (example here: SQL Fiddle)

select
  tablename.id,
  SUBSTRING_INDEX(SUBSTRING_INDEX(tablename.sku_split, ',', numbers.n), ',', -1) sku_variation
from
  numbers inner join tablename
  on CHAR_LENGTH(tablename.sku_split)
     -CHAR_LENGTH(REPLACE(tablename.sku_split, ',', ''))>=numbers.n-1
order by id, n

2: Use the temporary table to filter with FIND_IN_SET against your data.
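The query above relies on a helper `numbers` table (1, 2, 3, …) that the fiddle creates; row n of that table extracts the n-th comma-separated element, so each product row is expanded into one row per list element. A Python sketch of the same expansion, purely to illustrate what the numbers-table + SUBSTRING_INDEX trick produces:

```python
def split_to_rows(rows):
    # rows: (id, csv) pairs -> one (id, element) row per list element,
    # mimicking the numbers-table join in the SQL above
    out = []
    for id_, csv in rows:
        for element in csv.split(','):
            out.append((id_, element))
    return out

rows_out = split_to_rows([(1, '01234,12345,23456'), (2, '45678,34567')])
print(rows_out)
# [(1, '01234'), (1, '12345'), (1, '23456'), (2, '45678'), (2, '34567')]
```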

HoangHieu

Performance considerations. The main thing that matters for performance is whether some index can be used. The complexity of the expression has only a minuscule impact on overall performance.

Step 1 is to learn what can be optimized, and in what way:

Equal:        WHERE x = 1  -- can use index
IN/1:         WHERE x IN (1) -- Turned into the Equal case by Optimizer
IN/many:      WHERE x IN (22,33,44)  -- Usually worse than Equal and better than "range"
Easy OR:      WHERE (x = 22 OR x = 33) -- Turned into IN if possible
General OR:   WHERE (sku = 22 OR upc = 33) -- not sargable (cf UNION)
Easy LIKE:    WHERE x LIKE 'abc' -- turned into Equal
Range LIKE:   WHERE x LIKE 'abc%' -- equivalent to "range" test
Wild LIKE:    WHERE x LIKE '%abc%' -- not sargable
REGEXP:       WHERE x RLIKE 'aaa|bbb|ccc' -- not sargable
FIND_IN_SET:  WHERE FIND_IN_SET(x, '22,33,44')  -- not sargable, even for single item
JSON:         -- not sargable
FULLTEXT:     WHERE MATCH(x) AGAINST('aaa bbb ccc')  -- fast, but not equivalent
NOT:          WHERE NOT ((any of the above)) -- usually poor performance

"Sargable" -- able to use index. Phrased differently "Hiding the column in a function call" prevents using an index.

FULLTEXT: There are many restrictions: "word-oriented", min word size, stopwords, etc. But it is very fast when it applies. Note: When used with outer tests, MATCH comes first (if possible), then further filtering will be done without the benefit of indexes, but on a smaller set of rows.

Even when an expression "can" use an index, it "may not". Whether a WHERE clause makes good use of an index is a much longer discussion than can be put here.

Step 2 Learn how to build composite indexes when you have multiple tests (WHERE ... AND ...):

When constructing a composite (multi-column) index, include columns in this order:

  1. 'Equal' -- any number of such columns.
  2. 'IN/many' column(s)
  3. One range test (BETWEEN, <, etc)

(A couple of side notes.) The Optimizer is smart enough to clean up WHERE 1 AND .... But there are not many things the Optimizer will handle. In particular, this is not sargable: `AND DATE(x) = '2020-02-20'`, but this does optimize as a "range":

    AND x >= '2020-02-20'
    AND x  < '2020-02-20' + INTERVAL 1 DAY
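The two forms return the same rows; only the range form leaves the column bare so an index on x can be used. A quick check of the equivalence using SQLite via Python (SQLite's date() plays the role of MySQL's DATE(), and its date arithmetic replaces + INTERVAL 1 DAY):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table logs (x text)")  # datetime strings
con.executemany("insert into logs values (?)", [
    ('2020-02-19 23:59:59',), ('2020-02-20 00:00:00',),
    ('2020-02-20 12:34:56',), ('2020-02-21 00:00:00',),
])

# Non-sargable form: the column is wrapped in a function
a = con.execute(
    "select count(*) from logs where date(x) = '2020-02-20'"
).fetchone()[0]
# Sargable form: a plain range test on the bare column
b = con.execute(
    "select count(*) from logs where x >= '2020-02-20' and x < date('2020-02-20', '+1 day')"
).fetchone()[0]
print(a, b)  # 2 2 -- same rows either way
```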

Reading

Building indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql
Sargable: https://en.wikipedia.org/wiki/Sargable
Tips on Many-to-many: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table

Rick James

This depends on how you use it. In MySQL I found that FIND_IN_SET is far faster than using JSON when tested with the following commands - so much faster it wasn't even a competition. (To be clear, the timed portion did not include the SET statement.)

Fastest

set @ids = (select group_concat(`ID`) from `table`);
select count(*) from `table` where find_in_set(`ID`, @ids);

10 x slower

set @ids = (select json_arrayagg(`ID`) from `table`);
select count(*) from `table` where `ID` member of( @ids );

34 x slower

set @ids = (select json_arrayagg(`ID`) from `table`);
select count(*) from `table` where JSON_CONTAINS(@ids, convert(`ID`, char));

34 x slower

set @ids = (select json_arrayagg(`ID`) from `table`);
select count(*) from `table` where json_overlaps(@ids, json_array(`ID`));
Justin Levene
SELECT * FROM tbl_name t1,
  (SELECT GROUP_CONCAT('3698520147',',','088586004490',',','868332000057',',',
   '081308003405',',','088394000028',',','089541300893',',','0732511000148',',',
   '009191711092',',','752830528161') AS skuid) t
WHERE FIND_IN_SET(t1.sku, t.skuid) > 0