0

I have a MySQL table with a VARCHAR column VISITOR_ID that saves multiple or single visitor id values. I want the ENTERED_TEXT to have a 1 to many relationship, and don't want to make multiple rows for that relationship to the VISITOR_ID - So I elected to add multiple id's to the VISITOR_ID column. I now want to search that column for multiple related id's.

| VISITOR_ID  |      ENTERED_TEXT        |
------------------------------------------
| 123,133,777 |       text text          |
| 555         |      text text text      |
| 444,133,777 |          text            |
| 999         | text text text text text |

When I try to use a select with IN()

SELECT *

FROM `My_Table` 
WHERE
VISITOR_ID IN(444,777)

I only get back one row and an error:

| 444,133,777 |

Warning: #1292 Truncated incorrect DOUBLE value '123,133,777'

I expected these two rows returned:

| 444,133,777 |

| 123,133,777 |

Yet:

If I use:

SELECT *
    FROM `My_Table` 
    WHERE
    VISITOR_ID IN(555,999)

I get back the 2 rows that have the single values in them:

| 555         |
| 999         |

I think I understand that the comma separated values are not being seen as separate values by my IN() clause, but are being seen like this '123,133,777' instead - but how would I have these values read as separate values appropriate for my IN() search?

I was trying to change the column type, and also various means of splitting the values. Not sure what is the best way to go. I also understand I can remedy this by using individual rows for the VISITOR_ID - which I am considering. It is just that I thought it made sense to have these multiple ID's share a similar single text value entered by a user. It seemed 1 record would be better than many? not sure this is the way to go.

TV-C-1-5
  • 680
  • 2
  • 13
  • 19
  • 3
    Fix your data model! Don’t store CSV lists in database columns, instead you should have a separate table to represent the one to many relationship. – GMB May 01 '23 at 22:12

2 Answers2

2

MySQL has the FIND_IN_SET() function that kind of does what you want, but this isn't the intended use of it. It's supposed to be for searching values of MySQL's SET data type.

Effectively, it searches a comma-separated string and returns the position in the list where your value was found.

mysql> select find_in_set(777, '123,133,777') as loc;
+-----+
| loc |
+-----+
|   3 |
+-----+

It has disadvantages: it's only for searching for one value at a time, so to search multiple values you'd have to use multiple search terms:

mysql> select find_in_set(777, '123,133,777') or find_in_set(444, '123,133,777') as loc;

Also it spoils any chance of using an index. This means your queries will have terribly bad performance because they must do a table-scan.

Also it's finicky about spaces. If you have a comma-separated string with spaces like '123, 133, 777' it simply won't work the way you expect, because it'll only match if you search for a string that includes the spaces like ' 777'.

You would be better off normalizing your data properly for a one-to-many relationship (as the comment above from GMB says) and storing one value per row. Then the search is much simpler to code, and it can take advantage of an index to optimize the query.

This is one of many reasons why we discourage folks from storing comma-separated lists in strings.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • This seems like some solid advice. I intuitively know the separated values would cause problems - I just was not able to see them clearly. Forest for the trees. It just seemed like you are saving some unpacking and coding to update the text by using the multiple id's in one column - but after all this... maybe not. – TV-C-1-5 May 01 '23 at 22:54
  • 2
    Just think of SQL values as scalars. There's no such thing as a "list" — what you have is a scalar string, that happens to contain digits and commas. It's okay to store a comma-separated "list" as a string only if you plan to store and fetch the whole list as a single value. Don't try to use SQL expressions to work with the individual elements of the list. – Bill Karwin May 01 '23 at 22:57
  • So how do I handle the answer selection here? Both answers - so far - address and solve the question. But the proper answer is probably - "Don’t store CSV lists in database columns" – TV-C-1-5 May 01 '23 at 23:02
  • 1
    I can't advise you on how to reward questions. I've already gotten chastised by SO admins once for doing that. – Bill Karwin May 02 '23 at 00:29
1

This is a way to do it by splitting the comma separated column into rows using json_table then apply your condition on those generated rows :

with cte as (
  select t.VISITOR_ID, j.VISITOR
  from mytable t
  join json_table(
    CONCAT('[', t.VISITOR_ID, ']'),
    '$[*]' columns (VISITOR BIGINT(20) path '$')
) j
)
select distinct VISITOR_ID
from cte
where VISITOR IN(444,777);

Demo here

SelVazi
  • 10,028
  • 2
  • 13
  • 29