0

In a MySQL table I have a VARCHAR column called ShareID.

  • If the ShareID value for Row #1 contains a string in the form of 1
  • and the ShareID value for Row #2 contains a string in the form of 10, 1
  • and the ShareID value for Row #3 contains a string in the form of 111, 12.

I would like to grab all the rows where the ShareID is 1. i.e. ONLY the first and second rows here.

I have tried using the LIKE command, like so:

SELECT * FROM tablename WHERE ShareWithID LIKE '1%';

but this will catch ALL the rows that contain the number 1 in it, i.e. Row #3 which is not what I want.

I would like to run a command that would ONLY return rows #1 and #2 above because they have a ShareID of 1 contained within it.

I've tried a variety of commands, (including REGEXP, and IN) and managed a 'frig' solution where I'd place a comma after EVERY number in the ShareID column, including the last one (i.e. 10, 1,), and then execute this command:

SELECT * FROM tablename WHERE ShareWithID LIKE '%1,%';

But I would rather use a proper solution over a frigged solution.

Any guidance would be most welcome.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
Alan N
  • 181
  • 2
  • 8

3 Answers3

2

You should not be storing lists of numbers in a comma-delimited string. This is a really bad idea:

  • Number should be stored as numbers, not strings.
  • Your numbers appear to be ids. Ids should have explicit foreign keys defined.
  • SQL -- in general -- has lousy string handling functions.
  • SQL cannot optimize the queries with string operations.
  • SQL has a great way of storing lists. It is called a table.

Sometimes, though, we are stuck with other peoples really, really, really, really bad decisions on designing databases. MySQL has a convenient function for this situation:

where find_in_set(1, ShareWithID) > 0

If you have spaces in the string, you need to remove them:

where find_in_set(1, replace(ShareWithID, ' ', '')) > 0
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Yes, I'm aware of MySQL's lack of string handling abilities, but sometimes string manipulation is the better way to go. It saves having to waste time building a 250,000 row 'table' (was it??) to look after a few columns of IDs, when one column in the original 'table' could do the job just as well. So long as the correct 'built-in' MySQL command is used to manipulate the string. So thank you for leading me to the 'find_in_set' command, that I was unaware of. That's why portals like stackoverflow are so good, however, there is one fundamental detail that needs to be addressed - explained below. – Alan N Feb 02 '19 at 15:55
1

...the built-in feature is there to be used

FIND_IN_SET() is actually not intended to be used for strings containing comma-separated lists. It's intended to be used with MySQL's SET data type. Hence the name FIND_IN_SET(), not FIND_IN_COMMA_SEPARATED_LIST().

It saves having to waste time building a 250,000 row 'table' (was it??) to look after a few columns of IDs, when one column in the original 'table' could do the job just as well.

250k rows is not a problem for MySQL. I manage databases with billions of rows in a given table. If you do basic query optimization with indexes, most queries on a table of 250k rows are just fine.

Whereas using a comma-separated list, you spoil any chance of optimizing queries. An index does not help searching for substrings that may not be the leftmost prefix of the string, and searching for a number in a comma-separated list is basically searching for a substring.

You're making your queries impossible to optimize by using a comma-separated list. Every query using FIND_IN_SET() will be a table-scan, which will get slower in a linear relationship to the number of rows in your table.

There are other disadvantages to using a comma-separated list besides indexing, which I wrote about in my answer to this old post: Is storing a delimited list in a database column really that bad?

I would rather use a proper solution over a frigged solution.

Then store one id per row. In a relational database, that's the proper solution.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Thanks for the explanation Bill and the link to your old post. I do appreciate that storing numbers in a string is generally a big no-no, but in my case it works perfectly for the limited amount of I/O that I intend to use it for. I've never used Indexes with my databases, simply because I don't know how to yet. It seems that the correct way is if all 250,000 rows need associative columns for the SharedWithID values, I'll need a new 250,000 row table with the relevant amount of columns. I now know two ways to crack this nut, although one appears more professionally accepted then the other. – Alan N Feb 03 '19 at 03:02
  • 1
    You might like my presentation [How to Design Indexes, Really](https://www.slideshare.net/billkarwin/how-to-design-indexes-really), or the [video](https://www.youtube.com/watch?v=ELR7-RdU9XU). – Bill Karwin Feb 03 '19 at 03:12
  • Thanks for those presentation links Bill. They are in my MySQL tutorial folder. Also, I have decided to veer away from the 'string' method of storage and gone down the 'integer' route. I'll dynamically add extra columns to the main table as and when I need them. There shouldn't be more than 5 additional columns. However, if that number rises substantially, I'll create the new table. – Alan N Feb 03 '19 at 15:46
  • So when you need to find a specific number, you'll have to search 5 columns, because you don't know which column the number you search for is stored in. This means you still can't use an index. I'm telling you this is a **solved problem** — create a child table and store one id per row in **one column.** Then you can index that one column and search for any value efficiently. – Bill Karwin Feb 03 '19 at 16:16
  • I'm reminded of the quote attributed to Winston Churchill: "You can always count on Americans to do the right thing – after they’ve tried everything else." – Bill Karwin Feb 03 '19 at 16:18
  • Winston Churchill said many stupid things in his time. I have no intention of using indexes to solve this 'particular' problem. It's just good to know that they exist. The problem I had was counting how many records from a large table are linked to 'more' than one ShareWithID. I wanted the best solution and that's why I placed the question on this board, because I didn't know the answer! I've used the 'search multiple columns' method and have written and tested the code and placed it in a Stored Procedure. It works exactly as I want it. Problem solved. Cheers! – Alan N Feb 03 '19 at 17:33
  • Okay, I apologize if I'm annoying you. I hope this exchange won't make you resistant to considering normalizing your database in the future. Best wishes. – Bill Karwin Feb 03 '19 at 17:35
0

The solution to this problem is to use Gordon Linoff's suggestion of the FIND_IN_SET command in conjunction with the correct configuration of the table column in question, like this:

SELECT * FROM tablename WHERE FIND_IN_SET('1', ShareWithID);

However, because the FIND_IN_SET command allows you to find the position of a string within a comma-separated list of strings, you MUST ensure that the contents of the column contains a comma after each item and DOES NOT contain spaces after the comma.

So this column content used in conjunction with the above command will return '0' rows: 111, 1

While this column content will return '1' row: 111,1

As will this one: 33,1

And this one: 44,1,415

Alan N
  • 181
  • 2
  • 8