38

So my question is pretty simple:

I have a column in SQL which is a comma separated list (ie cats,dogs,cows,) I need to count the number of items in it using only sql (so whatever my function is (lets call it fx for now) would work like this:

 SELECT fx(fooCommaDelimColumn) AS listCount FROM table WHERE id=...

I know that that is flawed, but you get the idea (BTW if the value of fooCommaDelimColumn is cats,dogs,cows,, then listCount should return 4...).

That is all.

BenMorel
  • 34,448
  • 50
  • 182
  • 322
Tomas Reimers
  • 3,234
  • 4
  • 24
  • 37
  • This thread has also been answered over [here](https://stackoverflow.com/questions/5033047/mysql-query-finding-values-in-a-comma-separated-string/47069224#47069224) – Delickate Nov 02 '17 at 06:53

6 Answers6

92

There is no built-in function that counts occurences of substring in a string, but you can calculate the difference between the original string, and the same string without commas:

LENGTH(fooCommaDelimColumn) - LENGTH(REPLACE(fooCommaDelimColumn, ',', ''))

It was edited multiple times over the course of almost 8 years now (wow!), so for sake of clarity: the query above does not need a + 1, because OPs data has an extra trailing comma.

While indeed, in general case for the string that looks like this: foo,bar,baz the correct expression would be

LENGTH(col) - LENGTH(REPLACE(col, ',', '')) + 1
zerkms
  • 249,484
  • 69
  • 436
  • 539
  • 2
    Although this works, it may be good to say that Tomas would not have this problem if he had correctly modeled is database. – Vincent Savard Aug 11 '11 at 02:10
  • 6
    Actually you should add 1 to your answer in case there is only one item in the list : LENGTH(fooCommaDelimColumn) - LENGTH(REPLACE(fooCommaDelimColumn, ',', '')) + 1. This still deserves a +1 for the principle expressed. – RolandoMySQLDBA Aug 11 '11 at 02:10
  • @RolandoMySQLDBA: I thought the same thing, but Tomas's string included a trailing comma, so it is not needed. – Vincent Savard Aug 11 '11 at 02:11
  • +1+♡ awesome. out of curiosity, would there be a way to get the count right without knowing ahead of time whether each line has a trailing comma? i know, not my question, but i just curious. :) – shelleybutterfly Aug 11 '11 at 02:12
  • @Vincent, how would I correctly model my database? I'm a SQL noob so I am here to learn – Tomas Reimers Aug 11 '11 at 02:12
  • @shelleybutterfly: uhm, you can trim trailing comma and do the same calculations – zerkms Aug 11 '11 at 02:14
  • @Tomas : you should make a separate table to store your animals, having at least two columns : the id referencing the row in your first table, and the value (cat). – Vincent Savard Aug 11 '11 at 02:15
  • Can anyone here suggest a good book/website/general reading for learning how to model databases correctly? – Tomas Reimers Aug 11 '11 at 02:15
  • re: modeling the DB; of course you don't always have that luxury. :) once i had to deal with this (hey, it was customer data, i had no control :P) and we ended up writing a function to extract the items from the list that we then somehow rolled up into a stored procedure. (was SQL Server.) it was a little messy getting it to work but once it was going it worked like a charm. i wouldn't necessarily want to try to scale that DB though. :) – shelleybutterfly Aug 11 '11 at 02:16
  • @zerkms um, wait... the question was predicated on not knowing in advance whether there even is a trailing comma, so we can't just chop off the last character... am i missing something obvious? – shelleybutterfly Aug 11 '11 at 02:18
  • @shelleybutterfly: I can't get you. Even if we don't know if there is a trailing comma, but we need to strip it out - we can trim it http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_trim – zerkms Aug 11 '11 at 02:23
  • Tomas: I only know french resources, sorry. I submitted a new answer for your problem, though. @shelleybutterfly: Of course, if you can't, you have to deal with it, but it is better to correct the mistakes creating the problems than to use some band-aid solutions. – Vincent Savard Aug 11 '11 at 02:28
  • @Tomas looks like wikipedia has a decent intro at least: http://en.wikipedia.org/wiki/Database_model – shelleybutterfly Aug 11 '11 at 02:30
  • @zerkms hmm. i need to study up a bit i see. (a lot.) so, yes, i see, trim, then subtract, then add 1. :) – shelleybutterfly Aug 11 '11 at 02:37
  • @vincent yes, absolutely agree, due to past experience i just assumed legacy data and didn't consider that it might be a new DB. – shelleybutterfly Aug 11 '11 at 02:41
  • 1
    @Alvaro Joao: you're right it should have `+ 1` in general, but in this particular case the OPs list has an extra trailing comma. – zerkms Mar 27 '19 at 19:46
8

zerkms' solution works, no doubt about that. But your problem is created by an incorrect database schema, as Steve Wellens pointed out. You should not have more than one value in one column because it breaks the first normal law. Instead, you should make at least two tables. For instance, let's say that you have members who own animals :

table member (member_id, member_name)
table member_animal (member_id, animal_name)

Even better: since many users can have the same type of animal, you should create 3 tables :

table member (member_id, member_name)
table animal (animal_id, animal_name)
table member_animal (member_id, animal_id)

You could populate your tables like this, for instance :

member (1, 'Tomas')
member (2, 'Vincent')
animal (1, 'cat')
animal (2, 'dog')
animal (3, 'turtle')
member_animal (1, 1)
member_animal (1, 3)
member_animal (2, 2)
member_animal (2, 3)

And, to answer your initial question, this is what you would do if you wanted to know how many animals each user has :

SELECT member_id, COUNT(*) AS num_animals
FROM member
INNER JOIN member_animal
    USING (member_id)
INNER JOIN animal
    USING (animal_id)
GROUP BY member_id;
Vincent Savard
  • 34,979
  • 10
  • 68
  • 73
  • Indeed this answer is more *correct*. Here at SO we are to help each other to implement in a correct way, to solve the root of issues, +1 – zerkms Aug 11 '11 at 02:30
  • Thanks for the help, I'd still love to get my hands on some learning resources though (as you can see my sql knowledge ends with UPDATE :) ) – Tomas Reimers Aug 11 '11 at 02:42
  • 1
    still there are situations in database design, where common rules of normalization are not priority. e.g. when speed of joins in huge tables is the most important factor, it's legimitate to allow exeptions and it is absolutely ok and best practice to put csv in varchar fields. so the values are retrieved in lighning speed and complicated joins can be avoided (as seen in vincents answer). just wanted to point this out. – Raffael Meier Apr 10 '18 at 17:00
7

Following the suggestion from @zerkms.

If you dont know if there is a trailing comma or not, use the TRIM function to remove any trailing commas:

(
    LENGTH(TRIM(BOTH ',' FROM fooCommaDelimColumn))
  - LENGTH(REPLACE(TRIM(BOTH ',' FROM fooCommaDelimColumn), ',', ''))
  + 1
) as count

Reference: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_trim

I also agree that a refactoring of the tables is the best option, but if this is not possible now, this snippet can do the work.

tauanz
  • 291
  • 3
  • 4
  • This is the all-round solution. Sometimes it is required to store data (array) in JSON-encoded form (e.g. `["ABC","DEF","GHI","JKL"]`) and yet to count the items inside. For this purpose your solution works even without *trimming*. – shadyyx Oct 26 '15 at 15:22
3

This version doesn't support leading or trailing commas, but supports an empty value with a count of 0:

IF(values, LENGTH(values) - LENGTH(REPLACE(values, ',', '')) + 1, 0) AS values_count
arlomedia
  • 8,534
  • 5
  • 60
  • 108
1

If we do +1 and if we have an empty column it always comes as 1 to make it 0 we can use IF condition in mySQL.

IF(LENGTH(column_name) > 0, LENGTH(column_name) - LENGTH(REPLACE(column_name, ',', '')) + 1, 0)
1

The answer is to correct the database schema. It sounds like a many-to-many relationship which requires a junction table. http://en.wikipedia.org/wiki/Junction_table

Steve Wellens
  • 20,506
  • 2
  • 28
  • 69