237

I have a table like this:

TITLE DESCRIPTION
test1 value blah blah value
test2 value test
test3 test test test
test4 valuevaluevaluevaluevalue

I am trying to figure out how to return the number of times a string occurs in each of the DESCRIPTION's.

So, if I want to count the number of times 'value' appears, the sql statement will return this:

TITLE DESCRIPTION COUNT
test1 value blah blah value 2
test2 value test 1
test3 test test test 0
test4 valuevaluevaluevaluevalue 5

Is there any way to do this? I do not want to use php at all, just mysql.

Anonymous
  • 835
  • 1
  • 5
  • 21
Geekman
  • 2,513
  • 3
  • 16
  • 10
  • 6
    Responses below will get you there. However, don't forget to use [`CHAR_LENGTH()`](http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_char-length) instead of [`LENGTH()`](http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_length) if you're using multibyte characters. – inhan Sep 10 '12 at 03:03
  • This thread has also been answered over [here](https://stackoverflow.com/questions/5033047/mysql-query-finding-values-in-a-comma-separated-string/47069224#47069224) – Delickate Nov 02 '17 at 06:51
  • Hi, how do i do this with sqlserver query? – aintno12u Feb 12 '19 at 02:41
  • LENGTH([field]) - LENGTH(REPLACE([field], '[char_to_find]', '')) – Phoenix Oct 17 '19 at 12:42

8 Answers8

451

This should do the trick:

SELECT 
    title,
    description,    
    ROUND (   
        (
            LENGTH(description)
            - LENGTH( REPLACE ( description, "value", "") ) 
        ) / LENGTH("value")        
    ) AS count    
FROM <table> 
yannis
  • 6,215
  • 5
  • 39
  • 49
  • 69
    This solution is awesome, just what I needed! But note, that LENGTH() is not multi-byte safe and you might run into strange errors. Use CHAR_LENGTH() instead:) – nico gawenda Apr 29 '13 at 23:28
  • doesn't work when you have words containing value in them e.g. "undervalue". The query also counts this row. – chyupa Apr 01 '16 at 12:32
  • 2
    there is no difference in using of `LENGTH()` and `CHAR_LENGTH()` while divided to same counting byte/char. @nicogawenda – MohaMad Mar 08 '17 at 22:26
  • 3
    @chyupa `undevalue` have `value` in it so it should be counted. If you only want to count complete words, maybe you need to search for ' value ' or bettter something more complicated like using regex. – PhoneixS May 18 '17 at 15:45
  • 3
    Note that you run into wrong counts when you search through text that also has words with capitalized letters (like German where all nouns are written with capitalized letter). The REPLACE only replaces exact matches. To consider all words you need to change the replace above to: `LENGTH( REPLACE ( LOWER(description), "value", "") )` and make sure that "value" is always lowercased by using PHP `strtolower()`. PS: This solution above helped me to build my own little search engine and to weight the results by the number of words within the text. Thanks! – Avatar Jul 03 '17 at 19:11
  • CONCAT spaces before and after `description`, and put spaces before and after `"value"` so it becomes `" value "`, and the query won't register `"undervalue"` as a hit. The concat is needed in case "value" appears at the start or end of the string. Sadly it then won't register "valuevalue" as a hit. – Slam Sep 29 '17 at 07:47
  • 1
    I'm not sure if it's related to MySQL settings, but for me `LENGTH("value")` gave error message: `Error Code: 1054 Unknown column 'value' in 'field list'`. I had to change it to single-quotes: `LENGTH('value')` – Kip Jul 10 '18 at 17:33
  • 4
    The `ROUND` here is unnecessary. assume a string of length `x` with `n` occurrences of`'value`. `LENGTH(description) - LENGTH( REPLACE ( description, "value", "") ) ` will always give you `n*length("value")`, diving that by length of value will always leave a whole number `n`. No need to round off – Nibhrit Oct 08 '18 at 12:15
  • In case you're counting occurences of a single character, you can use REGEXP_REPLACE in MySQL 8: `SET @var='foo, bar, baz, something'; SELECT CHAR_LENGTH(REGEXP_REPLACE(@var, '[^,]', '')); ` - count occurences of comma in @var; – Detuner Apr 26 '21 at 14:29
  • 1
    This solution worked great for me, but I stumbled upon a very strange MySQL 5.7 bug when using it for my case. If you use `COUNT()` and `REPLACE()` inside a `HAVING` clause, then MySQL will return the string _after replacement_ if you're selecting that column. For instance if you're doing `SELECT a.id, a.body FROM articles a JOIN placeholder_contents p ON p.article_id = a.id GROUP BY a.id HAVING COUNT(p.id) != ROUND((LENGTH(a.body) - LENGTH( REPLACE (a.body, "[placeholder]", ""))) / LENGTH(@needle))`, then `a.body` in the results will be devoid of "[placeholder]" occurrences. Fixed in MySQL 8. – TanguyP Jan 28 '22 at 16:34
  • 1
    I'm implementing this as a part of a view I'm creating using SQLite3 and your solution works like a charm, thank you. – ftani Sep 14 '22 at 11:10
53

A little bit simpler and more effective variation of @yannis solution:

SELECT 
    title,
    description,    
    CHAR_LENGTH(description) - CHAR_LENGTH( REPLACE ( description, 'value', '1234') ) 
        AS `count`    
FROM <table> 

The difference is that I replace the "value" string with a 1-char shorter string ("1234" in this case). This way you don't need to divide and round to get an integer value.

Generalized version (works for every needle string):

SET @needle = 'value';
SELECT 
    description,    
    CHAR_LENGTH(description) - CHAR_LENGTH(REPLACE(description, @needle, SPACE(LENGTH(@needle)-1))) 
        AS `count`    
FROM <table> 
gaborsch
  • 15,408
  • 6
  • 37
  • 48
  • 2
    +1 for the idea, though I generally prefer obvious implementations, i.e. that do not require additional explanation, even if they look less elegant. – not2savvy Apr 08 '20 at 16:16
26

try this:

 select TITLE,
        (length(DESCRIPTION )-length(replace(DESCRIPTION ,'value','')))/5 as COUNT 
  FROM <table> 


SQL Fiddle Demo

Joe G Joseph
  • 23,518
  • 5
  • 56
  • 58
15

In SQL SERVER, this is the answer

Declare @t table(TITLE VARCHAR(100), DESCRIPTION VARCHAR(100))

INSERT INTO @t SELECT 'test1', 'value blah blah value' 
INSERT INTO @t SELECT 'test2','value test' 
INSERT INTO @t SELECT 'test3','test test test' 
INSERT INTO @t SELECT 'test4','valuevaluevaluevaluevalue' 


SELECT TITLE,DESCRIPTION,Count = (LEN(DESCRIPTION) - LEN(REPLACE(DESCRIPTION, 'value', '')))/LEN('value') 

FROM @t

Result

TITLE   DESCRIPTION               Count
test1   value blah blah value        2
test2   value test                   1
test3   test test test               0
test4   valuevaluevaluevaluevalue    5

I don't have MySQL install, but goggled to find the Equivalent of LEN is LENGTH while REPLACE is same.

So the equivalent query in MySql should be

SELECT TITLE,DESCRIPTION, (LENGTH(DESCRIPTION) - LENGTH(REPLACE(DESCRIPTION, 'value', '')))/LENGTH('value') AS Count
FROM <yourTable>

Please let me know if it worked for you in MySql also.

gaborsch
  • 15,408
  • 6
  • 37
  • 48
Niladri Biswas
  • 4,153
  • 2
  • 17
  • 24
6

Here is a function that will do that.

CREATE FUNCTION count_str(haystack TEXT, needle VARCHAR(32))
  RETURNS INTEGER DETERMINISTIC
  BEGIN
    RETURN ROUND((CHAR_LENGTH(haystack) - CHAR_LENGTH(REPLACE(haystack, needle, ""))) / CHAR_LENGTH(needle));
  END;
michaelbn
  • 7,393
  • 3
  • 33
  • 46
3

This is the mysql function using the space technique (tested with mysql 5.0 + 5.5):

CREATE FUNCTION count_str( haystack TEXT,  needle VARCHAR(32))
  RETURNS INTEGER DETERMINISTIC
  RETURN LENGTH(haystack) - LENGTH( REPLACE ( haystack, needle, space(char_length(needle)-1)) );
Giacomo1968
  • 25,759
  • 11
  • 71
  • 103
jfx
  • 355
  • 1
  • 7
  • wrong for utf-8 characters count_str('BÆS', 'Æ') = 2, char_length version above is correct – luky May 19 '22 at 10:38
1
SELECT 
id,
jsondata,    
ROUND (   
    (
        LENGTH(jsondata)
        - LENGTH( REPLACE ( jsondata, "sonal", "") ) 
    ) / LENGTH("sonal")        
)
+
ROUND (   
    (
        LENGTH(jsondata)
        - LENGTH( REPLACE ( jsondata, "khunt", "") ) 
    ) / LENGTH("khunt")        
)
AS count1    FROM test ORDER BY count1 DESC LIMIT 0, 2

Thanks Yannis, your solution worked for me and here I'm sharing same solution for multiple keywords with order and limit.

1

In most cases, these functions are LENGTH and REPLACE, respectively (SQL Server users will use the built-in function LEN rather than LENGTH):

Example, count num of comma in the string "10,CLARK,MANAGER"

select (length('10,CLARK,MANAGER')-
        length(replace('10,CLARK,MANAGER',',','')))/length(',')
as cnt from t1
Duc Toan Pham
  • 474
  • 6
  • 6