Count the number of occurrences of a string in a VARCHAR field?

Question

I have a table like this:

TITLE	DESCRIPTION
test1	value blah blah value
test2	value test
test3	test test test
test4	valuevaluevaluevaluevalue

I am trying to figure out how to return the number of times a string occurs in each of the DESCRIPTION's.

So, if I want to count the number of times 'value' appears, the sql statement will return this:

TITLE	DESCRIPTION	COUNT
test1	value blah blah value	2
test2	value test	1
test3	test test test	0
test4	valuevaluevaluevaluevalue	5

Is there any way to do this? I do not want to use php at all, just mysql.

Responses below will get you there. However, don't forget to use [`CHAR_LENGTH()`](http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_char-length) instead of [`LENGTH()`](http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_length) if you're using multibyte characters. — inhan, Sep 10 '12 at 03:03
This thread has also been answered over [here](https://stackoverflow.com/questions/5033047/mysql-query-finding-values-in-a-comma-separated-string/47069224#47069224) — Delickate, Nov 02 '17 at 06:51
LENGTH([field]) - LENGTH(REPLACE([field], '[char_to_find]', '')) — Phoenix, Oct 17 '19 at 12:42

yannis · Accepted Answer · 2020-11-16T23:14:59.377

451

This should do the trick:

SELECT 
    title,
    description,    
    ROUND (   
        (
            LENGTH(description)
            - LENGTH( REPLACE ( description, "value", "") ) 
        ) / LENGTH("value")        
    ) AS count    
FROM <table>

edited Nov 16 '20 at 23:14

answered Sep 10 '12 at 02:56

yannis

6,215
5
39
49

69

This solution is awesome, just what I needed! But note, that LENGTH() is not multi-byte safe and you might run into strange errors. Use CHAR_LENGTH() instead:) – nico gawenda Apr 29 '13 at 23:28
doesn't work when you have words containing value in them e.g. "undervalue". The query also counts this row. – chyupa Apr 01 '16 at 12:32
2

there is no difference in using of `LENGTH()` and `CHAR_LENGTH()` while divided to same counting byte/char. @nicogawenda – MohaMad Mar 08 '17 at 22:26
3

@chyupa `undevalue` have `value` in it so it should be counted. If you only want to count complete words, maybe you need to search for ' value ' or bettter something more complicated like using regex. – PhoneixS May 18 '17 at 15:45
3

Note that you run into wrong counts when you search through text that also has words with capitalized letters (like German where all nouns are written with capitalized letter). The REPLACE only replaces exact matches. To consider all words you need to change the replace above to: `LENGTH( REPLACE ( LOWER(description), "value", "") )` and make sure that "value" is always lowercased by using PHP `strtolower()`. PS: This solution above helped me to build my own little search engine and to weight the results by the number of words within the text. Thanks! – Avatar Jul 03 '17 at 19:11
CONCAT spaces before and after `description`, and put spaces before and after `"value"` so it becomes `" value "`, and the query won't register `"undervalue"` as a hit. The concat is needed in case "value" appears at the start or end of the string. Sadly it then won't register "valuevalue" as a hit. – Slam Sep 29 '17 at 07:47
1

I'm not sure if it's related to MySQL settings, but for me `LENGTH("value")` gave error message: `Error Code: 1054 Unknown column 'value' in 'field list'`. I had to change it to single-quotes: `LENGTH('value')` – Kip Jul 10 '18 at 17:33
4

The `ROUND` here is unnecessary. assume a string of length `x` with `n` occurrences of`'value`. `LENGTH(description) - LENGTH( REPLACE ( description, "value", "") ) ` will always give you `n*length("value")`, diving that by length of value will always leave a whole number `n`. No need to round off – Nibhrit Oct 08 '18 at 12:15
In case you're counting occurences of a single character, you can use REGEXP_REPLACE in MySQL 8: `SET @var='foo, bar, baz, something'; SELECT CHAR_LENGTH(REGEXP_REPLACE(@var, '[^,]', '')); ` - count occurences of comma in @var; – Detuner Apr 26 '21 at 14:29
1

This solution worked great for me, but I stumbled upon a very strange MySQL 5.7 bug when using it for my case. If you use `COUNT()` and `REPLACE()` inside a `HAVING` clause, then MySQL will return the string _after replacement_ if you're selecting that column. For instance if you're doing `SELECT a.id, a.body FROM articles a JOIN placeholder_contents p ON p.article_id = a.id GROUP BY a.id HAVING COUNT(p.id) != ROUND((LENGTH(a.body) - LENGTH( REPLACE (a.body, "[placeholder]", ""))) / LENGTH(@needle))`, then `a.body` in the results will be devoid of "[placeholder]" occurrences. Fixed in MySQL 8. – TanguyP Jan 28 '22 at 16:34
1

I'm implementing this as a part of a view I'm creating using SQLite3 and your solution works like a charm, thank you. – ftani Sep 14 '22 at 11:10

gaborsch · Answer 2 · 2018-11-26T14:15:38.107

A little bit simpler and more effective variation of @yannis solution:

SELECT 
    title,
    description,    
    CHAR_LENGTH(description) - CHAR_LENGTH( REPLACE ( description, 'value', '1234') ) 
        AS `count`    
FROM <table>

The difference is that I replace the "value" string with a 1-char shorter string ("1234" in this case). This way you don't need to divide and round to get an integer value.

Generalized version (works for every needle string):

SET @needle = 'value';
SELECT 
    description,    
    CHAR_LENGTH(description) - CHAR_LENGTH(REPLACE(description, @needle, SPACE(LENGTH(@needle)-1))) 
        AS `count`    
FROM <table>

+1 for the idea, though I generally prefer obvious implementations, i.e. that do not require additional explanation, even if they look less elegant. — not2savvy, Apr 08 '20 at 16:16

Joe G Joseph · Answer 3 · 2012-09-10T03:08:42.547

26

try this:

 select TITLE,
        (length(DESCRIPTION )-length(replace(DESCRIPTION ,'value','')))/5 as COUNT 
  FROM <table>

SQL Fiddle Demo

edited Sep 10 '12 at 03:08

answered Sep 10 '12 at 02:58

Joe G Joseph

23,518
5
56
58

3

length is not binary safe, use char_length() – luky Jun 02 '18 at 17:45

score 15 · Answer 4 · edited May 02 '17 at 11:25

In SQL SERVER, this is the answer

Declare @t table(TITLE VARCHAR(100), DESCRIPTION VARCHAR(100))

INSERT INTO @t SELECT 'test1', 'value blah blah value' 
INSERT INTO @t SELECT 'test2','value test' 
INSERT INTO @t SELECT 'test3','test test test' 
INSERT INTO @t SELECT 'test4','valuevaluevaluevaluevalue' 


SELECT TITLE,DESCRIPTION,Count = (LEN(DESCRIPTION) - LEN(REPLACE(DESCRIPTION, 'value', '')))/LEN('value') 

FROM @t

Result

TITLE   DESCRIPTION               Count
test1   value blah blah value        2
test2   value test                   1
test3   test test test               0
test4   valuevaluevaluevaluevalue    5

I don't have MySQL install, but goggled to find the Equivalent of LEN is LENGTH while REPLACE is same.

So the equivalent query in MySql should be

SELECT TITLE,DESCRIPTION, (LENGTH(DESCRIPTION) - LENGTH(REPLACE(DESCRIPTION, 'value', '')))/LENGTH('value') AS Count
FROM <yourTable>

Please let me know if it worked for you in MySql also.

score 6 · Answer 5 · answered May 03 '16 at 08:09

Here is a function that will do that.

CREATE FUNCTION count_str(haystack TEXT, needle VARCHAR(32))
  RETURNS INTEGER DETERMINISTIC
  BEGIN
    RETURN ROUND((CHAR_LENGTH(haystack) - CHAR_LENGTH(REPLACE(haystack, needle, ""))) / CHAR_LENGTH(needle));
  END;

score 3 · Answer 6 · edited Dec 02 '22 at 18:31

3

This is the mysql function using the space technique (tested with mysql 5.0 + 5.5):

CREATE FUNCTION count_str( haystack TEXT,  needle VARCHAR(32))
  RETURNS INTEGER DETERMINISTIC
  RETURN LENGTH(haystack) - LENGTH( REPLACE ( haystack, needle, space(char_length(needle)-1)) );

edited Dec 02 '22 at 18:31

Giacomo1968

25,759
11
71
103

answered Nov 23 '18 at 12:17

jfx

355
1
7

wrong for utf-8 characters count_str('BÆS', 'Æ') = 2, char_length version above is correct – luky May 19 '22 at 10:38

score 1 · Answer 7 · answered May 09 '15 at 10:55

SELECT 
id,
jsondata,    
ROUND (   
    (
        LENGTH(jsondata)
        - LENGTH( REPLACE ( jsondata, "sonal", "") ) 
    ) / LENGTH("sonal")        
)
+
ROUND (   
    (
        LENGTH(jsondata)
        - LENGTH( REPLACE ( jsondata, "khunt", "") ) 
    ) / LENGTH("khunt")        
)
AS count1    FROM test ORDER BY count1 DESC LIMIT 0, 2

Thanks Yannis, your solution worked for me and here I'm sharing same solution for multiple keywords with order and limit.

Duc Toan Pham · Answer 8 · 2022-04-05T01:59:57.150

1

In most cases, these functions are LENGTH and REPLACE, respectively (SQL Server users will use the built-in function LEN rather than LENGTH):

Example, count num of comma in the string "10,CLARK,MANAGER"

select (length('10,CLARK,MANAGER')-
        length(replace('10,CLARK,MANAGER',',','')))/length(',')
as cnt from t1

edited Apr 05 '22 at 01:59

answered Apr 03 '22 at 03:25

Duc Toan Pham

474
6
6

Count the number of occurrences of a string in a VARCHAR field?

8 Answers8

SQL Fiddle Demo

Linked

Related