0

I'm trying to fill a column a.total with the number of occurrences of a.name in b.content.

       a                       b
+--------------+    +----------------------+
| name | total |    | content              |
+------+-------+    +----------------------+
| foo  |     3 |    | hello bar            |
| bar  |     4 |    | hola foofoo          |
+--------------+    | nihao foo, bye bar   |
                    | et bar se bar        |
                    +----------------------+

I tried the command below:

UPDATE a SET a.total = (SELECT COUNT(*) FROM b WHERE b.content LIKE CONCAT('%', a.name, '%') );

But I get this:

Query OK, 0 rows affected (0.16 sec)
Rows matched: 4321  Changed: 0  Warnings: 0" with my command

What's wrong in this command ? And how can I also take into account all occurrences of foo and bar in a same content row ?

DevonDahon
  • 7,460
  • 6
  • 69
  • 114

1 Answers1

1

Count the number of occurrences of a string in a VARCHAR field? shows how to count the occurrences of a substring in a column. You can then sum them to get the total across all rows.

UPDATE a
SET a.total = (
    SELECT SUM((CHAR_LENGTH(b.content) - CHAR_LENGTH(REPLACE(b.content, a.name, ''))) / CHAR_LENGTH(a.name))
    FROM b
    WHERE b.content LIKE CONCAT('%', a.name, '%'))

DEMO

Barmar
  • 741,623
  • 53
  • 500
  • 612
  • Many thanks! But for some reason, when I run this command, I get nothing unless aborting it : `ERROR 1317 (70100): Query execution was interrupted`. Is it because of mysql 8.0 version ? Or because my columns `a.name` and `b.content` contain characters which unicode is something like #E000 ? – DevonDahon Apr 04 '19 at 04:17
  • Use `CHAR_LENGTH` instead of `LENGTH`, it does the right thing with multibyte characters. – Barmar Apr 04 '19 at 04:20
  • still nothing, maybe I need to give it more execution time... (I have about ~5000 rows in `a` and ~100.000 in `b`) but I don't see the rows being filled and my cpu doesn't look overloaded. – DevonDahon Apr 04 '19 at 04:34
  • Since this can't be indexed, this will be a very expensive query with so many rows. It has to calculate `SUM((CHAR_LENGTH(b.content) - CHAR_LENGTH(REPLACE(b.content, a.name, ''))) / CHAR_LENGTH(a.name))` 500,000,000 times. – Barmar Apr 04 '19 at 04:37
  • I just added a `WHERE` clause, it might help a bit. – Barmar Apr 04 '19 at 04:39