0

I have a table below, my problem is how can i update a column in mysql based on a parameter.

+---+------------+-------------+
| id|     A      |      B      |
+---+------------+-------------+
| 1 | a b a a    |             |
| 2 | b c a      |             |
| 3 | b d c      |             |
| 4 | a          |             |
+---+------------+-------------+

The expected results should count the number of occurence of "a" then update column B. as shown below. for sure i need to use regular expression to count the number of "a"

+---+------------+-------------+
|   |     A      |      B      |
+---+------------+-------------+
| 1 | a b a a    |      3      |
| 2 | b c a      |      1      |
| 3 | b d c      |      0      |
| 4 | a          |      1      |
+---+------------+-------------+
Smurf
  • 3
  • 2

3 Answers3

1

This can be done in a single SQL statement. (Unfortunately, this approach does not satisfy the requirement to use regular expressions. EDIT: see the followup below for an inelegant and inefficient approach that does make use of regular expressions.)

We can get a count of occurrences of character 'a' in column A, and set column B to the count, with a query like this:

UPDATE mytable t
   SET t.B = ( CHAR_LENGTH(t.A) - CHAR_LENGTH(REPLACE(t.A,'a','')) )

Let's unpack that a little bit. It's an UPDATE statement, there's no WHERE clause, so we're going to access and potentially update every row in mytable. (We're assigning a table alias t. That's not required, but we'll qualify our later references to columns A and B using that alias, to make it more clear to someone reading the SQL statement that those are references to columns in the table.

On the next line, we've got a straightforward SET clause, assigning a value to column B.

It's the next expression, the value we're assigning to column B, that we should unpack a little bit.

To get a count of the 'a' characters, we can use a little trick: we know that an 'a' character is exactly one character in length.

The "trick" is to use the REPLACE function, to search for all occurrences of the character 'a', and remove them (replacing them with a zero length string). And we can then compare the lengths of (number of characters in) the two strings. The difference is the number of 'a' characters in the original string.

As a demonstration of how those expressions work:

SELECT t.foo                          AS foo
     , REPLACE(t.foo,'a','')          AS foo_2
     , CHAR_LENGTH(t.foo)                  AS len
     , CHAR_LENGTH(REPLACE(t.foo,'a',''))  AS len_2
     , CHAR_LENGTH(t.foo) - CHAR_LENGTH(REPLACE(t.foo,'a','')) AS `len-len_2`
  FROM ( SELECT 'a b a a' AS foo
         UNION ALL SELECT 'b c a'
         UNION ALL SELECT 'b c d'
         UNION ALL SELECT 'a  '
       ) t

The returned from that query:

foo      foo_2      len   len_2  len-len_2  
-------  ------  ------  ------  ---------
a b a a   b           7       4          3
b c a    b c          5       4          1
b c d    b c d        5       5          0
a                     3       2          1

Note: what is being returned is basically a count of the number of characters that were removed. So if we wanted to count occurrences of a string of multiple characters, for eample: cat, we'd need to account for that.

Dividing the returned value by the number of characters in cat would be one way to do that. Or, we could replace the string cat with a string that is two characters in length, e.g. 'xx', so the difference in length would be one character per occurrence.


FOLLOWUP

The original question asked how to count the 'a' characters using regular expressions. My first thought was that it wasn't possible with MySQL REGEXP because the return from that is either NULL, 0 or 1. But thinking about it a little bit, it could be done, if we are counting up to some finite number of occurrences. It would be possible to check if a string contains at least one 'a' character, that's pretty straightforward:

'a b a a' REGEXP 'a'

That will return 1 if it matches, and 0 if it doesn't. It's also possible to check if the string contains at least two 'a' characters. That's pretty straightforward too:

'a b a a' REGEXP 'a.*a'

If we add together the result from the two expressions above, we could get a count of either 0, 1 or 2 for the number of 'a' characters.

We can repeat that same pattern, to extend this to match to 3, 4, 5, etc. 'a' characters.

It's not elegant, and we certainly don't want to find out how warm the CPU is going to get doing all these comparisons. But it does return the specified result, up to some finite maximum count. In this example, six. (Strings containing more than six 'a' characters will return a count of 6.

As a demonstration:

SELECT t.foo
     ,   (t.foo REGEXP CONCAT('.*',REPEAT('a.*',1)))
       + (t.foo REGEXP CONCAT('.*',REPEAT('a.*',2)))
       + (t.foo REGEXP CONCAT('.*',REPEAT('a.*',3)))
       + (t.foo REGEXP CONCAT('.*',REPEAT('a.*',4)))
       + (t.foo REGEXP CONCAT('.*',REPEAT('a.*',5)))
       + (t.foo REGEXP CONCAT('.*',REPEAT('a.*',6)))
       AS cnt_a
  FROM ( SELECT 'a b a a' AS foo
         UNION ALL SELECT 'b c a'
         UNION ALL SELECT 'b c d'
         UNION ALL SELECT 'a  '
       ) t
spencer7593
  • 106,611
  • 15
  • 112
  • 140
0

Option 1: Pure SQL

Replace the substring you want to count with an empty string. By comparing the lengths of the resulting string and the original string you can tell how many ocurrences there are:

update table set b = (length(a) - length(replace(a,'a',''))) / length('a')

You can replace the constant string 'a' by any string of any length.

Option 2: Use PHP and SQL

You can iterate over all rows with PHP and use the substr_count function to count the occurrences of a substring (In your case the substring would be 'a'). Then you update the value for b in that row. Assuming that the field id is your primary key:

$query = $pdo->query("select id,a from table");
while($row = $query->fetch()) {
    $b = substr_count($row['a']);
    $id = $row['id'];
    $pdo->query("update table set b = $b where id = $id");
}

Note that this approach is not very efficient.

Guillermo
  • 764
  • 6
  • 15
  • Why are we giving examples that use the deprecated mysql interface, aren't there already enough examples? Why aren't we giving examples using mysqli or PDO? And we better hope to high heaven that `id` is unique, and that's it integer type and not VARCHAR, in case it contains a value like `'' OR 1=1`. Fetching the individual rows from the table, and issuing separate update statements is very inefficient approach to solving the problem. – spencer7593 Jan 16 '15 at 02:31
  • You're right, thanks for your comment. I modified my answer. I listed the PHP option because it's a possible approach, not efficient, but as you said it works. – Guillermo Jan 16 '15 at 02:42
  • Converting to PDO is nice. Ideally, we'd like see both `$b` and `$id` supplied as bind arguments to a prepared statement; it's the pattern we're used to seeing. We still don't see in the question any guarantee that the `id` column is defined as not null (the PHP code would miss updating those rows), that it's unique (otherwise, the PHP code will be updating multiple rows, multiple times), that's it's not defined as character type and doesn't contain unsafe values (we'd need to enclose the $id value in single quotes, and properly escape the string to make it safe for inclusion in SQL text. – spencer7593 Jan 16 '15 at 02:53
  • I appreciate your exacitude, but I think you are beeing too accurate. First of all in my answer I say `Assuming that the field id is your primary key`. That implicates that `id` cannot be null. Secondly it's not about giving him a perfect working answer, it's about giving hints of how to solve the problem. Everyone has to figure out the rest by himself. – Guillermo Jan 16 '15 at 02:56
0
UPDATE Table SET B = i.b
FROM (SELECT LENGTH(A) - LENGTH(REPLACE(A, 'a', '')) as b from Table ) i
WHERE i.ID = Table.ID
  • This is a decent approach, but there's some issues with the implementation: the inline view doesn't return a column named `id`; it doesn't count the number of `a` characters, and beyond that, there's actually no need for an inline view. – spencer7593 Jan 16 '15 at 02:00
  • Very nice idea, but it does only work if the substring to count is of length `1`. – Guillermo Jan 16 '15 at 02:05