So I wanted to update a table where I have to change the contents of a specific column. For the updated value I need to use RegEx. Fortunately my DB is MariaDB so it has REGEXP_REPLACE support.
Now, I wrote a Python script that should do this: Take a list of strings (say, words) and if a column value starts with one of these strings, delete it. The SQL query should be basically this:
REGEXP_REPLACE(column, '(?i)^(%s\\s+)', '');
i.e. I'm doing a case-insensitive match where a specific string (specified by %s
) followed by any number of spaces occurs in the beginning of the column value.
Now when I try to put this into the script, I'm facing several problems, depending on how I inject the parameter to the query.
Using String interpolation I know this is not the preferred way to do it, but here's how it looks like:
sql = "UPDATE `table` SET `column` = REGEXP_REPLACE(`column`, '(?i)^(%s\\\\s+)', '');" % (term) cursor.execute(sql)
When I use this, the
term
gets inserted correctly (I check withprint(cursor.mogrify(sql, (term,)))
), but I run into problems ifterm
contains any characters that would have to be escaped in the regex, e.g.-
.Using execute with parameters This is the cleaner option, but I get another problem here - the string gets inserted including quotes. Which breaks the whole statement because it also has quotes in it. So here's the code:
sql = "UPDATE `table` SET `column` = REGEXP_REPLACE(`column`, '(?i)^(%s\\\\s+)', '');" cursor.execute(sql, (term,))
But this results in statements like this:
UPDATE `table` SET `column` = REGEXP_REPLACE(`column`, '(?i)^('example'\\s+)', '');
with a quoted string inside the regex which is also in quotes.
Any suggestions how I can solve this? I need to have the replacement terms inside the regex without quotes, but also if there are any special characters inside the replacement terms, they have to be escaped. Maybe I need to write a separate function for this but I'm not sure. I'm neither very advanced in Python nor in SQL.
Thanks in advance for any hints.