I'll assume that you have a column col
in table t
, and col
is CHARACTER SET utf8
(or utf8mb4).
SELECT ...
FROM t
WHERE HEX(col) REGEXP '^(..)*CCA3'
Will locate all lines with U+0323
(encoded as UTF-8 CCA3
).
Alas, there is no easy way to modify col
in SQL.
If you have MariaDB, then UNHEX(REGEXP_REPLACE(HEX(col), 'CCA3', ''))
will almost work. The problem is that it could be misaligned (at an odd boundary).
Another thought: REPLACE(col, UNHEX('CCA3'), '')
might work in MySQL or MariaDB. However, it could get into issues with Collations. So, perhaps this would work: REPLACE(col COLLATE utf8_bin, UNHEX('CCA3'), '')