I have a server running MySQL (tomcat, java spring) and recently upgraded the entire database to utf8mb4.
However, I'm now noticing characters we've likely copied from Word are causing issues - like "smart quotes" in the data are showing as ’
in the app's UI. Similar issue seen here.
Is there a function I can create in MySQL to just replace these Windows 1252-specific characters with UTF-8 equivalents moving forward? Or another solution to do this dynamically?
The goal is to avoid the issue for all users who try to save information copied from MS Word and other similar sources in the future. Such data may come from editing in the UI, bulk imports, etc. so I'd like the adjustment in the back-end to apply to all of these use cases where data is inserted into the db.
--
To fix the immediate issue, I was able to run a couple of queries to change the left and right smart quotes similar to this: (source)
UPDATE artist SET artist_blurb = REPLACE(artist_blurb, ”’, ”’)
WHERE artist_blurb LIKE ‘%’%’;
SHOW VARIABLES LIKE 'char%'; result is: | Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |