
I need to change the character set on my table to support multiple languages. It was set up with:

latin1_swedish_ci

And I need to change it to:

utf8mb4_unicode_ci 

This is easy to do by running this query.

alter table articles convert to character set utf8mb4 collate utf8mb4_unicode_ci;
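One way to confirm what the conversion actually changed is to list the character set and collation of each column. This is only a sketch: it assumes the table lives in the currently selected database and that only the `articles` table from the question is of interest.

```sql
-- List every column of `articles` with its character set and collation.
-- Non-text columns (INT, DATE, ...) show NULL for both values.
SELECT column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = DATABASE()   -- assumes the right database is selected
  AND table_name = 'articles';
```

Running it before and after the `ALTER TABLE` shows which columns moved from latin1 to utf8mb4.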

But I have been running some tests on my local version: I export the content before the conversion and again after it, then run a diff on the two files.

diff articles_before.sql articles_after_charset_change.sql

I expected the data not to really change, but I get a lot of diffs, mainly involving backslashes.

For instance, the latin1_swedish_ci title:

Coral Beach Resort ??????, Women Dive\'s Day 2017 _ SONY

The utf8mb4_unicode_ci title:

Coral Beach Resort ??????, Women Dive''s Day 2017 _ SONY

Another example, the latin1_swedish_ci title:

Studios\\\' Black Panther

The utf8mb4_unicode_ci title:

Studios\\'' Black Panther

The issues are mainly backslashes and quotes.

Is there any way I can change the charset for future data but not alter any of the current data?

Thanks

user1503606
  • You probably used different sqlserver versions to create your backups. A single quote (`'`) can be escaped both as `\'` and as `''` (see e.g. [here](https://stackoverflow.com/q/9596652/6248528)), so they will be the same when you read the backup back in. MySQL switched from using `\'` to `''` to be more compliant with the sql standard. – Solarflare Mar 07 '18 at 18:14
  • @Solarflare thanks for this you are indeed correct after exporting from the same setup all the diffs went away ;) – user1503606 Mar 07 '18 at 22:52
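The point made in the comments can be checked directly in MySQL: `\'` and `''` are two spellings of the same single quote inside a string literal. The following is a minimal sketch using the two titles from the question; it assumes the default SQL mode, since with `NO_BACKSLASH_ESCAPES` enabled the backslash is treated as a literal character and the backslash forms no longer parse this way.

```sql
-- Both escape styles should decode to the same stored string,
-- so each comparison should evaluate to 1 (true).
SELECT
  'Women Dive\'s Day 2017' = 'Women Dive''s Day 2017' AS first_title_same,
  -- \\ decodes to one backslash; \' and '' both decode to one quote.
  'Studios\\\' Black Panther' = 'Studios\\'' Black Panther' AS second_title_same;
```

If both columns come back as 1, the diffs between the two dump files reflect only how the dump tool wrote the quotes, not a change in the stored data.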
