Do I still need to run a full Latin1-to-UTF-8 conversion on text that looks completely fine?
I'm switching forum software. The old forum database used Latin1 encoding; the new forum database uses UTF-8 for its tables.
It looks like the importer script did a straight copy from one table to another without trying to fix any encoding issues.
I've been manually fixing the visible errors using a find-and-replace based on the conversion info listed here: http://www.i18nqa.com/debug/utf8-debug.html
The rest of the text looks fine and is completely readable.
My limited understanding is that UTF-8 is backwards compatible with ASCII and Latin1 is mostly ASCII, so it's only the edge cases that are different and need to be updated.
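A minimal sketch of that understanding in Python (the sample strings are made up for illustration): ASCII-only text produces identical bytes in both encodings, while an accented character is one byte in Latin1 and two bytes in UTF-8, which is where the garbled sequences come from.

```python
# ASCII-only text encodes to exactly the same bytes in both encodings.
plain = "Hello, forum!"
same_bytes = plain.encode("latin-1") == plain.encode("utf-8")

# A non-ASCII character is one byte in Latin1 but two bytes in UTF-8.
accented = "café"
latin1_bytes = accented.encode("latin-1")  # b'caf\xe9'
utf8_bytes = accented.encode("utf-8")      # b'caf\xc3\xa9'

# UTF-8 bytes misread as Latin1 show up as the familiar two-character garbage.
mojibake = utf8_bytes.decode("latin-1")    # 'cafÃ©'
```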
So do I still need to run a full Latin1-to-UTF-8 conversion on the text that looks completely fine?
I'd rather not, because I've changed some of the BBCode tags in a number of fields after they were stored as UTF-8. I'm concerned that those updates have put UTF-8 characters in the middle of the Latin1 characters, and that a full conversion on mixed character sets would just muck things up further.
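To make the alternative concrete, here is a rough sketch of a per-field check instead of a blanket conversion. It assumes the visible errors are UTF-8 bytes that were misread as Windows-1252 (the pattern the linked table describes), and `repair_if_mojibake` is a hypothetical helper name, not part of any forum software. A field is changed only if it round-trips cleanly; anything else, including fields that mix garbled and correct characters, is left untouched.

```python
def repair_if_mojibake(text: str) -> str:
    """Undo 'UTF-8 read as Windows-1252' garbling, or return text unchanged.

    Assumption: garbled text came from UTF-8 bytes decoded as cp1252.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not cleanly reversible: already correct, or mixed content.
        return text


print(repair_if_mojibake("cafÃ©"))           # garbled field gets repaired
print(repair_if_mojibake("plain text"))      # ASCII passes through unchanged
print(repair_if_mojibake("café"))            # already-correct field is left alone
print(repair_if_mojibake("cafÃ© and café"))  # mixed field is left alone
```

Mixed fields would still need manual attention, but at least a check like this should not make the already-correct text worse.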