The data is the names of country subdivisions. Some have been stored as utf8 and some are not. example, this is how they are in my table:
statename
Bocas del Toro
ChiriquÃ
Coclé
Colón
Darién
Veraguas
Panamá Oeste
Emberá
Kuna Yala
Ngöbe-Buglé
This question/answer gets me really close to a solution: How to fix double-encoded UTF8 characters (in an utf-8 table)
If I use:
CONVERT(CAST(CONVERT(statename USING latin1) AS BINARY) USING utf8)
:
statename
Bocas del Toro
Chiriquí
Coclé
Col
Dari
Veraguas
Panam
Emberá
Kuna Yala
Ng
the characters stored as "é" for example, just end the string.
the variation provided in that answer ,
SELECT CASE
WHEN CONVERT( CAST( CONVERT( statename USING latin1 ) AS BINARY ) USING utf8 ) IS NULL
THEN statename
ELSE CONVERT( CAST( CONVERT( statename USING latin1 ) AS BINARY ) USING utf8 )
END
FROM
returned the same result, though I am not even sure i implemented it correctly in this select.
I am not permitted to normalize this data in this case, so I would like to select it and get
Bocas del Toro
Chiriquí
Coclé
Colón
Darién
Veraguas
Panamá Oeste
Emberá
Kuna Yala
Ngöbe-Buglé
Will this be possible?