I have an SQL table in which one column has the utf8_unicode_ci collation, while the table itself has the latin1_swedish_ci collation (as reported under Row Statistics in the Structure tab of phpMyAdmin).
The PHP webapp that accesses the database displays Japanese text correctly, but inside phpMyAdmin everything is mojibake. For example, the webapp (correctly) displays the Japanese text Xで有名な, but in phpMyAdmin it appears as Xã¦ã‚™æœ‰åãª (hex() output is 312E2058C3A3C281C2A6C3A3E2809AE284A2C3A6C593E280B0C3A5C290C28DC3A3C281C2AA).
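Since the mojibake contains characters that may not survive copy and paste, here is a minimal sketch that decodes the hex() output and lists the exact code points stored in the column. It assumes the stored bytes are UTF-8 (the column is utf8_unicode_ci); the names stored_hex and stored_text are only for illustration.

```python
import unicodedata

# hex() output copied from phpMyAdmin
stored_hex = (
    "312E2058C3A3C281C2A6C3A3E2809AE284A2C3A6C593"
    "E280B0C3A5C290C28DC3A3C281C2AA"
)

# Assumption: the column holds UTF-8 bytes, so decode them as such.
stored_text = bytes.fromhex(stored_hex).decode("utf-8")

# Print every character with its code point and Unicode name.
# Control characters have no name, so fall back to a placeholder.
for ch in stored_text:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch, '<control>')}")
```

Listing the code points this way makes it possible to compare the stored value character by character against whatever a reproduction attempt generates, including any invisible control characters.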
The app that was used to generate the data in the table is now broken, but I need to add a few new records. How can I recreate the mojibake found in the table?
I tried to reproduce the mojibake with Python:

    def rev_engineer(utf8):
        # Encode as UTF-8, then misread the resulting bytes as Latin-1
        mojibake = utf8.encode('utf8').decode('latin1')
        print(mojibake)

    rev_engineer('Xで有名な')
    # output: Xã¦ãæåãª
    # should be: Xã¦ã‚™æœ‰åãª

This is obviously very similar, but not quite there. I then tried looping through every encoding listed in Python's documentation and testing every encode/decode combination, but that did not produce a match either. Any idea what I'm missing?