  • I make a request that returns UTF-8 encoded JSON (a dict with tuples).
  • I'd like to insert the JSON response into a MySQL database.
  • I encode the string with string.encode('utf-8'), but then the database, which also uses UTF-8, does not store the correct value.

Without string.encode('utf-8') I get a UnicodeEncodeError, so it seems necessary to encode the string:

UnicodeEncodeError: 'ascii' codec can't encode character '\xfc' in position 147: ordinal not in range(128)

Here is an example:

  • string in json: 'quälend'
  • string after encode: 'qu\xC3\xA4lend'
  • inserted in database: 'quÃ¤lend'

My guess is that the database still treats the encoded UTF-8 string as latin-1. But when I insert a sample UTF-8 string that is not encoded, it works. So the database can handle UTF-8 correctly, just not the encoded string.

Do you have an idea what the problem might be? I've been losing my mind over this issue for hours. Any hint is highly appreciated.

Martijn Pieters
endo.anaconda

1 Answer


Do not use any encode/decode functions; that only makes things worse.
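As a rough sketch of what "no encode/decode" can look like in Python 3: decode the response bytes once when parsing the JSON, then hand plain str values to the driver as query parameters and let a connection opened with a UTF-8 charset do the encoding. The `words` table, the `"words"` JSON key, and both helper names are made up for illustration; `conn` is assumed to be any DB-API connection, for example one created with PyMySQL's `pymysql.connect(..., charset="utf8mb4")`.

```python
import json


def extract_words(payload: bytes) -> list:
    """Parse a UTF-8 JSON response body into a list of str values.

    The "words" key is a hypothetical example of the response layout.
    """
    # Decode exactly once, at the boundary where bytes enter the program.
    data = json.loads(payload.decode("utf-8"))
    return list(data["words"])


def insert_words(conn, words):
    """Insert str values through a DB-API connection, with no manual encoding.

    Assumes conn was opened with a UTF-8 charset (e.g. charset="utf8mb4")
    and that a hypothetical table `words(word)` exists.
    """
    cur = conn.cursor()
    try:
        # Pass str objects as parameters; the driver encodes them itself.
        cur.executemany(
            "INSERT INTO words (word) VALUES (%s)",
            [(w,) for w in words],
        )
        conn.commit()
    finally:
        cur.close()
```

The point of the sketch is that the values stay str from the JSON parser to the driver; the only charset setting involved is the one on the connection.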

ä is correctly C3A4 in UTF-8 (MySQL's CHARACTER SET utf8 or utf8mb4).

Ã¤ is "Mojibake". See http://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored for what causes it and what you need to do to avoid it.
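The Mojibake mechanism can be reproduced in plain Python, without a database: UTF-8 bytes that get interpreted as latin-1 somewhere along the way turn each byte into its own character. This is only an illustration of the effect; where exactly the misinterpretation happens in a given setup is not shown here.

```python
original = "quälend"

# What a UTF-8 client sends: ä becomes the two bytes C3 A4.
utf8_bytes = original.encode("utf-8")

# If those bytes are read as latin-1, each byte becomes one character,
# so C3 A4 shows up as the two characters Ã and ¤.
mojibake = utf8_bytes.decode("latin-1")

# Reversing the wrong step restores the text, as long as no byte was lost.
repaired = mojibake.encode("latin-1").decode("utf-8")

print(utf8_bytes)
print(mojibake)
print(repaired)
```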

It is unclear where "ascii" and "latin-1" got involved; avoid them. FC is latin1 for ü.
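For reference, a small sketch of how the three codecs treat ü. The '\xfc' in the traceback is the code point U+00FC, which happens to match the single latin-1 byte FC; the error itself appears whenever a str containing it is pushed through the ascii codec. Where that implicit ascii encoding happens in the asker's setup cannot be told from the post.

```python
char = "ü"

latin1_bytes = char.encode("latin-1")  # one byte: FC
utf8_encoded = char.encode("utf-8")    # two bytes: C3 BC

# The ascii codec only covers code points 0..127, so this raises.
try:
    char.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = str(exc)

print(latin1_bytes.hex(), utf8_encoded.hex())
print(ascii_error)
```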

More python notes: http://mysql.rjweb.org/doc.php/charcoll#python

Rick James