
I am writing strings to our MySQL 5.6 server and am getting all sorts of weird behavior. Things start off working fine, but eventually I get encoding errors and SQLException errors.

The table and session encoding are all set to utf8mb4, and I can copy and paste emojis just fine in Sequel Pro. However, unless I add

useUnicode=True&characterEncoding=utf8

it converts all the emojis to ???. But when I add those parameters to the JDBC URL, it turns all my emojis into symbols like ¥, which leads me to believe it’s an encoding issue.

for example Like these: ðŸ…🥬ðŸ¥'ðŸ¥'

This used to work! Then one day it just started writing those weird symbols instead of emojis. So I had to start messing with session variables.

Weirdly, if I set the session variables to utf8 in the JDBC URL, it works properly and writes emojis to the database as expected:

&sessionVariables=character_set_client=utf8,character_set_connection=utf8,character_set_results=utf8

Which doesn’t make much sense but it’s all I could get to work.

But eventually, after an hour or two of writing, I get a string exception like

java.sql.SQLException: Incorrect string value: '\xF0\x9F\x91\xBD\xF0\x9F...'
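For reference, here is a minimal Python sketch that decodes the first four bytes quoted in that exception. It assumes they are plain UTF-8, and it relies on the fact that MySQL's `utf8` is an alias for `utf8mb3`, which stores at most three bytes per character. The helper `fits_in_utf8mb3` is a hypothetical name for illustration, not part of any library.

```python
# First emoji from the bytes quoted in the SQLException message.
raw = b"\xF0\x9F\x91\xBD"

# Assumption: the bytes are UTF-8. If so, they decode to one code point.
text = raw.decode("utf-8")
code_point = hex(ord(text))   # U+1F47D, the alien emoji
byte_length = len(raw)        # 4 bytes for a single character


def fits_in_utf8mb3(s: str) -> bool:
    """Hypothetical helper: True if every character needs at most 3 UTF-8 bytes."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in s)


print(text, code_point, byte_length)
print(fits_in_utf8mb3("plain text"))  # characters of 1 to 3 bytes pass
print(fits_in_utf8mb3(text))          # a 4-byte emoji does not
```

If this holds, the rejected value is a valid four-byte character, so a column or connection that is effectively `utf8` (three-byte) rather than `utf8mb4` is one plausible cause of the "Incorrect string value" error.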

I have verified that the strings are correct on the Kotlin side by inserting debug statements. It’s an issue between JDBC and the database.

Anybody run into this before?

All of it works fine in a previously written piece of PHP code, but that turns the emojis into Unicode escape sequences when it writes to the DB. (Which I could never get to work from Kotlin.)

I can’t figure out why it would work and then all of a sudden break. Can the session variables change if there is more than one connection open at once?

Thanks.

HodorTheCoder
  • You face a [mojibake](https://en.wikipedia.org/wiki/Mojibake) case (*example in Python for its universal intelligibility*): `'👽 🥬'.encode('utf-8')` returns `b'\xf0\x9f\x91\xbd \xf0\x9f\xa5\xac'` and `'👽 🥬'.encode('utf-8').decode('cp1252')` -> `'ðŸ‘½ ðŸ¥¬'`. – JosefZ Jul 29 '22 at 18:45
  • Interesting. What would be converting the encoding like that in the jdbcurl when I’m explicitly setting the encoding as utf8? I wonder if there is some global variable in the DB setting it wrong. Or if I should just convert them all to Unicode escapes. – HodorTheCoder Jul 30 '22 at 04:56
  • See Mojibake in https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored – Rick James Jul 30 '22 at 06:09
  • You must use utf8mb4 for Emoji and some Chinese characters. – Rick James Jul 30 '22 at 06:10
  • Perhaps a dash is missing? `useUnicode=True&characterEncoding=utf8` --> `useUnicode=True&characterEncoding=utf-8` – Rick James Aug 01 '22 at 17:42
  • @HodorTheCoder - Please see if the missing dash is important cf https://dev.mysql.com/doc/connector-j/8.0/en/connector-j-reference-charsets.html – Rick James Aug 01 '22 at 17:49
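
The mojibake described in the comments can be reproduced and reversed with a short Python sketch. It assumes the garbling comes from UTF-8 bytes being read as cp1252 (the encoding MySQL calls `latin1`). Where in the JDBC or server stack that misreading happens is not established here.

```python
# Two emoji written as escapes so the source file's encoding does not matter.
original = "\U0001F47D \U0001F96C"  # alien, leafy green

# Step 1: the client encodes the text as UTF-8.
utf8_bytes = original.encode("utf-8")

# Step 2 (assumed fault): something reads those bytes as cp1252.
garbled = utf8_bytes.decode("cp1252")

# Reversing the misreading recovers the original text, as long as
# no bytes were lost or replaced along the way.
repaired = garbled.encode("cp1252").decode("utf-8")

print(utf8_bytes)
print(garbled)
print(repaired == original)
```

If stored rows come back intact through this round trip, the data was garbled by a charset mismatch on the connection and was not truncated. Rows that already contain `?` have lost information and cannot be recovered this way.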
