
There are already plenty of posts about choosing the right charset for MySQL, but it is a different (and very frustrating) story with the rocksdb engine.

First, I decided to use utf8_bin (latin1, utf8_bin and binary are supported by myrocks) because my data may contain special characters and I want to be on the safe side.

I am using PHP and PDO to load data into MySQL, and the connection looks like this:

$pdo = new PDO('mysql:host=localhost;dbname=dbname;charset=utf8', 'user', 'password');

So I set the charset to utf8 (I also tried utf8_bin, but PDO does not accept it). Most rows insert fine, but sometimes I get errors like the following one:

Incorrect string value: '\xF0\x9F\x87\xA8\xF0\x9F...' for column 'column_name'

So what is wrong here? This hex sequence encodes a Unicode smiley (regional indicator symbol letter C + regional indicator symbol letter N). It looks like valid UTF-8 to me, and both MySQL and PHP are configured to use utf8.

NaN
  • you may need to update your post to include the full query and values being passed. You can also try using a prepared statement if you haven't already. The question's unclear. – Funk Forty Niner Jul 25 '17 at 14:41
  • The problem is that it only happens infrequently, so it is not easy to reproduce. But I think the error message is already quite good, because it tells me that I am not able to store Unicode smileys. I read somewhere that MySQL's utf8 might be the problem, because it does not handle 4-byte characters, and that utf8mb4 is better, but rocksdb does not support utf8mb4 :/ – NaN Jul 25 '17 at 14:52
  • have you seen https://stackoverflow.com/q/39463134/ --- https://stackoverflow.com/q/7814293/ --- https://stackoverflow.com/questions/279170/utf-8-all-the-way-through --- https://stackoverflow.com/q/35125933/ to name a few. – Funk Forty Niner Jul 25 '17 at 14:55
  • @Fred-ii- thanks for your effort, but as I already mentioned: utf8mb4 is not supported by rocksdb (at least I was not able to find it in the documentation), but it's the solution in all your posts... – NaN Jul 25 '17 at 15:03
  • You're welcome. I unfortunately don't know about rocksdb, so I won't be of any help there, sorry. I wish you well in finding the solution, *cheers* – Funk Forty Niner Jul 25 '17 at 15:04

1 Answer

You need utf8mb4, not MySQL's utf8, which is a subset limited to 3 bytes per character.

The character U+1F1E8 (regional indicator symbol letter C) needs a 4-byte UTF-8 encoding, hex F09F87A8.
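
The 4-byte claim can be checked from PHP without touching the database. The following is a minimal sketch, assuming PHP 7 or later (for the "\u{...}" escape); the helper name hasFourByteChars is hypothetical and not part of PDO or MySQL. It flags any code point above U+FFFF, which is exactly the range that MySQL's 3-byte utf8 cannot store.

```php
<?php
// Hypothetical helper (not from the original post): returns true if $text
// contains a code point above U+FFFF, i.e. one that UTF-8 encodes in 4 bytes.
function hasFourByteChars(string $text): bool
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

$letterC = "\u{1F1E8}"; // regional indicator symbol letter C

echo strtoupper(bin2hex($letterC)), "\n"; // F09F87A8
echo strlen($letterC), "\n";              // 4 (strlen counts bytes)

var_dump(hasFourByteChars($letterC));               // bool(true)
var_dump(hasFourByteChars("h\u{00E9}llo \u{20AC}")); // bool(false): 2- and 3-byte chars only
```

Running such a check on incoming values before the INSERT would at least make the infrequent failures reproducible.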

If rocksdb does not support it, then you have to abandon either such characters or rocksdb. Otherwise, change the charset in the PDO call (charset=utf8mb4), and on the columns that need it.
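
If you stay on rocksdb with utf8, "abandoning such characters" could look roughly like the sketch below. It assumes PHP 7 or later; the function name replaceFourByteChars and the choice of U+FFFD (the replacement character, which needs only 3 bytes) as a placeholder are illustrative assumptions, not something the database requires.

```php
<?php
// Hypothetical helper: replace every character above U+FFFF (4 bytes in UTF-8)
// so that the remaining string fits into MySQL's 3-byte utf8.
function replaceFourByteChars(string $text, string $replacement = "\u{FFFD}"): string
{
    $result = preg_replace('/[\x{10000}-\x{10FFFF}]/u', $replacement, $text);
    if ($result === null) {
        // With /u, preg_replace returns null when the subject is malformed UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $result;
}

$raw   = "Made in \u{1F1E8}\u{1F1F3}";  // text followed by two regional indicators
$clean = replaceFourByteChars($raw);     // both indicators become U+FFFD
$plain = replaceFourByteChars($raw, ''); // or drop them entirely

var_dump($clean === "Made in \u{FFFD}\u{FFFD}"); // bool(true)
var_dump($plain === "Made in ");                 // bool(true)
```

Pass the cleaned value to the prepared statement instead of the raw one. This loses information, so it is only a fallback for when utf8mb4 is really not available.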

Rick James