I'm trying to insert a surrogate pair ('𤭢', \uD852\uDF62, the same as U+24B62 from this example) into MySQL.
An INSERT with an unescaped literal, as suggested by this answer:
INSERT INTO unicode_test (value) VALUES ('𤭢');
-- or
INSERT INTO unicode_test (value) VALUES (_utf8'𤭢');
fails with
Error Code: 1366. Incorrect string value: '\xF0\xA4\xAD\xA2' for column 'value' at row 1
(note that \xF0\xA4\xAD\xA2 isn't even close to the original value of \uD852\uDF62).
On the other hand, both
INSERT INTO unicode_test (value) VALUES (_utf16'𤭢');
and
INSERT INTO unicode_test (value) VALUES (_utf8mb4'𤭢');
succeed, but the inserted values are different from the original one.
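To make "different from the original" concrete, the stored bytes can be inspected directly. This is only a sketch of how one might check: it assumes the unicode_test table and value column from the statements above, and the column aliases are made up for illustration.

```sql
-- Show each stored value next to its raw bytes and its lengths.
-- HEX() returns the bytes as stored in the column's character set;
-- CHAR_LENGTH() counts characters, LENGTH() counts bytes.
SELECT value,
       HEX(value)         AS stored_bytes,
       CHAR_LENGTH(value) AS char_count,
       LENGTH(value)      AS byte_count
FROM unicode_test;
```

Comparing stored_bytes across the rows inserted with the different introducers should show exactly where the values diverge.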
My database uses the utf8mb4 character set, so I assume it should handle surrogates transparently.
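Since the database default does not necessarily apply to every column or to the client connection, the character sets actually in effect can be checked with something like the following. Again, this is a sketch that assumes the unicode_test table from above.

```sql
-- Character sets used by the client, the connection, and result sets
-- in the current session, plus the server and database defaults.
SHOW VARIABLES LIKE 'character_set_%';

-- Full table definition, including any column-level or table-level
-- CHARACTER SET / COLLATE clauses that override the database default.
SHOW CREATE TABLE unicode_test;
```

If either the connection or the column reports utf8 (an alias for the 3-byte utf8mb3) rather than utf8mb4, that would be one place to look.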
What is the recommended way of inserting non-BMP characters into MySQL?