Recently, I came across some weird behaviour while testing a backend system that uses MySQL.
When I printed the length of a string in my frontend, I got a different value than MySQL reported.
For the test, I'm using the string 0123456789 and the emojis 😀 and 🤦🏼‍♀️ to check the length of the string.
JavaScript (in the browser console):
> "♀️".length
<- 7
> "".length
<- 2
> "0123456789".length
<- 10
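While digging into this, I noticed that spreading a string in JS iterates by Unicode code points rather than by UTF-16 code units, and the code point counts match MySQL's CHAR_LENGTH. A quick sketch I ran in the console:
// .length counts UTF-16 code units; [...s] splits by Unicode code points.
const samples = ["🤦🏼‍♀️", "😀", "0123456789"];
for (const s of samples) {
  console.log(s, "UTF-16 units:", s.length, "code points:", [...s].length);
}
// 🤦🏼‍♀️ -> 7 units, 5 code points (5 matches CHAR_LENGTH below)
// 😀 -> 2 units, 1 code point
// 0123456789 -> 10 units, 10 code points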
MySQL (using MySQL Workbench):
CREATE TABLE testing_char_length
(
id INT auto_increment PRIMARY KEY,
txt VARCHAR(10)
)
DEFAULT charset utf8mb4;
INSERT INTO testing_char_length
(txt)
VALUES ('0123456789'),
('😀'),
('🤦🏼‍♀️');
SELECT *,
Char_length(txt) NumberOfChars,
Length(txt) SizeInBytes
FROM testing_char_length;
DROP TABLE testing_char_length;
Output:
| id | txt        | NumberOfChars | SizeInBytes |
|----|------------|---------------|-------------|
| 1  | 0123456789 | 10            | 10          |
| 2  | 😀         | 1             | 4           |
| 3  | 🤦🏼‍♀️       | 5             | 17          |
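To check whether SizeInBytes is simply the UTF-8 byte count, I compared it against TextEncoder in the browser console; TextEncoder always encodes to UTF-8, which is what a utf8mb4 column stores:
const encoder = new TextEncoder();
for (const s of ["0123456789", "😀", "🤦🏼‍♀️"]) {
  // encode() returns a Uint8Array holding the UTF-8 bytes
  console.log(s, "UTF-8 bytes:", encoder.encode(s).length);
}
// 0123456789 -> 10, 😀 -> 4, 🤦🏼‍♀️ -> 17, matching LENGTH() above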
Initially I thought it might be due to some encoding issue, so I searched for the encodings used by JavaScript and MySQL.
I also tried setting the MySQL charset to utf16 to be sure the charset is not the culprit; even with DEFAULT CHARSET utf16, the character counts are the same as with utf8mb4.
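As far as I can tell, the number of code points cannot depend on the charset; only the byte size can. A small sketch to convince myself (utf16Bytes is just a helper I wrote, relying on JS strings being sequences of 2-byte UTF-16 code units):
function utf16Bytes(s) {
  return s.length * 2; // every UTF-16 code unit occupies 2 bytes
}
const s = "🤦🏼‍♀️";
console.log("code points:", [...s].length);                      // 5
console.log("UTF-8 bytes:", new TextEncoder().encode(s).length); // 17
console.log("UTF-16 bytes:", utf16Bytes(s));                     // 14
So if my reasoning is right, CHAR_LENGTH should stay at 5 under both charsets, while LENGTH for this string would become 14 under utf16.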
Why is there a difference in the number of characters between JavaScript and MySQL?
I'm sure there is something I haven't come across yet, and I'm hoping for an explanation of this behaviour.
Tested on:
Google Chrome Version 81.0.4044.122 (Official Build) (64-bit)
MySQL Workbench: 8.0.19
MySQL: 8.0.19
OS: macOS Catalina version 10.15.3
Update 1:
The HEX codes of the string "0123456789"
are the same in both JS and MySQL. The other two strings are not the same.
JS Fiddle for the hex values. The code to calculate the hex is taken from here.
MySQL Fiddle for the hex values (taken from the comment below).
Even though the encoding is the same, why are the HEX values different? Isn't UTF encoding supposed to be universal and the same everywhere?
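To make this concrete, I dumped the single code point of 😀 next to its UTF-8 bytes and its UTF-16 code units; my current guess is that the two fiddles are hex-dumping two different serializations of the same code points:
const s = "😀";
// one Unicode code point...
console.log([...s].map(c => c.codePointAt(0).toString(16)));  // ["1f600"]
// ...stored as four UTF-8 bytes by utf8mb4...
console.log([...new TextEncoder().encode(s)]
  .map(b => b.toString(16).padStart(2, "0")).join(" "));      // f0 9f 98 80
// ...but held as two UTF-16 code units in a JS string.
const units = [];
for (let i = 0; i < s.length; i++) units.push(s.charCodeAt(i).toString(16));
console.log(units.join(" "));                                 // d83d de00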