Preamble:
I found out that JavaScript and PHP take different approaches to the codes of multibyte UTF-8 characters:
- PHP treats a multibyte character as several separate bytes; JS treats it as a single integer (which may be larger than 255).
- PHP keeps all the auxiliary bits of the UTF-8 encoding in those byte values; JS strips all those bits.
So the code of the Russian letter 'А' will be:
208 and 144 in PHP
1040 in JS
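To make the relationship between the two sets of numbers concrete, here is a minimal sketch of how 1040 maps onto 208 and 144. It assumes the standard two-byte UTF-8 layout (110xxxxx 10xxxxxx), which only covers code points from 0x80 to 0x7FF; the variable names are just for illustration.

```javascript
// "\u0410" is the Cyrillic letter 'А' (not the Latin 'A').
const code = "\u0410".charCodeAt(0); // 1040

// Two-byte UTF-8 layout: 110xxxxx 10xxxxxx.
// The "auxiliary bits" are the 110 and 10 prefixes.
const lead = 0xC0 | (code >> 6);    // prefix 110 + upper 5 payload bits
const trail = 0x80 | (code & 0x3F); // prefix 10  + lower 6 payload bits

console.log(code, lead, trail); // 1040 208 144
```

So the JS value is the payload bits alone, and the PHP values are the same bits split across two bytes with the prefixes added.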
Problem description
I need to pass a string through some encoding routine in JS in the client's browser and then decode it in PHP on the server side. To encode and decode the strings I used the JS string method charCodeAt() and the PHP function chr(). As I mentioned above, this approach does not work because the codes are different in PHP and JS.
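A rough sketch of the mismatch, written entirely in JS for convenience. The helper phpChrByte is hypothetical: it only imitates the documented behaviour of PHP's chr(), which wraps values outside 0..255 modulo 256. TextEncoder is used here only to show which bytes PHP would expect for the same string.

```javascript
// Hypothetical helper: imitates how PHP's chr() reduces its argument to one byte.
function phpChrByte(code) {
  return ((code % 256) + 256) % 256;
}

// What the JS side produces for the Cyrillic letter 'А' (U+0410).
const jsCode = "\u0410".charCodeAt(0);    // 1040

// What chr() would turn that into on the PHP side: a single unrelated byte.
const byteSeenByPhp = phpChrByte(jsCode); // 16

// The UTF-8 bytes PHP actually stores for the same letter.
const utf8Bytes = Array.from(new TextEncoder().encode("\u0410")); // [208, 144]

console.log(jsCode, byteSeenByPhp, utf8Bytes);
```

The single value 1040 collapses to byte 16 instead of the pair 208, 144, so the round trip loses the character.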
Question
Is there any function in PHP to strip the auxiliary bits from a UTF-8 byte sequence, OR is there any function in JavaScript to add those auxiliary bits to the char codes?