
Why is the result of ord() in PHP not the same as the result of charCodeAt() in JavaScript?

The result from PHP is 230 143 144.

The result from JavaScript is 25552.

How can I write PHP code that gives the same result as the JavaScript code?


JavaScript

<script>
var someString = "提";

for(var i=0;i<someString.length;i++) {
    var char = someString.charCodeAt(i);
    alert(char);
}
</script>

PHP

<?php
$s = '提';

for ( $i = 0; $i < strlen( $s ); $i++ ) {
    print ord( $s[ $i ] ) . "\n";
}
?>
Ultimater

1 Answer


Because there are like a hundred different ways to encode text in a computer. Additionally:

  • PHP does not really know what encoding your data is using; its strings are just sequences of bytes
  • JavaScript did not really implement Unicode correctly until recent versions (though this is not relevant in this particular case)

Your character (提) is listed in the Unicode catalogue as 'hold in hand; lift in hand' (U+63D0) and has, among many others, the following encodings:

  • As UTF-8: 0xE6 0x8F 0x90 (230 143 144 in decimal)
  • As UTF-16: 0x63D0 (25552 in decimal)

Your PHP file appears to be saved as UTF-8 (that's something you can check in your text editor), so 提 is encoded in three bytes. Since strlen() counts bytes and $s[$i] reads a single byte, your loop splits the single character into its individual bytes.

Your JavaScript function, however, returns the UTF-16 code unit at the given index, as documented.
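To see both results side by side, here is a minimal sketch in JavaScript. The helper names utf8Bytes and utf16Units are made up for illustration, and it assumes an environment where TextEncoder is available (a modern browser or a recent Node.js):

```javascript
// Sketch only: compare the UTF-8 bytes and the UTF-16 code units of a string.
// utf8Bytes and utf16Units are hypothetical helper names, not built-ins.
// Assumes TextEncoder exists (modern browsers, recent Node.js).

// UTF-8 view: one number per byte, like the PHP ord() loop on a UTF-8 file.
function utf8Bytes(str) {
    return Array.from(new TextEncoder().encode(str));
}

// UTF-16 view: one number per code unit, which is what charCodeAt() returns.
function utf16Units(str) {
    const units = [];
    for (let i = 0; i < str.length; i++) {
        units.push(str.charCodeAt(i));
    }
    return units;
}

console.log(utf8Bytes("提"));  // [ 230, 143, 144 ]
console.log(utf16Units("提")); // [ 25552 ]
```

Same character, two encodings: neither language is wrong, they are just looking at different representations.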

PHP provides a couple of builtin functions to convert between encodings:

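$as_utf8 = '提';
// Convert to big-endian UTF-16, then unpack every 16-bit unit ('n*'), as the JavaScript loop does.
var_dump( unpack('n*', mb_convert_encoding($as_utf8, 'UTF-16BE', 'UTF-8')) ); // [1 => 25552]
// Same conversion with iconv; note the different argument order.
var_dump( unpack('n*', iconv('UTF-8', 'UTF-16BE', $as_utf8)) );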
Álvaro González