I am trying to get the length of this Unicode string:
$text = 'نام سلطان م';
$length = strlen($text);
echo $length;
Output:
20
How does it determine the length of a Unicode string?
strlen() does not handle multibyte characters correctly, because it assumes that 1 character equals 1 byte, which does not hold for Unicode encodings such as UTF-8. This behavior is clearly documented:
strlen() returns the number of bytes rather than the number of characters in a string.
The solution is to use the mb_strlen() function instead (mb stands for "multibyte"); see the mb_strlen() docs.
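As a minimal sketch of the difference, the snippet below runs both functions on the string from the question. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded; the variable names `$bytes` and `$chars` are illustrative only.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is enabled.
$text = 'نام سلطان م';

// strlen() counts bytes: each Arabic letter here takes 2 bytes in UTF-8,
// and each space takes 1 byte.
$bytes = strlen($text);

// mb_strlen() counts characters (code points) in the given encoding.
// Passing 'UTF-8' explicitly avoids depending on the configured default.
$chars = mb_strlen($text, 'UTF-8');

echo $bytes, "\n"; // 20 (9 letters * 2 bytes + 2 spaces)
echo $chars, "\n"; // 11 (9 letters + 2 spaces)
```

Passing the encoding explicitly is a design choice rather than a requirement: without it, mb_strlen() falls back to the internal encoding, which may differ between servers.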
The strlen() function does not count the number of characters, but the number of bytes. For strings containing multibyte characters it will therefore return a higher number than the character count.
Use mb_strlen() instead to get the actual number of characters.
Just as an addendum to the other answers that reference mb_strlen():
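If the php.ini setting mbstring.func_overload has its second bit (value 2) set, then strlen() is overloaded by mb_strlen() and counts characters based on the default charset; otherwise it counts the number of bytes in the string. Note that mbstring.func_overload was deprecated in PHP 7.2 and removed in PHP 8.0, so on current versions strlen() always counts bytes.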
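If code has to behave the same whether or not overloading is active, one possible approach is to avoid strlen() altogether and name the unit being counted. The helpers below are a sketch under that assumption; `byte_length` and `char_length` are hypothetical names, not part of PHP, and the mbstring extension is assumed to be loaded.

```php
<?php
// Hypothetical helpers (not part of PHP); require the mbstring extension.

// Byte count that does not depend on mbstring.func_overload:
// the '8bit' encoding treats every byte as one unit.
function byte_length(string $s): int
{
    return mb_strlen($s, '8bit');
}

// Character (code point) count in an explicitly named encoding.
function char_length(string $s, string $encoding = 'UTF-8'): int
{
    return mb_strlen($s, $encoding);
}

// ini_get() returns false when the setting does not exist (PHP 8.0+),
// which the cast turns into 0, i.e. "no overloading".
$overload = (int) ini_get('mbstring.func_overload');
$stringFunctionsOverloaded = ($overload & 2) !== 0;

echo $stringFunctionsOverloaded
    ? "strlen() is overloaded and counts characters\n"
    : "strlen() counts bytes\n";
```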