8

I have a WordPress website.

I've created a simple page template like this:

<?php 
 /**
 * Template Name: Test
 */

 echo strlen('Привет');

 ?>

Then I created a page using this template. The page shows the length of the Russian string 'Привет' (which means 'Hello'). I expect to see 12, since a UTF-8 encoded Russian string of 6 characters should be 12 bytes long, but I get 6 instead.

I tested the same thing on another server and got the correct value, 12. So I think the cause is my server configuration. I have WP 3.2.1 (I had the same problem after upgrading to WP 3.5.1) and PHP 5.3.3.

I've spent about 5 days trying to find a solution, with no luck. Does anyone know the reason for this behavior?

hakre
Vasiliy

5 Answers

7

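Check the mbstring.func_overload setting in php.ini. When enabled, this option makes PHP replace strlen() with mb_strlen() (and likewise for the other string functions that have mbstring equivalents), so strlen() returns the number of characters rather than the number of bytes. This could explain the discrepancy between your servers.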

EDIT

Quoting from the PHP documentation:

To use function overloading, set mbstring.func_overload in php.ini to a positive value that represents a combination of bitmasks specifying the categories of functions to be overloaded. It should be set to 1 to overload the mail() function. 2 for string functions, 4 for regular expression functions. For example, if it is set to 7, mail, strings and regular expression functions will be overloaded.

So a value of 2 means that the basic string functions are overloaded with their mbstring equivalents, but not mail or the regular expression functions. More generally, any value with the 2 bit set (2, 3, 6 or 7) overloads strlen(). If you want the normal byte-counting behaviour, this should be 0.
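To check which categories are overloaded on a given server, the bitmask described in the quote can be decoded at runtime. This is only a minimal sketch: describe_func_overload() is a hypothetical helper name, not part of PHP, and the code is written to run on PHP 5.3. Note that ini_get() returns false when the setting does not exist, which is the case when mbstring is not loaded and on PHP 8.0 and later, where function overloading was removed.

```php
<?php
// Hypothetical helper (not a PHP built-in): decode an mbstring.func_overload
// value into the categories named in the quoted documentation.
function describe_func_overload($value)
{
    // Bit values as given in the quote: 1 = mail, 2 = string, 4 = regex.
    $categories = array(1 => 'mail', 2 => 'string', 4 => 'regex');
    $active = array();
    foreach ($categories as $bit => $name) {
        if ((((int) $value) & $bit) !== 0) {
            $active[] = $name;
        }
    }
    return $active;
}

// ini_get() returns false if the setting is unknown to this PHP build.
$setting = ini_get('mbstring.func_overload');
if ($setting === false) {
    echo "mbstring.func_overload is not available on this PHP build\n";
} else {
    $active = describe_func_overload($setting);
    $summary = $active ? implode(', ', $active) : 'nothing';
    echo 'mbstring.func_overload = ' . (int) $setting
        . ' (' . $summary . " overloaded)\n";
}
```

If the output lists "string", strlen() is being replaced by mb_strlen() on that server, which would match the result of 6 in the question.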

Mark Baker
3

Have you tried: http://lt.php.net/manual/en/function.mb-strlen.php ?

int mb_strlen ( string $str [, string $encoding ] )
Gets the length of a string.
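The $encoding argument is what decides whether mb_strlen() counts characters or bytes. The sketch below assumes the text is UTF-8; byte_length() and char_length() are hypothetical helper names introduced only for illustration. The string is written as explicit byte escapes so the result does not depend on how the PHP file itself is saved. Passing '8bit' as the encoding counts raw bytes, and that holds even when strlen() has been overloaded.

```php
<?php
// "Привет" as explicit UTF-8 bytes (two bytes per Cyrillic letter).
$hello = "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";

// Hypothetical helper: byte count that does not depend on
// mbstring.func_overload. Falls back to strlen() without mbstring,
// in which case strlen() cannot have been overloaded.
function byte_length($s)
{
    return function_exists('mb_strlen') ? mb_strlen($s, '8bit') : strlen($s);
}

// Hypothetical helper: character count, assuming $s is valid UTF-8.
function char_length($s)
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($s, 'UTF-8');
    }
    // Fallback without mbstring: count code points with a UTF-8 regex.
    return preg_match_all('/./su', $s, $matches);
}

echo byte_length($hello), "\n"; // 12
echo char_length($hello), "\n"; // 6
```

Code that needs the byte length, as in the question, can call a helper like byte_length() instead of strlen() and get 12 regardless of the server's overload setting.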
ka_lin
0

Do you need to use the multi-byte string functions for this, such as http://www.php.net/manual/en/function.mb-strlen.php?

gratz
  • It seems that the OP *wants* the number of bytes (not the number of characters), which is what [`strlen`](http://php.net/manual/en/function.strlen.php) is indeed supposed to return. – Jeremy Roman Mar 11 '13 at 15:31
  • The problem is that I do not use mb_strlen. I know that mb_strlen would show 6 in my case, but I don't know why strlen does that. – Vasiliy Mar 11 '13 at 15:32
0

See http://php.net/manual/en/function.mb-strlen.php for more info about getting string length in multi-byte characters.

Bud Damyanov
0

My file was saved with the "UCS-2 BE BOM" encoding (this can be seen in Notepad++ under the Encoding menu).

I then used mb_strlen($line, "UCS-2"), but for some reason I was getting an incorrect string length (e.g. mb_strlen("somestr", "UCS-2") returned 6, where I was expecting 7).

After changing the file's encoding to "UTF-8", I was able to get the correct string length.

I am not sure why I was getting incorrect string length with the other encoding type, but wanted to share what worked for me.
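One possible way to avoid this kind of mismatch is to make sure the encoding passed to mb_strlen() matches the actual bytes, or to convert the text to UTF-8 first. The following is only a sketch under assumptions: it requires the mbstring extension, length_in_chars() is a hypothetical helper name, and the sample is built in memory as big-endian UCS-2 without a BOM, so it does not reproduce the exact file from this answer.

```php
<?php
// Hypothetical helper: convert from a known source encoding to UTF-8,
// then count characters.
function length_in_chars($text, $sourceEncoding)
{
    $utf8 = mb_convert_encoding($text, 'UTF-8', $sourceEncoding);
    return mb_strlen($utf8, 'UTF-8');
}

if (extension_loaded('mbstring')) {
    // Build a UCS-2 big-endian sample (no BOM) from an ASCII string.
    $ucs2 = mb_convert_encoding('somestr', 'UCS-2BE', 'UTF-8');

    echo mb_strlen($ucs2, '8bit'), "\n";       // 14 (two bytes per character)
    echo mb_strlen($ucs2, 'UCS-2BE'), "\n";    // 7
    echo length_in_chars($ucs2, 'UCS-2BE'), "\n"; // 7
}
```

The count is only meaningful when the declared encoding and the real byte layout agree, which is consistent with the observation that re-saving the file as UTF-8 gave the expected length.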