
Hello, I have text like this:

$text = "به همين مناسبت روز يکشنبه 99/01/10 ساعت 14 الي 16 در مسجد امام صادق(ع) برگزار ميگردد.";

When I split this text by length with str_split, the returned chunks contain some unknown characters. See the result:

Array ( [0] => Array ( [0] => به همين مناسبت رو� [1] => � يکشنبه 99/01/10 ساعت ) [1] => Array ( [0] => 14 الي 16 در مسجد ام [1] => ام صادق(ع) برگزار � ) [2] => Array ( [0] => �يگردد. ) )

What can I do to get a result like this instead:

Array ( [0] => Array ( [0] => به همين مناسبت روز [1] =>  يکشنبه 99/01/10 ساعت ) [1] => Array ( [0] => 14 الي 16 در مسجد ام [1] => ام صادق(ع) برگزار  ) [2] => Array ( [0] => ميگردد. ) )

Here is the code that produces the broken output:

$text = "به همين مناسبت روز يکشنبه 99/01/10 ساعت 14 الي 16 در مسجد امام صادق(ع) برگزار ميگردد.";
$charInLine = 34; // maximum length of each line
$lines = str_split($text, $charInLine);
print_r(array_chunk($lines, 2));
FarHooD
  • Perhaps this question and answer will give you some insight (it is about C#, but the idea of surrogate pairs in character encoding is the same): https://stackoverflow.com/questions/50182335/what-is-a-unicode-safe-replica-of-string-indexofstring-input-that-can-handle-s – Ibrennan208 Sep 13 '22 at 19:57
  • Using that information, you would be trying to split the text on a string rather than a specific character – Ibrennan208 Sep 13 '22 at 19:58
  • Or the index of the "string of characters" that represent one character in unicode. – Ibrennan208 Sep 13 '22 at 19:59
  • PHP's [str_split()](https://www.php.net/str_split) function basically treats strings as arrays of bytes. This is fine for ASCII text, but for anything else a single character is liable to consist of more than one byte. You should instead use the [mb_str_split()](https://www.php.net/mb_str_split) function. – r3mainer Sep 13 '22 at 20:00
  • @r3mainer, please add this as an answer. – FarHooD Sep 20 '22 at 21:49
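
To illustrate the point made in the comments about bytes versus characters, here is a minimal sketch. It assumes the file is saved as UTF-8 and that the mbstring extension is enabled; the variable names `$word` and `$bytes` are only for illustration.

```php
<?php
// Assumption: source file is UTF-8 and the mbstring extension is available.
$word = "روز"; // three Persian letters, each encoded as two bytes in UTF-8

echo strlen($word), "\n";              // byte count: 6
echo mb_strlen($word, 'UTF-8'), "\n";  // character count: 3

// Cutting after 3 bytes lands in the middle of the second letter,
// so the first chunk is no longer valid UTF-8 and prints as "�".
$bytes = str_split($word, 3);
var_dump(mb_check_encoding($bytes[0], 'UTF-8')); // bool(false)
```

This is why a byte-based split at position 34 can leave half of a letter at the end of one chunk and the other half at the start of the next.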

2 Answers


You can split Persian or Arabic text into individual characters with this code:

$chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
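
The line above yields one array element per character, not per line. A possible way to get back to fixed-length lines is to regroup the characters with array_chunk and implode, as in the sketch below. It reuses `$text` and `$charInLine` from the question; note that `$charInLine` then counts characters rather than bytes, so the chunk boundaries will not match the byte-based output shown in the question.

```php
<?php
$text = "به همين مناسبت روز يکشنبه 99/01/10 ساعت 14 الي 16 در مسجد امام صادق(ع) برگزار ميگردد.";
$charInLine = 34; // now a number of characters, not bytes

// Split into single UTF-8 characters (the /u modifier makes PCRE Unicode-aware).
$chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

// Group the characters into runs of $charInLine and join each run into a string.
$lines = array_map(
    function (array $chunk) {
        return implode('', $chunk);
    },
    array_chunk($chars, $charInLine)
);

print_r(array_chunk($lines, 2));
```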
rezalaal

You need to use the functions in the multibyte string (mbstring) extension, which work on characters instead of bytes.
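
As a minimal sketch of that approach, the question's code can be adapted by swapping str_split for mb_str_split. This assumes PHP 7.4 or later (where mb_str_split was introduced), the mbstring extension enabled, and UTF-8 input; `$charInLine` is then interpreted as a character count.

```php
<?php
// Assumption: PHP >= 7.4 with the mbstring extension, text encoded as UTF-8.
$text = "به همين مناسبت روز يکشنبه 99/01/10 ساعت 14 الي 16 در مسجد امام صادق(ع) برگزار ميگردد.";
$charInLine = 34; // characters per line

// mb_str_split cuts on character boundaries, so no chunk ends mid-letter.
$lines = mb_str_split($text, $charInLine, 'UTF-8');

print_r(array_chunk($lines, 2));
```

Passing the encoding explicitly avoids depending on the configured default internal encoding.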