
I have an array like this -

Array ( [0] => একটু [1] => আরো [2] => নয় [3] => কাছে [4] => থাকো [5] => না [6] => কাছে [7] => ডাকো [8] => না [9] => আরো [10] => কাছে [11] => আজ [12] => দুজনের [13] => প্রথম [14] => মিলনের [15] => চাঁদ [16] => তারা [17] => ঐ [18] => সাক্ষী [19] => আছে [20] => তুমি [21] => আমার [22] => আমি [23] => তোমার [24] => [25] => মন [26] => যে [27] => ভরেছে [28] => জোছনা [29] => ঝরেছে [30] => পৃথিবী [31] => দ্যাখো [32] => ঐ [33] => ঘুমিয়ে [34] => পড়েছে [35] => মন [36] => যে [37] => ভরেছে [38] => জোছনা [39] => ঝরেছে [40] => পৃথিবী [41] => দ্যাখো [42] => ঐ [43] => ঘুমিয়ে [44] => পড়েছে [45] => আমি [46] => তোমার [47] => তুমি [48] => আমার [49] => একটু [50] => আরো [51] => নয় [52] => কাছে [53] => থাকো [54] => না [55] => কাছে [56] => ডাকো [57] => না [58] => আরো [59] => কাছে [60] => আমি [61] => তোমার [62] => তুমি [63] => আমার [64] => [65] => ফুলেরই [66] => হাসিতে [67] => হাওয়ার [68] => বাঁশিতে [69] => এ [70] => রাত [71] => শেখালো [72] => ভালো [73] => যে [74] => বাসিতে [75] => আমি [76] => তোমার [77] => তুমি [78] => আমার [79] => একটু [80] => আরো [81] => নয় [82] => কাছে [83] => থাকো [84] => না [85] => কাছে [86] => ডাকো [87] => না [88] => আরো [89] => কাছে [90] => তুমি [91] => আমার [92] => আমি [93] => তোমার [94] => )

When I count the repeated values with the following code, I get wrong results, as shown below -

$wordCountArr = array_count_values($matchWords);

Array ( [একটু] => 1 [আরো] => 6 [নয়] => 3 [কাছে] => 6 [থাকো] => 3 [না] => 6 [ কাছে] => 3 [ডাকো] => 3 [ আজ] => 1 [দুজনের] => 1 [প্রথম] => 1 [মিলনের] => 1 [ চাঁদ] => 1 [তারা] => 1 [ঐ] => 3 [সাক্ষী] => 1 [আছে] => 1 [ তুমি] => 2 [আমার] => 5 [আমি] => 2 [তোমার] => 5 [] => 3 [ মন] => 1 [যে] => 3 [ভরেছে] => 2 [জোছনা] => 2 [ঝরেছে] => 2 [ পৃথিবী] => 2 [দ্যাখো] => 2 [ঘুমিয়ে] => 2 [পড়েছে] => 2 [ মন] => 1 [ আমি] => 3 [তুমি] => 3 [ একটু] => 2 [ ফুলেরই] => 1 [হাসিতে] => 1 [হাওয়ার] => 1 [বাঁশিতে] => 1 [ এ] => 1 [রাত] => 1 [শেখালো] => 1 [ভালো] => 1 [বাসিতে] => 1 )

But why [কাছে] => 6 and [ কাছে] => 3? It is supposed to be [কাছে] => 9. Should I trim the spaces before and after each word, or match the words together with their spaces? I used array_map('trim', $matchWords) but no luck! How can I fix this? Please help!

Nihar
  • trim removes spaces at the beginning and/or end of a string, not in the middle; you might use str_replace for that; but for UTF-8, I'd recommend using preg_replace instead – Mark Baker May 29 '16 at 18:06
  • Used $matchWords = preg_replace('/[\s\t\r\n]\+/iu', '', $matchWords); But not helping ! – Nihar May 29 '16 at 18:10
  • `trim` is a single byte function and would not work in this case. You need to use a multi byte trim function (which PHP does not have, but there is help to be found on [this page](http://php.net/manual/en/ref.mbstring.php), see the comments). – Sverri M. Olsen May 29 '16 at 18:11
  • If you don't have white space, then you're probably suffering from: http://stackoverflow.com/questions/7931204/what-is-normalized-utf-8-all-about – Anya Shenanigans May 29 '16 at 18:21
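
To tell whether the difference is an invisible character or a normalization issue, it can help to dump the raw bytes of two entries that look the same. The following is only a sketch: the strings `$a` and `$b` are hypothetical, and U+200C is just one example of a character that renders as nothing.

```php
<?php
// Hypothetical pair: visually identical, but $b starts with U+200C
// (ZERO WIDTH NON-JOINER), which does not render as visible text.
$a = "কাছে";
$b = "\u{200C}কাছে";

// The strings compare as different even though they look the same.
var_dump($a === $b);

// Hex dumps of the UTF-8 bytes; any extra leading or trailing bytes
// point to a hidden character attached to the word.
$hexA = bin2hex($a);
$hexB = bin2hex($b);
echo $hexA, "\n";
echo $hexB, "\n";
```

If the two hex dumps have the same length but different bytes, the normalization explanation becomes more likely than a stray invisible character.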

1 Answer


Yes, those entries are counted separately because of the extra whitespace character. Calling trim() on each word won't work in your case, because by default it strips only a few ASCII whitespace bytes and not Unicode whitespace.

So the solution is to use preg_replace.
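
As a minimal sketch of that approach, the snippet below trims Unicode separators and control/format characters from both ends of every word before counting. The sample array is hypothetical (the question's real data is not reproduced here), and the choice of `[\pZ\pC]` follows the pattern reported as working in the comments on this answer.

```php
<?php
// Hypothetical sample: the same word with an invisible format character
// (U+200C) in front of it, or a no-break space (U+00A0) after it.
$matchWords = ["কাছে", "\u{200C}কাছে", "কাছে\u{00A0}", "না", ""];

// Strip separators (\pZ) and control/format characters (\pC) from the
// start and end of each element only; the /u modifier enables UTF-8 mode.
// preg_replace accepts an array subject and returns an array.
$cleaned = preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $matchWords);

// Drop entries that are empty after trimming so they are not counted.
$cleaned = array_values(array_filter($cleaned, function ($word) {
    return $word !== '';
}));

$wordCountArr = array_count_values($cleaned);
print_r($wordCountArr);
```

Because the pattern is anchored with `^` and `$`, characters inside a word are left alone, which matters for scripts where joiner characters can be meaningful in the middle of a word.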

Refer:

Trim unicode whitespace

why is php trim is not really remove all whitespace and line breaks?

Ani Menon
  • $matchWords = preg_replace('/^\p{Z}+|\p{Z}+$/u', '', $matchWords); Used this but still the same problem ! – Nihar May 29 '16 at 18:15
  • $matchWords = preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u','',$matchWords); Works with this ! Thank you for the reference @Menon – Nihar May 29 '16 at 18:19
  • You may use this too: `$str = preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u','',$str);` and `$str = preg_replace('/\s+/u', '', $str);` – Ani Menon May 29 '16 at 18:21
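
One possible explanation for why `\p{Z}` alone did not help while `[\pZ\pC]` did is that the stray characters belong to the "Other" categories (control or format characters) rather than the separator categories. The thread does not say which characters were actually in the data, so the candidates below are assumptions chosen for illustration.

```php
<?php
// Candidate characters (assumed for illustration, not taken from the data).
$candidates = [
    'U+0020 space'                 => " ",
    'U+00A0 no-break space'        => "\u{00A0}",
    'U+000A line feed'             => "\n",
    'U+200C zero width non-joiner' => "\u{200C}",
    'U+FEFF byte order mark'       => "\u{FEFF}",
];

$results = [];
foreach ($candidates as $label => $char) {
    // \z anchors at the very end, so a trailing newline is not skipped.
    $isZ = preg_match('/^\p{Z}\z/u', $char) === 1;
    $isC = preg_match('/^\p{C}\z/u', $char) === 1;
    $results[$label] = ['Z' => $isZ, 'C' => $isC];
    printf("%-30s Z=%s C=%s\n", $label, $isZ ? 'yes' : 'no', $isC ? 'yes' : 'no');
}
```

The ordinary space and the no-break space fall under `\p{Z}`, while the line feed, the zero width non-joiner and the byte order mark fall under `\p{C}`, so a trim pattern that covers both classes catches all of them.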