I want to show the density of words in an Arabic text. The following code works for English text but not for Arabic. How can I calculate the word density of an Arabic text?

<?php
$str = "I am working on a project where I have to find out the keyword density of the page on the basis of URL of that page. But I am not aware actually what \"keyword Density of a page\" actually means? and also please tell me how can we create a PHP script which will fetch the keyword density of a web page.";

// str_word_count($str,1) - returns an array containing all the words found inside the string
$words = str_word_count(strtolower($str),1);
$numWords = count($words);

// array_count_values() returns an array using the values of the input array as keys and their frequency in input as values.
$word_count = (array_count_values($words));
arsort($word_count);

foreach ($word_count as $key=>$val) {
    echo "$key = $val. Density: ".number_format(($val/$numWords)*100)."%<br/>\n";
}
?>

Example output:

of = 5. Density: 8%
a = 4. Density: 7%
density = 3. Density: 5%
page = 3. Density: 5%
...
  • Does this answer your question? [str\_word\_count() function doesn't display Arabic language properly](https://stackoverflow.com/questions/13884178/str-word-count-function-doesnt-display-arabic-language-properly) – Jax-p May 30 '22 at 07:21
  • The first step is to understand the problem. You say that you do not yet. This is why you are struggling to get a solution. You will need to know what the word separator is in Arabic (is there one?). Then do the task one step at a time, starting with getting a list of words. – ctrl-alt-delor May 30 '22 at 07:22
  • Have you considered not using PHP? I hear that the only remaining big users of it are WordPress and Drupal. – ctrl-alt-delor May 30 '22 at 07:25
  • @ctrl-alt-delor Then you are listening to some very strange sources for your information – RiggsFolly May 30 '22 at 07:31

1 Answer

The problem is that str_word_count doesn't treat Arabic characters as "word characters", so Arabic words are not recognised. You can either pass the extra "word characters" you need as the third argument, or split the string yourself (for example with explode) and count the words in a loop.
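
The second option can be sketched as follows. This is a minimal example rather than a definitive implementation: instead of explode it uses preg_split with the /u modifier so that punctuation such as the Arabic comma is also treated as a separator. The function name wordDensity and the sample sentence are made up for illustration, and the sketch assumes the input is valid UTF-8 and that the mbstring extension is available.

```php
<?php
// Hypothetical helper (not part of the original post): computes word counts
// and densities for a UTF-8 string, including Arabic text.
function wordDensity(string $text): array
{
    // Arabic has no letter case; this only normalises any Latin words.
    $text = mb_strtolower($text, 'UTF-8');

    // Split on every run of characters that are not letters (\p{L}),
    // combining marks such as Arabic diacritics (\p{M}), or digits (\p{N}).
    // The /u modifier makes PCRE read the subject as UTF-8.
    $words = preg_split('/[^\p{L}\p{M}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on invalid UTF-8; treat that as "no words".
    if ($words === false || count($words) === 0) {
        return [];
    }

    $total  = count($words);
    $counts = array_count_values($words);
    arsort($counts); // most frequent words first

    $density = [];
    foreach ($counts as $word => $count) {
        $density[$word] = [
            'count'   => $count,
            'percent' => round($count / $total * 100, 1),
        ];
    }
    return $density;
}

// Usage, mirroring the output format from the question.
$str = "السلام عليكم ورحمة الله، السلام على الجميع";
foreach (wordDensity($str) as $word => $info) {
    echo "$word = {$info['count']}. Density: {$info['percent']}%<br/>\n";
}
```

Splitting on "anything that is not a letter, mark, or digit" is one possible definition of a word; if hyphenated or apostrophised forms should count as single words, the character class would need to be adjusted.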

Hassan Pezeshk