I have some simple PHP code that extracts a snippet of a text and bolds a specific word.
First, I build an array of the words I want and their positions in the text.
$all_words = str_word_count($text, 2, 'åæéø');
// $words is an array with the words that I want to find.
$words_found = array();
foreach ($all_words as $pos => $word_found) {
    foreach ($words as $word) {
        if ($word == strtolower($word_found)) {
            $words_found[$pos] = $word_found;
            break;
        }
    }
}
Then, for every word in $words_found, I get a portion of the text with the word in the middle.
$length = 90;
foreach ($words_found as $offset => $word) {
    $word_length = strlen($word);
    $start = $offset - $length;
    $last_start = $start + $length + $word_length;
    $first_part = substr($text, $start, $length);
    $last_part = substr($text, $last_start, $length);
    $sentence = $first_part . '<b>' . $word . '</b>' . $last_part;
}
It works fine, except that the text is UTF-8 with Danish characters (åæéø). When $first_part or $last_part starts with a multibyte character, the string returned by substr is empty.
I know about the mb_substr function, so I replaced those calls with it.
$word_length = mb_strlen($word, 'UTF-8');
$first_part = mb_substr($text, $start, $length, 'UTF-8');
$last_part = mb_substr($text, $last_start, $length, 'UTF-8');
But with mb_substr the position of the word ($offset) is wrong, and the resulting snippet ($sentence) does not line up with the word as it should.
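A likely cause, sketched here under the assumption that the source file and $text are UTF-8: str_word_count() reports positions in bytes, while mb_substr() counts characters, so every multibyte character before a word shifts its offset. The sample string and the variable names $byte_offset and $char_offset are made up for illustration.

```php
<?php
// Hypothetical sample text; assumes this file is saved as UTF-8.
$text = 'blå bil';

// The keys returned by str_word_count() are byte offsets.
// In UTF-8, 'å' takes 2 bytes, so 'bil' starts at byte 5.
$all_words = str_word_count($text, 2, 'åæéø');
$byte_offset = array_search('bil', $all_words);

// Convert the byte offset to a character offset by counting
// the characters in the bytes that precede the word.
$char_offset = mb_strlen(substr($text, 0, $byte_offset), 'UTF-8');

// The character offset is what mb_substr() expects.
$word = mb_substr($text, $char_offset, 3, 'UTF-8');
```

Here the byte offset is 5 but the character offset is 4; passing 5 to mb_substr() would skip one character too many.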
Is there something like mb_str_word_count? How can I get the correct position of the words?
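As far as I know there is no built-in mb_str_word_count. One possible workaround, offered as a rough sketch rather than a drop-in replacement, is to find the words with a Unicode-aware regular expression and convert each byte offset to a character offset. The function name mb_str_word_count_positions is hypothetical, the word pattern only approximates what str_word_count() treats as a word, and the array destructuring assumes PHP 7.1 or later.

```php
<?php
/**
 * Hypothetical helper (not a built-in): returns the words of a UTF-8
 * string keyed by their character offset, roughly like
 * str_word_count($text, 2) but usable with mb_substr().
 */
function mb_str_word_count_positions(string $text): array
{
    $words = [];
    // \p{L} matches any Unicode letter; the /u flag treats $text as UTF-8.
    // Apostrophes and hyphens are allowed inside words.
    if (preg_match_all('/[\p{L}\'-]+/u', $text, $matches, PREG_OFFSET_CAPTURE)) {
        foreach ($matches[0] as [$word, $byte_offset]) {
            // PREG_OFFSET_CAPTURE reports byte offsets, so convert them.
            $char_offset = mb_strlen(substr($text, 0, $byte_offset), 'UTF-8');
            $words[$char_offset] = $word;
        }
    }
    return $words;
}

// Hypothetical sample sentence; keys are character offsets.
$found = mb_str_word_count_positions('En blå bil på vejen');
```

With character offsets as keys, the snippet loop can use mb_strlen() and mb_substr() throughout, and the comparison could use mb_strtolower() so that uppercase Danish letters are lowercased as well.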