I'm trying to match banned Hebrew words, retrieved from a MySQL table, against the Hebrew string in $_POST['content']. English words (when they appear in the Hebrew string) are matched, but Hebrew words are not. Can you help me modify the code below so that it finds a banned Hebrew word in a given string? All data sources have been checked and are in UTF-8.

<?

$banned_words=array();
while($loc=mysql_fetch_array($loc_query))
{
    $banned_words[$k]=stripslashes(utf8_decode($loc["sb_word"]));
    $k=$k+1;
}

$matches = array();
$matchFound = preg_match_all(
    "/\b(" . implode($banned_words,"|") . ")\b/u",
    $_POST['content'],
    $matches
);

if ($matchFound)
{
    $words = array_unique($matches[0]);
    $word_status=1;
}
?>
wallyk
  • 1
    [`implode`](http://php.net/implode) works the other way round. You need to give the glue character first, then the array variable. – mario Mar 05 '12 at 20:15
  • 2
    @mario implode() can, for historical reasons, accept its parameters in either order. For consistency with explode(), however, it may be less confusing to use the documented order of arguments. – Andrew Hall Mar 05 '12 at 20:19

1 Answer

\b is not Unicode-aware, so it does not treat Hebrew letters as word characters. You should use Unicode character properties (such as \p{L} for "any letter") instead. See this answer for some help.
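
As a rough sketch of that idea (not the linked answer's code), the \b anchors can be replaced with lookarounds built on \p{L} and \p{N}. The helper name find_banned_words and the sample words are made up for illustration. The sketch also assumes the words stay in UTF-8: utf8_decode() converts to ISO-8859-1, which has no Hebrew letters, so the loop in the question would likely turn each Hebrew word into question marks before the regex ever runs.

```php
<?php
// Hypothetical helper, not part of the original post.
// Assumes $bannedWords and $content are both valid UTF-8 strings.
function find_banned_words(array $bannedWords, $content)
{
    // Escape regex metacharacters in each word and skip empty entries.
    $quoted = array();
    foreach ($bannedWords as $word) {
        if ($word !== '') {
            $quoted[] = preg_quote($word, '/');
        }
    }
    if (count($quoted) === 0) {
        return array();
    }

    // Instead of \b, require that the match is not preceded or followed
    // by a Unicode letter, a Unicode digit, or an underscore.
    $pattern = '/(?<![\p{L}\p{N}_])(?:' . implode('|', $quoted) . ')(?![\p{L}\p{N}_])/u';

    $matches = array();
    if (preg_match_all($pattern, $content, $matches)) {
        return array_values(array_unique($matches[0]));
    }
    return array();
}

// Sample data for illustration only; in the question the words come from
// the sb_word column and the text from $_POST['content'].
$banned = array('רע', 'spam');
$found = find_banned_words($banned, 'זה דבר רע מאוד and also spam');
```

With this approach the word 'רע' is reported when it stands alone, but not when it is only part of a longer word such as 'רעב'. Passing the glue first to implode() follows the documented argument order.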

dev-null-dweller