I have a .txt file that holds the forbidden words for a forum, one word per line, like this:

//filterwords.txt
XXX
YYY
ZZZ

I would like to use preg_match to check the incoming text $str against these words: if none of the forbidden words appear, we do one thing; otherwise, we do another. I am not sure how to write the expression. This is all I have so far:

$filter_word = file("filterwords.txt");

for ($i=0; $i< count($filter_word);$i++)
{
  if(!preg_match($filter_word[$i],$str))
  {
    echo "not ok!";
    exit;
  }
  else
  {
    echo "ok!!";
    exit;
  }
}

Could someone show me how to write the preg_match part? Thank you.

Ham
  • There are a lot of questions here at SO concerning profanity filters. For example [this one](http://stackoverflow.com/questions/273516/how-do-you-implement-a-good-profanity-filter). Are you sure you can't find anything in there to help you? – Till Helge Oct 25 '11 at 12:19
  • You're making a [clbuttic mistake](http://thedailywtf.com/Articles/The-Clbuttic-Mistake-.aspx) here. – CodeCaster Oct 25 '11 at 12:30
  • If I don't know the answer to a PHP question, can't I seek help from someone who is willing to teach me? Are you sure? – Ham Oct 25 '11 at 12:38
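
For context on the "clbuttic" comment: that kind of failure comes from treating forbidden words as plain substrings, so harmless words that merely contain one get flagged. Below is a minimal sketch of the difference. The helper name `contains_whole_word()` and the sample word are made up for illustration and are not from the question.

```php
<?php
// Hypothetical helper: true if $word occurs in $text as a whole word.
function contains_whole_word(string $word, string $text): bool
{
    // preg_quote() escapes regex metacharacters and the "#" delimiter;
    // \b anchors the match at word boundaries; "i" ignores case.
    $pattern = '#\b' . preg_quote($word, '#') . '\b#i';
    return preg_match($pattern, $text) === 1;
}

$text = 'Please concatenate the strings';

// A substring search flags the harmless word "concatenate".
var_dump(stripos($text, 'cat') !== false);    // bool(true)
// Whole-word matching does not.
var_dump(contains_whole_word('cat', $text));  // bool(false)
```

Note that `\b` only helps when the forbidden word starts and ends with a letter, digit or underscore.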

1 Answer

How about this:

<?php
    $file = file_get_contents('filterwords.txt');
    $words = preg_split("#\r?\n#", $file, -1, PREG_SPLIT_NO_EMPTY);

    # Escape regex metacharacters (and the "#" delimiter), as mentioned by @ridgerunner
    $words = array_map(function ($word) { return preg_quote($word, '#'); }, $words);

    $pattern = "#\b(". implode('|', $words) . ")\b#";

    if(preg_match($pattern, $str))
    {
        echo "bad word detected";
    }
?>

P.S. This assumes that the text to check is in the $str variable.
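
The same idea can be wrapped in a function so that the forum code only has to branch on a boolean. This is a sketch, not a drop-in replacement. The function names `build_filter_pattern()` and `contains_forbidden_word()` are made up for illustration, the `i` (case-insensitive) modifier is an assumption about what a forum filter would want, and the empty-list guard is an addition.

```php
<?php
// Build one alternation pattern from a list of forbidden words.
// Returns null when the list is empty, because an empty alternation
// such as #\b()\b# would match almost any text.
function build_filter_pattern(array $words): ?string
{
    $words = array_filter(array_map('trim', $words), 'strlen');
    if (count($words) === 0) {
        return null;
    }
    // Escape regex metacharacters and the "#" delimiter in every word.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '#');
    }, $words);
    return '#\b(?:' . implode('|', $quoted) . ')\b#i';
}

// True if $text contains at least one of $words as a whole word.
function contains_forbidden_word(string $text, array $words): bool
{
    $pattern = build_filter_pattern($words);
    return $pattern !== null && preg_match($pattern, $text) === 1;
}

// Inline list for the demo; in the forum it could come from
// file('filterwords.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES).
$words = ['XXX', 'YYY', 'ZZZ'];
echo contains_forbidden_word('some text with yyy in it', $words) ? "not ok!\n" : "ok!!\n";
```

Reading the list with `file()` and those two flags would give one array entry per line without trailing newlines, so the `preg_split()` step is not needed in that variant.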

SERPRO
  • Almost a good answer (if the size of the file is < `64KB`), but you do need to run the wordlist through `preg_quote()` to escape any metacharacters that may be present in the "words". i.e. insert the line: `$file = preg_quote($file, '#');` – ridgerunner Oct 25 '11 at 15:35
  • You're right. As a simple example with normal words this could work, but I'll edit the answer to implement the changes you pointed out in your comment. – SERPRO Oct 25 '11 at 15:58
  • What if the file is larger, say between 5 MB and 10 MB? – gkns Aug 21 '13 at 08:38
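
On the larger-file question: a single alternation built from several megabytes of words may exceed PCRE's compiled-pattern size limit (the exact limit depends on how PCRE was built), in which case `preg_match()` fails instead of matching. One way around it is a different technique from the regex in the answer: split the text into words and look each one up in an array used as a hash set. The sketch below assumes every entry in the list is a single word made of letters, digits or underscores; the function names are made up for illustration.

```php
<?php
// Build a lookup table whose keys are the lowercased forbidden words.
function build_word_set(array $words): array
{
    $set = [];
    foreach ($words as $word) {
        $word = strtolower(trim($word));
        if ($word !== '') {
            $set[$word] = true;
        }
    }
    return $set;
}

// Split the text into words and test each one against the table.
function text_has_forbidden_word(string $text, array $wordSet): bool
{
    $tokens = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        if (isset($wordSet[$token])) {
            return true;
        }
    }
    return false;
}

// Inline list for the demo; in the forum it could come from
// file('filterwords.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES).
$wordSet = build_word_set(['XXX', 'YYY', 'ZZZ']);
var_dump(text_has_forbidden_word('nothing to see here', $wordSet)); // bool(false)
var_dump(text_has_forbidden_word('this has yyy in it', $wordSet));  // bool(true)
```

Each lookup is a single `isset()` call, so the cost grows with the length of the text rather than with the size of the word list. Multi-word phrases or entries containing punctuation would need a different approach.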