
I am trying to check that a user's input does not contain banned words.

I created an array, for example $prohibited_words = array('hell');

Then the user's input: $input = 'Hello!';

And the PHP:

foreach ($prohibited_words as $one_prohibited_word) {
  // stristr() is a case-insensitive substring search, so 'hell' is found inside 'Hello!'
  if( stristr( $input, $one_prohibited_word ) !== FALSE ) {
    $error_message_words = 1;
    echo $one_prohibited_word . ' <br/>';
  }
}

Here 'Hello!' is reported as banned, because it contains 'hell'.

I could change the condition to something like if( stristr( $input, ($one_prohibited_word.' ') ) !== FALSE ) {

But maybe there is a better solution?

Update: I created some possibly slightly crazy code...

$bad_words_arr = array('bad');

Visitor's input:

$user_input = ' Hello b a d word go  g g g on  a a a  again';

First I collapse whitespace: a double space is replaced with a single space (I believe a normal sentence should not contain more than one space in a row).

Then I split the string into an array on spaces.

While doing so I build a new array without empty elements, which also renumbers the keys.

foreach( explode( ' ', str_replace( '  ', ' ', $user_input ) ) as $value_new_arr ){
  if( strlen( trim($value_new_arr) ) > 0 ){
    $arr_from_message_textarea[] = trim($value_new_arr);
  }
}
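
The same tokenizing step could be sketched more compactly with preg_split(), which splits on any run of whitespace and drops empty pieces in one call. This is only a sketch; the function name tokenize_input is hypothetical and not part of the code above.

```php
<?php
// Hypothetical helper: split a message into words on any run of whitespace.
function tokenize_input(string $text): array
{
    // PREG_SPLIT_NO_EMPTY discards the empty strings that leading,
    // trailing, or repeated whitespace would otherwise produce.
    return preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

$user_input = ' Hello b a d word go  g g g on  a a a  again';
$arr_from_message_textarea = tokenize_input($user_input);
```

The resulting array has consecutive keys starting at 0, so the key-based lookups on the previous element still apply.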

Then I loop through each element of the array.

If the current element is one character long (a single letter, or a one-letter word), I check the length of the previous element. If the previous element is also a single letter, I join the two and start building a new word.

$temporary_string = '';
$new_word_already_started = 0;
foreach( $arr_from_message_textarea as $k_arr_from_message_textarea => $v_arr_from_message_textarea ){

  if( $k_arr_from_message_textarea > 0 ){

    if( strlen($v_arr_from_message_textarea) == 1 ){//current element has only one letter

      if( strlen( $arr_from_message_textarea[$k_arr_from_message_textarea-1] ) == 1 ){//and previous element has only one letter

        if( $new_word_already_started == 0 ){
          $temporary_string = $arr_from_message_textarea[$k_arr_from_message_textarea-1] . $v_arr_from_message_textarea;//start a new word from the first two letters
          $new_word_already_started = 1;
        }
        else if( $new_word_already_started == 1 ){
          $temporary_string = $temporary_string . $v_arr_from_message_textarea;//append the current letter to the word built so far
        }

      }
      else if( strlen( $arr_from_message_textarea[$k_arr_from_message_textarea-1] ) > 1 ){
        $temporary_string = '';//previous element is a normal word, so start again from an empty string
        $new_word_already_started = 0;
      }

    }
    else if( strlen($v_arr_from_message_textarea) > 1 ){
      //Earlier iterations may have built $temporary_string from single letters.
      //The current element has more than one letter, so that word is complete:
      //store it in $new_word and reset $temporary_string.
      if( strlen($temporary_string) > 0 ){
        $new_word[] = $temporary_string;
      }
      $temporary_string = '';
      $new_word_already_started = 0;
    }

  }

}

if( isset($new_word) ){
  //Merge the initial array with the array of rebuilt words
  $arr_to_check_bad_words = array_merge( explode( ' ', str_replace( '  ', ' ', str_replace( '  ', ' ', $user_input ) ) ), $new_word );
}
else{
  $arr_to_check_bad_words = explode( ' ', str_replace( '  ', ' ', str_replace( '  ', ' ', $user_input ) ) );
}

$result_bad_words = array_intersect( $arr_to_check_bad_words, $bad_words_arr );//array_intersect() returns the values of $arr_to_check_bad_words that are also present in $bad_words_arr

Unfortunately this does not work if $user_input = ' Hello ba d go g g g on a a a again'; because 'ba' is two characters long, so it is never joined with the following 'd'.

It appears very difficult to get the required result. And after spending a lot of time writing code, I can see that a visitor can still send a message whose "bad words/bad meaning" a reader will easily understand. So my conclusion: if I already have some code that filters something, I will use it. Otherwise it seems unreasonable to try to write such code. Spammers and senders of bad text must be filtered in some other way.

Andris
  • possible duplicate of [How do you implement a good profanity filter?](http://stackoverflow.com/questions/273516/how-do-you-implement-a-good-profanity-filter) – Gopakumar Gopalan Feb 22 '15 at 10:11
  • Don't miss [The Telegraph article](http://www.telegraph.co.uk/news/newstopics/howaboutthat/2667634/The-Clbuttic-Mistake-When-obscenity-filters-go-wrong.html) :] – Jonny 5 Feb 22 '15 at 10:27
  • This has been answered before http://stackoverflow.com/questions/19358774/php-swear-word-filter – Treasure Priyamal Feb 22 '15 at 10:46

1 Answer


Maybe do something like this, where you split the input into an array of words, and then check to see if any words in the input array are also present in the $prohibited_words array:

$prohibited_words = array('Hello!');
$input = 'Hello!';
$input_array = explode(' ', $input);
$intersect = array_intersect($prohibited_words, $input_array);

foreach ($intersect as $item) {
    echo "$item <br/>";
}
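
As a variation on the whole-word idea, the check could also be sketched with a regular expression and word boundaries, so that case and adjacent punctuation do not matter. This is a minimal sketch, not a complete filter: the function name find_whole_word_matches is hypothetical, and it assumes each prohibited word starts and ends with a letter or digit (otherwise \b will not anchor as intended).

```php
<?php
// Hypothetical helper: return the prohibited words that occur in $input
// as whole words, ignoring case.
function find_whole_word_matches(string $input, array $prohibited_words): array
{
    $found = array();
    foreach ($prohibited_words as $word) {
        // preg_quote() escapes regex metacharacters in the word;
        // \b anchors the match at word boundaries, and the i flag
        // makes the comparison case-insensitive.
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        if (preg_match($pattern, $input) === 1) {
            $found[] = $word;
        }
    }
    return $found;
}

// 'Hello!' contains 'hell' only as part of a longer word: no match.
$in_hello = find_whole_word_matches('Hello!', array('hell'));
// Here 'hell' stands alone, followed by punctuation: it is matched.
$standalone = find_whole_word_matches('Go to Hell!', array('hell'));
```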

A more robust solution would involve removing all of the spaces from the input string, and then checking to see if any of the items in the $prohibited_words array appears in the edited input string. You could also use strtolower() to make the search case-insensitive:

$prohibited_words = array('Hello!');    
$input = 'He ll   o!';
$new_input = str_replace(' ', '', strtolower($input));

foreach($prohibited_words as $item) {
    if (strpos($new_input, strtolower($item)) !== false) {
        echo "$item <br/>";
    }
}
Conor Taylor
  • This does not seem to be an easy solution for the situation. For example, if the input is `He ll o how are you!`. Maybe first create an array as in your answer with `explode(' ', $input)`. Then, if an array element has only one character, check the following element. If that is also one character, put them together. But... that is no solution either – Andris Feb 22 '15 at 10:49
  • That input would fail because of the exclamation mark. You could choose to either remove the exclamation mark from 'Hello!' in $prohibited_words, or also decide to remove all punctuation from $input, so that words would be matched regardless of their punctuation – Conor Taylor Feb 22 '15 at 10:51
  • Where did you remove it from? – Conor Taylor Feb 22 '15 at 10:54
  • Like this `$prohibited_words = array('Hello'); $input = 'He ll o how are you';` – Andris Feb 22 '15 at 10:55
  • Ah, I had the order of the arguments wrong in strpos(). Check out my edit – Conor Taylor Feb 22 '15 at 10:58
  • It seems that if `$input` contains a single character immediately followed by another single character (like `h e l l`), then the whitespace between all those characters must be removed. – Andris Feb 22 '15 at 11:32
  • Use str_replace to take out the spaces, use a regular expression for possible numbers instead of letters and another regular expression to remove non letter/number characters & then validate using strpos – Daryl Gill Feb 22 '15 at 13:29
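
The steps in the last comment could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: the function names normalize_text and find_embedded_matches are hypothetical, the digit-to-letter table is an assumed example, and strtr() is used for the digit substitution in place of a regular expression.

```php
<?php
// Hypothetical helper: reduce text to lowercase letters and digits only.
function normalize_text(string $text): string
{
    $text = strtolower($text);
    // Assumed look-alike table (0->o, 1->i, 3->e, 4->a, 5->s);
    // adjust it to the substitutions you actually want to catch.
    $text = strtr($text, '01345', 'oieas');
    // Drop everything that is not a letter or digit, spaces included.
    return preg_replace('/[^a-z0-9]/', '', $text);
}

// Hypothetical helper: return the prohibited words found anywhere
// in the normalized input.
function find_embedded_matches(string $input, array $prohibited_words): array
{
    $haystack = normalize_text($input);
    $found = array();
    foreach ($prohibited_words as $word) {
        $needle = normalize_text($word);
        // Skip words that normalize to an empty string.
        if ($needle !== '' && strpos($haystack, $needle) !== false) {
            $found[] = $word;
        }
    }
    return $found;
}

$matches = find_embedded_matches('He ll o how are you', array('Hello'));
```

Because all spaces are removed before searching, a match can span two neighbouring words; that trade-off comes with this approach.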