2

I have an array of Twitter tweets and want to delete every tweet in this array that contains one of the following words: blacklist|blackwords|somemore

Could anyone help me with this?

Lupo
  • Could you show some of the code you have tested or are trying to use? How about posting the array format with the tweets? That could be useful. – Phill Pafford Jan 07 '11 at 18:10
  • How is the blacklist|blackwords|somemore stored? DB, array, files... – maid450 Jan 07 '11 at 18:17
  • Without knowing the format of your array and the format of your blacklist, this question is rather hard to answer. – mfonda Jan 07 '11 at 18:19
  • *(related)* [How do you implement a good profanity filter](http://stackoverflow.com/questions/273516/how-do-you-implement-a-good-profanity-filter) – Gordon Jan 07 '11 at 18:24

5 Answers

6

Here's a suggestion:

<?php
$banned_words = 'blacklist|blackwords|somemore';
$tweets = array( 'A normal tweet', 'This tweet uses blackwords' );
$blacklist = explode( '|', $banned_words );

//  Check each tweet
foreach ( $tweets as $key => $text )
{
    //  Search the tweet for each banned word
    foreach ( $blacklist as $badword )
    {
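        //  stripos() returns false when the word is absent; compare strictly
        if ( stripos( $text, $badword ) !== false )
        {
            //  Remove the offending tweet from the array
            unset( $tweets[$key] );
            //  No need to check the remaining words for this tweet
            break;
        }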
    }
}
?>
Kalessin
  • I'm not a huge fan of checking against a string for blacklisted words - makes the code awkward to maintain, and array is easier to manage. – electblake Jan 07 '11 at 18:26
  • @electblake: Sure, but as others point out, we have no idea what the source of the blacklist is, so I just wrote something simple, to clearly show where data was coming from and how it was analysed. Hence why I *didn't* use array_filter ;) – Kalessin Jan 07 '11 at 18:32
  • yeah - good point :) I tend to assume the simplest situation or at least ideal circumstances - which probably isn't a good thing! – electblake Jan 07 '11 at 18:39
4

You can use the array_filter() function:

$badwords = ... // initialize badwords array here
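function filter($text)
{
    global $badwords;
    foreach ($badwords as $word) {
        // Reject the tweet as soon as any bad word is found
        if (strpos($text, $word) !== false) {
            return false;
        }
    }
    // None of the bad words occurred, so keep the tweet
    return true;
}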

$result = array_filter($tweetsArray, "filter");
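
A possible variation that avoids the global is to pass the word list into a closure. This is only a sketch: the contents of `$badwords` and `$tweetsArray` are made-up sample data, `stripos()` is swapped in for `strpos()` to make the match case-insensitive, and closures need PHP 5.3 or later.

```php
<?php
// Sample data, assumed for illustration only.
$badwords = array('blacklist', 'blackwords', 'somemore');
$tweetsArray = array(
    'A normal tweet',
    'This tweet uses BlackWords',
    'Another harmless tweet',
);

// Keep a tweet only if none of the bad words occurs in it.
$result = array_filter($tweetsArray, function ($text) use ($badwords) {
    foreach ($badwords as $word) {
        if (stripos($text, $word) !== false) {
            return false; // found a bad word: drop this tweet
        }
    }
    return true; // no bad word found: keep this tweet
});

// array_filter() preserves the original keys; reindex if that matters.
$result = array_values($result);

print_r($result);
```

Because `array_filter()` keeps the original keys, the `array_values()` call is only needed when the rest of the code expects a gap-free, zero-based array.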
Kel
4

Use array_filter.

Check this sample:

$tweets = array();

function safe($tweet) {
    $badwords = array('foo', 'bar');

    foreach ($badwords as $word) {
        if (strpos($tweet, $word) !== false) {
            // Baaaad
            return false;
        }
    }
    // OK
    return true;
}

$safe_tweets = array_filter($tweets, 'safe');
Xavier Barbosa
2

You can do this in a lot of ways, so without more information I can only offer this as a starting point:

$a = Array("  fafsblacklist hello hello", "white goodbye", "howdy?!!");
$clean = Array();
$blacklist = '/(blacklist|blackwords|somemore)/';

foreach($a as $i) {
  if(!preg_match($blacklist, $i)) {
    $clean[] = $i;
  }
}

var_dump($clean);
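
Note that this pattern matches anywhere in the string, so "fafsblacklist" counts as a hit. If only whole words should be blocked, one option is to add `\b` word boundaries and the `i` modifier. A sketch under that assumption, with made-up sample tweets:

```php
<?php
// Sample tweets, assumed for illustration only.
$a = array('  fafsblacklist hello hello', 'On the Blacklist again', 'howdy?!!');
$clean = array();

// \b anchors the match at word boundaries; the i modifier ignores case.
$blacklist = '/\b(blacklist|blackwords|somemore)\b/i';

foreach ($a as $tweet) {
    // preg_match() returns 0 when the pattern does not match.
    if (!preg_match($blacklist, $tweet)) {
        $clean[] = $tweet;
    }
}

var_dump($clean);
```

Here the first tweet is kept, because "blacklist" is embedded in a longer word, while the second is dropped despite the different capitalisation. Whether that is the desired behaviour depends on how strict the filter should be.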
Peter Porfy
1

Using regular expressions:

preg_grep("/blacklist|blackwords|somemore/", $array, PREG_GREP_INVERT)

But I warn you that this may be inefficient, and you must take care to escape any punctuation characters in the blacklist that have a special meaning in regular expressions.
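
One way to handle the punctuation is to build the pattern from an array and run each word through `preg_quote()`. Keep in mind that `preg_grep()` takes the pattern first and the array second. The word list and tweets below are invented sample data, and the `i` modifier is an extra assumption for case-insensitive matching:

```php
<?php
// Sample data, assumed for illustration only; note the punctuation in the words.
$words = array('blacklist', 'c++', 'some.more');
$array = array(
    5 => 'A normal tweet',
    6 => 'I love c++',
    7 => 'somexmore is fine',
    8 => 'some.more is not',
);

// Escape each word, passing "/" as the delimiter so it gets escaped too.
$escaped = array();
foreach ($words as $word) {
    $escaped[] = preg_quote($word, '/');
}
$pattern = '/' . implode('|', $escaped) . '/i';

// PREG_GREP_INVERT returns the entries that do NOT match; keys are preserved.
$clean = preg_grep($pattern, $array, PREG_GREP_INVERT);

print_r($clean);
```

Without the escaping, the `.` in "some.more" would match any character and the `+` signs in "c++" would be treated as quantifiers.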

hectorct