I have a query like this:

$content = "How technology is helping to change the way people think about the food on their plate and the food impact for them. Technology could have a role to play in raising awareness of the impact our diets have on the planet.";

$exp = explode(" ", $content);

for($j = 0; $j < count($exp); $j++){
    $this->db->query("INSERT INTO news (news_id, news_content) VALUES ('$id', '$exp[$j]')");
}

But I don't want to insert all the words; I only need to insert the words that appear more than once (technology, food, impact). Is it possible to do that? Can someone help me?

man99
  • Almost everything is possible. You should parameterize. You could use a regex to capture all text between `**`s. – user3783243 Apr 07 '20 at 04:51
  • Do you mean the word appears more than once, or the word is within `** **`? A lot of words appear more than once, like 'to', 'the', 'on', 'have'. – Andy Song Apr 07 '20 at 04:57
  • [How can I prevent SQL injection in PHP?](https://stackoverflow.com/questions/60174/how-can-i-prevent-sql-injection-in-php) is the same as "How to insert data and not care about its special character that may confuse SQL". Welcome to SO and learning SQL. – danblack Apr 07 '20 at 05:01

2 Answers

I would process the text content using array_filter to exclude words that are in a stopword list, then count the occurrences of each word using array_count_values and then array_filter out the words that occur only once. You can then write the remaining words (which will be the keys of the output array) to the database. For example:

$content = "How technology is helping to change the way people think about the food on their plate and the food impact for them. Technology could have a role to play in raising awareness of the impact our diets have on the planet.";

$stopwords = array('how', 'is', 'to', 'the', 'way', 'on', 'and', 'for', 'a', 'in', 'of', 'our', 'have');

// count all words in $content not in the stopwords list
$counts = array_count_values(array_filter(explode(' ', strtolower($content)), function ($w) use ($stopwords) {
    return !in_array($w, $stopwords);
}));
// filter out words only seen once
$counts = array_filter($counts, function ($v) { return $v > 1; });
// write those words to the database
foreach ($counts as $key => $value) {
    $this->db->query("INSERT INTO news (news_id, news_content) VALUES ('$id', '$key')");
}

For your sample data, the final result in $counts will be:

Array
(
    [technology] => 2
    [food] => 2
    [impact] => 2
)
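
One caveat: splitting on spaces leaves punctuation attached to words, so "food," and "food." would be counted separately. A minimal sketch of a variant that splits on non-word characters instead, assuming plain English (ASCII) text; the helper name repeated_words and the sample sentence are made up for illustration:

```php
<?php
// Hypothetical helper (not part of the answer above): returns the words
// that occur more than once, ignoring case, punctuation and stopwords.
function repeated_words(string $text, array $stopwords): array
{
    // Split on anything that is not a letter, digit or apostrophe,
    // so "food," and "food." both become "food".
    $words = preg_split("/[^a-z0-9']+/", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    // Drop the stopwords, then count what is left.
    $counts = array_count_values(array_diff($words, $stopwords));
    // Keep only the words seen more than once.
    return array_filter($counts, function ($n) { return $n > 1; });
}

$repeated = repeated_words("Good food, cheap food. Is food good?", array('is'));
print_r($repeated); // good => 2, food => 3
```

The keys of the returned array are the words to insert, exactly as with $counts above.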
Nick
  • Sorry for the late response, I already tried it and it works! array_count_values and array_filter really helped a lot. Thanks – man99 Apr 10 '20 at 04:20

I believe there are a lot of options here.

Here are my solutions: you could use array_search() for this. array_search() returns false if the needle is not found in the array; otherwise it returns the key of the first match. If that key differs from the key of the current word, the word has already appeared earlier in the array.

Depending on your needs, you could use one of the options below.

//Option 1
//Words that actually appear more than once... 
$new_arr = array();
foreach($exp as $key=>$e) {
    //Strict comparison (third argument true), so only an identical string matches
    $search = array_search($e, $exp, true); 
    if ($search !== false && $search != $key) {        
        $new_arr[] = $e;
    }
}

//Option 2
//The question was not totally clear, so I am adding this code as well:
//words with asterisks before and after that appear more than once
$new_arr = array();
foreach($exp as $key=>$e) {

    //Two asterisks at the beginning of the string and two at the end
    //strtolower makes **Technology** match an earlier **technology**
    if (substr($e,0,2) == "**" && substr($e,-2,2) == "**") { 
        $search = array_search(strtolower($e), $exp);
        if ($search !== false && $search != $key) {        
            $new_arr[] = $e;
        }
    }
}

//Values are still interpolated into the SQL here; bind them as parameters in real code
for($j = 0; $j < count($new_arr); $j++){
    $this->db->query("INSERT INTO news (news_id, news_content) VALUES ('$id', '$new_arr[$j]')");
}

As someone mentioned in a comment, you should prevent SQL injection by parameterizing the INSERT statement (and you really should), but the question was mainly about finding duplicate words in a string and doing something with them, so I won't go further into that here.
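
For completeness, a rough sketch of what a parameterized insert could look like. It assumes PDO with an in-memory SQLite database as a stand-in for $this->db (which is not what the question uses), and the $id value and word list are made up; if your database layer offers its own parameter binding, that would be the equivalent.

```php
<?php
// Assumption: PDO + in-memory SQLite stands in for the question's $this->db.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE news (news_id INTEGER, news_content TEXT)');

$id = 1;                                       // hypothetical news id
$new_arr = array('food', "o'clock", 'impact'); // sample words, one containing a quote

// Prepare once, execute per word; the values are bound, never interpolated.
$stmt = $pdo->prepare('INSERT INTO news (news_id, news_content) VALUES (?, ?)');
foreach ($new_arr as $word) {
    $stmt->execute(array($id, $word));
}

// Read the words back to check that the quote survived intact.
$rows = $pdo->query('SELECT news_content FROM news ORDER BY rowid')
            ->fetchAll(PDO::FETCH_COLUMN);
```

Because the words are passed as bound values, a quote or other special character inside a word cannot break or alter the statement.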

The result array $new_arr would look like this (option 1):

array (size=9)
  0 => string 'the' (length=3)
  1 => string 'the' (length=3)
  2 => string '**food**' (length=8)
  3 => string 'to' (length=2)
  4 => string 'the' (length=3)
  5 => string '**impact**' (length=10)
  6 => string 'have' (length=4)
  7 => string 'on' (length=2)
  8 => string 'the' (length=3)

Technology and technology are not treated as the same word here because one of them starts with an uppercase T.

The result array $new_arr would look like this (option 2):

array (size=3)
  0 => string '**food**' (length=8)
  1 => string '**Technology**' (length=14)
  2 => string '**impact**' (length=10)
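
A comment under the question suggested a regex to capture the text between the `**`s. A minimal sketch of that idea, working on the raw string rather than on $exp; the helper name repeated_marked_words and the sample text are hypothetical:

```php
<?php
// Hypothetical helper: collects the words wrapped in ** ... ** and returns
// those that are marked more than once, compared case-insensitively.
function repeated_marked_words(string $text): array
{
    // Capture the text between each pair of double asterisks.
    preg_match_all('/\*\*(.+?)\*\*/', $text, $matches);
    // Lowercase the captures so Technology and technology count together.
    $counts = array_count_values(array_map('strtolower', $matches[1]));
    // Keep only the words marked more than once and return them as a list.
    return array_keys(array_filter($counts, function ($n) { return $n > 1; }));
}

$text = "How **technology** is changing **food**. **Technology** and **food** have an **impact**.";
$words = repeated_marked_words($text);
print_r($words); // technology, food
```

Unlike option 2, this returns each word once, lowercased and without the asterisks.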
bestprogrammerintheworld