1

I have used the following question php: sort and count instances of words in a given string

I have a table in my DB with a text field and want to do some analysis of the words in that field, but I need to combine the results

ID | Text Field
1  | happy beautiful happy lines pear gin happy lines rock happy lines pear 
2  | happy lines pear gin happy lines rock happy lines pear 

I now have an array that looks like this (but its per row)

Row 1

Array (
    [happy] => 4
    [beautiful] => 1
    [lines] => 3
    [pear] => 2
    [gin] => 1
    [rock] => 1 )

Row 2

Array (
    [happy] => 4
    [lines] => 3
    [pear] => 2
    [gin] => 1
    [rock] => 1 )

How can I do this for all the rows to combine the results - There are 30000 rows of text in the DB

Expected Results:

Array (
    [happy] => 8
    [beautiful] => 1
    [lines] => 6
    [pear] => 4
    [gin] => 2
    [rock] => 2 )
Community
  • 1
  • 1
Rob
  • 1,235
  • 2
  • 19
  • 44

5 Answers5

2

As you get each row from the DB, keep a running total

$total = array();
foreach($row as $word=>val){
    if(!isset($totals[$word])) $totals[$word] = 0;
    $totals[$word] += $val;
}
gen_Eric
  • 223,194
  • 41
  • 299
  • 337
2

I don't have your database on hand, so I'll demonstrate by stepping through an array:

[ghoti@pc ~]$ cat doit.php
#!/usr/local/bin/php
<?php

$a=array(
  '1' => "happy beautiful happy lines pear gin happy lines rock happy lines pear",
  '2' => "happy lines pear gin happy lines rock happy lines pear",
  '3' => "happy rock pear happy happy happy",
);

$wordlist=array();

foreach ($a as $index => $line) {
  foreach (explode(" ", $line) as $word) {
    $wordlist[$word]++;
  }
}

print_r($wordlist);

[ghoti@pc ~]$ ./doit.php
Array
(
    [happy] => 11
    [beautiful] => 1
    [lines] => 6
    [pear] => 5
    [gin] => 2
    [rock] => 3
)
[ghoti@pc ~]$ 

To make this go for your use case, replace the foreach() with a while loop that steps through your table:

$sql = "SELECT id,wordlist FROM yadda";
$result = db_query($sql);
while ($row = db_fetch_row($result)) {
  ...
}

I don't know what database server you're using, so I can't provide a specific example that I know will be applicable to you.

ghoti
  • 45,319
  • 8
  • 65
  • 104
1

I'd do this: create a new table, called words, pull each row out of the database, loop through it and explode the string and insert each word in the data, optionally storing data such as main table id so you can then get most additional stats about the context or the words, if your dealing with many rows and large datasets this may no be the best option

then you can use sql to build your counts etc

Eldarni
  • 425
  • 1
  • 3
  • 9
1

You can use SQL to count the number of words directly. Take a look at this previous answer:

Using SQL to determine word count stats of a text field

Community
  • 1
  • 1
lubar
  • 2,589
  • 2
  • 26
  • 28
1

PHP arrays can be used as maps. So all you have to do is get the data for each row, maintain a single array map of words as keys and their count as value. Every time you see that the key exists, just add to the count or else add new key with the corresponding count.

$grandtotal = array();
foreach($row as $key=>$val) {
 if(array_key_exists($key, $grandtotal)) {
  $grandtotal[$key] += $val;
 }
 else {
  $grandtotal[$key] = $val;
 }
}