3

I would really like for someone to take a little time and look over my code. I'm parsing some news content and I can insert the initial parse into my database which contains the news URL and the title. I'd like to expand it farther, to pass along each article link and parse the content of the article and include it in my database. The initial parsing works perfectly like this:

<?php
include_once ('connect_to_mysql.php');
include_once ('simple_html_dom.php');
$html = file_get_html('http://basket-planet.com/ru/');
$main = $html->find('div[class=mainBlock]', 0);
  $items = array();
  foreach ($main->find('a') as $m){
    $items[] = '("'.mysql_real_escape_string($m->plaintext).'",
                "'.mysql_real_escape_string($m->href).'")';
  }
$reverse = array_reverse($items);
mysql_query ("INSERT IGNORE INTO basket_news (article, link) VALUES 
             ".(implode(',', $reverse))."");
?>

As you can see, I'm using PHP Simple HTML DOM Parser. To expand, I'm trying to use mysqli statement where I can bind the parameters so all the html tags get inserted into my database. I've done this before with XML parsing. Problem is I don't know how to bind the array, and see whether my code is correct, if it will work this way... Here's the entire code:

<?php
$mysqli = new mysqli("localhost", "root", "", "test");
$mysqli->query("SET NAMES 'utf8'");
include_once ('simple_html_dom.php');
$html = file_get_html('http://basket-planet.com/ru/');
//find main news
$main = $html->find('div[class=mainBlock]', 0);
$items = array();
  foreach ($main->find('a') as $m){
    $h = file_get_html('http://www.basket-planet.com'.$m->href.'');
    $article = $h->find('div[class=newsItem]');
    //convert to string to be able to modify content
    $a = str_get_html(implode("\n", (array)$article));
      if(isset($a->find('img'))){
        foreach ($a->find('img') as $img){
          $img->outertext = '';}} //get rid of images
      if(isset($a->find('a'))){
        foreach ($a->find('a') as $link){
          $link->href = 'javascript:;';
          $link->target = '';}} //get rid of any javascript
      if(isset($a->find('iframe'))){
        foreach ($a->find ('iframe') as $frame){
          $frame->outertext = '';}} //get rid of iframes
     @$a->find('object', 0)->outertext = '';
     @$a->find('object', 1)->outertext = '';
     //modify some more to retrieve only text content
     //put entire content into a div (will if statements work here???)
     $text_content = '<div>'.$a.'<br>'.
       ($a->find('object', 0)->data > 0 ? '<a target="_blank" href="'.$a->find('object', 0)->data.'">Play Video</a>&nbsp;&nbsp;')
       ($a->find('object', 1)->data > 0 ? '<a target="_blank" href="'.$a->find('object', 1)->data.'">Play Video</a>&nbsp;&nbsp;')
       ($a->find('iframe[src*=youtube]', 0)->src > 0 ? '<a target="_blank" href="'.$a->find('iframe', 0)->src.'">Play Video</a>&nbsp;&nbsp;')
       //couple more checks to see if video links are present
    .'</div>';
$items[] = '("'.$m->plaintext.'","'.$m->href.'","'.$text_content.'")';
}
//reverse the array so the latest items have the last id
$reverse = array_reverse($items);
$stmt = $mysqli->prepare ("INSERT IGNORE INTO test_news (article, link, text_cont) VALUES (?,?,?)");
$stmt->bind_param ???; //(implode(',', $reverse));
$stmt->execute();
$stmt->close();
?>

So the logic is for every href of an article found, I'm passing it to parse the content and I'm trying to add it to the array. I probably have a ton of errors but I can't test it yet because I don't know how to bind it to see if it works. And I'm also not sure if I can do the if statements inside $text_content div...meaning to display "Play Video" if they exist. So please, if someone can take time to work on this with me I would really appreciate it.

UPDATE: changed the if statements to comparison operators in $text_content div.

Dharman
  • 30,962
  • 25
  • 85
  • 135
user2025469
  • 1,531
  • 2
  • 14
  • 26
  • it's so bad formatted..I don't see were `foreach ($main` is closed..please have a look at [PSR-2-coding-style](https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-2-coding-style-guide.md) – bitWorking Apr 10 '13 at 18:15
  • I closed it...Can you please look over the code? The foreach closes after everything gets input into $items array. – user2025469 Apr 10 '13 at 18:18

1 Answers1

5

This is exactly the scenario where mysqli is really awkward. To bind multiple params, you have to pass them all as a variable-length argument list to mysql->bind_param(), but the tricky part is that you have to bind them by reference. References in PHP can be pretty confusing.

Here's an rough example (though I have not tested this exact code):

$stmt = $mysqli->prepare("INSERT IGNORE INTO test_news 
    (article, link, text_cont) VALUES (?,?,?)");
foreach ($reverse as &$value) {
  $params[] = &$value;
}
array_unshift(str_repeat('s', count($params)));
call_user_func_array(array($stmt, 'bind_param'), $params);

I find it much easier to use PDO when I want to write a general-purpose function to bind parameters to SQL. No binding is necessary, just pass an array of values to the PDOStatement::execute() method.

$stmt = $pdo->prepare("INSERT IGNORE INTO test_news 
    (article, link, text_cont) VALUES (?,?,?)");
$stmt->execute($reverse);

Update: if you need $items to contain multiple rows of data, I'd do it this way:

First, when building $items, make it an array of arrays, instead of concatenating the values together:

foreach ($main->find('a') as $m){
    $items[] = array($m->plaintext, $m->href, $text_content);
}

Then prepare an INSERT statement that inserts one row, and loop over $items executing the prepared statement once for each tuple:

$stmt = $pdo->prepare("INSERT INTO test_news 
    (article, link, text_cont) VALUES (?,?,?)");
foreach ($items as $tuple) {
    $stmt->execute($tuple);
}

I don't know why you were using array_reverse() at all, and I don't know why you were using INSERT IGNORE, so I left those out.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Hey Bill, thanks for getting back to me. Right now I'm fixing a boat-load of errors from my code. I got rid of all the queries and just outputting the content, just to make sure the code runs. I think I got it working, more or less. So I'll try your suggestion as soon as I get everything right. I'm very new, as you can tell, to mysqli and especially to PDO. PDO will break the array into three parts and execute just the way you wrote??? That easy? – user2025469 Apr 10 '13 at 18:41
  • Would this work: $stmt->bind_param('sss', $reverse[0], $reverse[1], $reverse[2]); – user2025469 Apr 10 '13 at 18:43
  • 1
    what is not mentioned here is that `VALUES` has variable length like `(?,?,?), (?,?,?), (?,?,?)` etc. so the above wont work that easy – bitWorking Apr 10 '13 at 18:45
  • Thanks for pointing that out @redreggae, I have added more to my answer. – Bill Karwin Apr 10 '13 at 18:56
  • @user2025469, yes, that suggestion seems to work, passing each element of $reverse like that. I tested it with PHP 5.3.15. I still think PDO is easier. :-) – Bill Karwin Apr 10 '13 at 19:01
  • @BillKarwin by the way..I love your book `SQL Antipatterns` especially the Closure Tables.. ;) – bitWorking Apr 10 '13 at 19:04
  • @redreggae, Thanks! I'm glad you liked my book. Someday I will have time to write another book. – Bill Karwin Apr 10 '13 at 19:07
  • @BillKarwin the reason why I was reversing is when the data gets inserted into the database, the latest article will be last. Then I can SELECT ORDER BY id DESC to get them in the right order. Also INSERT IGNORE because 'link' is unique so there will be no duplicates – user2025469 Apr 10 '13 at 19:17
  • 1
    Thanks @BillKarwin, finally got to it today and got it to work. Had to tread in uncharted waters (PDO) but found examples and got it to work. Works great! – user2025469 Apr 11 '13 at 18:35
  • Great job! I'm sure you'll be happy with PDO in the future. – Bill Karwin Apr 11 '13 at 18:45