I am trying to build a web scraper in PHP using DOMDocument. The project has to be written in native PHP, so cURL is not an option. I looked at regular expressions, but DOMDocument seems like a better fit.

However, I cannot get it to output anything, and I am not sure why. Am I calling something incorrectly?

<?php
class WebScraper{
private $url = 'http://todaysinfo.net/top-15-most-dangerous-airports/?utm_source=outbrain_airports&utm_campaign=outbrain_airports';
private $elements = array('title', 'p', 'img');
private $scraper_doc = null;

public function __construct($url){
    if($url){
        $this->url = $url;
        $this->scrapeData();
            if($this->scraper_doc){
                $this->parseData();
                $this->outPut();
        } else {
            echo '<p style="color: red;">Something happened with DOMDocument."';
        }
    }
}
function scrapeData(){
    $urlContents = @file_get_contents($this->$url);
    if($urlContents){
        $this->scraper_doc = new DOMDocument();
        libxml_use_internal_errors(TRUE);
        $this->scraper_doc->loadHTML($urlContents);
    } else {
        echo '<p style="color: red;">Didn\'t grab all of the contents."';
    }
}
function parseData(){   
    foreach($this->$elements as $element){
        $scraper_row = $this->scraper_doc->getElementsByTagName($element);
        foreach($scraper_row as $row){
            if($element == 'img'){
                echo $row->getAttribute('src') . "<br />";
            } else { 
                echo $row->nodeValue . "<br />";
            }
        }
    }
}
}
?>
Nick
  • Stop using `@` to suppress errors. It's the (childish) programming equivalent of stuffing your fingers in your ears and going "lalalalala can't hear you". – Marc B Feb 09 '16 at 16:54
  • `@file_get_contents($this->$url);` -> `file_get_contents( $this->url );`. – Kenney Feb 09 '16 at 16:55

2 Answers

This:

$urlContents = @file_get_contents($this->$url);
                                         ^

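You're not accessing the $url property declared in the class. $this->$url is a variable property: PHP first evaluates the local variable $url, which is undefined (null) inside scrapeData(), and then looks up a property with that empty name. The property you want is $this->url.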

php > $x = new StdClass();
php > $x->foo = 'foo';
php > var_dump($x->foo);
string(3) "foo"
php > var_dump($x->$foo);
PHP Notice:  Undefined variable: foo in php shell code on line 1
PHP Fatal error:  Cannot access empty property in php shell code on line 1
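
To make the difference concrete, here is a minimal sketch. The Demo class and its method names are hypothetical and not part of the question's code. It shows that `$this->url` reads the property named `url`, while `$this->$name` reads whichever property the local variable `$name` happens to name.

```php
<?php
// Hypothetical class, used only to illustrate the two access forms.
class Demo
{
    private $url = 'http://example.com/';

    public function fixedName()
    {
        // Reads the property literally named "url".
        return $this->url;
    }

    public function variableName($name)
    {
        // Reads the property whose name is the *value* of $name.
        return $this->$name;
    }
}

$demo = new Demo();
echo $demo->fixedName(), "\n";          // looks up "url" directly
echo $demo->variableName('url'), "\n";  // same property, looked up by name
```

In scrapeData() there is no local $url at all, so the second form has no property name to look up.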
Marc B

I think it will help if you change this line:

$urlContents = @file_get_contents($this->$url);

To this line:

$urlContents = @file_get_contents($this->url);

And change this line:

foreach($this->$elements as $element){

To this line:

foreach($this->elements as $element){

Then, if I run your code like this, for example, I get a result:

$webScraper = new WebScraper(null);
$webScraper->scrapeData();
$webScraper->parseData();

You can also check whether file_get_contents is working. If it is not, maybe this page can be helpful to you.
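
For comparison, here is a rough sketch of the parsing step with both property fixes applied, written as a standalone function so it can run without network access. The helper name extractElements() and the inline sample markup are assumptions made for illustration; they are not from the question. Note also that the posted constructor calls $this->outPut(), which is not defined in the class as shown, so this sketch simply returns the values instead.

```php
<?php
// Sketch only: extractElements() is a hypothetical helper, not part of the original class.
function extractElements($html, array $tags = array('title', 'p', 'img'))
{
    $doc = new DOMDocument();
    // Collect libxml warnings about malformed HTML instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = array();
    foreach ($tags as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            // Images carry their data in the src attribute; other tags in their text.
            $found[] = ($tag === 'img') ? $node->getAttribute('src') : trim($node->nodeValue);
        }
    }
    return $found;
}

// Inline sample markup (made up) so the example runs without fetching a page.
$sample = '<html><head><title>Airports</title></head>'
        . '<body><p>First paragraph.</p><img src="a.jpg"></body></html>';

$result = extractElements($sample);
foreach ($result as $value) {
    echo $value, "\n";
}
```

To use it on a live page, pass in the string returned by file_get_contents($url) after checking that the call did not return false. Leaving off the `@` makes any fetch warning visible.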

The fourth bird
  • Thanks for your response. I made the changes above and am still not getting any output. I have another, much simpler program that uses file_get_contents, so I know the function works on my computer. I do not see where it could be called incorrectly in this code, but I am open to suggestions. – Nick Feb 10 '16 at 14:42