
I am trying to make a simple web crawler with PHP and I am having issues getting the HTML source of a given URL. I am currently using cURL to get the source.

My code:

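    <?php
    $url = "http://www.nytimes.com/";

    // Fetch the HTML source of $Url with cURL and return it as a string.
    function url_get_contents($Url) {
        if (!function_exists('curl_init')) {
            die('CURL is not installed!');
        }
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $Url);
        // Return the response body instead of printing it directly.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $output = curl_exec($ch);
        // curl_exec() returns false on failure; report the cURL error.
        if ($output === false) { die(curl_error($ch)); }
        curl_close($ch);
        return $output;
    }

    echo url_get_contents($url);
    ?>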

Right now nothing gets echoed and there are no errors, so it is a bit of a mystery. Any suggestions or fixes would be appreciated.

Edit: I added

    if ($output === false) { die(curl_error($ch)); }

to the middle of the function and it ended up giving me an error (finally!):

    Could not resolve host: www.nytimes.com

I still do not really know what the problem is. Any ideas?

Thanks

Matt Carey

2 Answers


It turns out that this was not a cURL problem.

My server (an Ubuntu VM) was using a "host-only" network adapter, which blocked access to any IP or domain outside its host machine and made it impossible for cURL to connect to external URLs.

Once I changed it to a "bridged" network adapter, I had access to the outside world.
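For anyone hitting the same "Could not resolve host" error, it may help to check name resolution separately from cURL. Below is a minimal sketch, not part of my original code: `host_resolves()` and its optional `$resolver` parameter are hypothetical names I made up for illustration, and it only checks IPv4 resolution through PHP's built-in `gethostbyname()`.

```php
<?php
declare(strict_types=1);

/**
 * Returns true if $host is an IP literal or resolves to an IPv4 address.
 *
 * $resolver is a hypothetical injection point for testing; by default the
 * built-in gethostbyname() is used.
 */
function host_resolves(string $host, ?callable $resolver = null): bool
{
    // An IP literal needs no DNS lookup at all.
    if (filter_var($host, FILTER_VALIDATE_IP) !== false) {
        return true;
    }

    $resolver = $resolver ?? 'gethostbyname';
    $result = $resolver($host);

    // gethostbyname() returns the unmodified hostname when the lookup fails.
    return $result !== $host
        && filter_var($result, FILTER_VALIDATE_IP) !== false;
}
```

Calling `host_resolves('www.nytimes.com')` before `curl_exec()` would, under these assumptions, tell you whether the failure is in the VM's network or DNS setup rather than in the cURL code itself.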

Hope this helps.

Matt Carey

Variable case mismatch (`$url` vs. `$Url`). Change:

    function url_get_contents($Url) {

to

    function url_get_contents($url) {
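As a rough illustration of why case matters here: PHP variable names are case-sensitive, and a function only sees its own parameters and locals. The sketch below uses a hypothetical helper, `describe_scope()`, that is not from the question; it reports which of the two spellings is actually defined inside a function whose parameter is named `$Url`.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration only.
function describe_scope(string $Url): array
{
    return [
        // The parameter itself is defined inside the function.
        'Url' => isset($Url),
        // Lowercase $url is a different name and is not defined in this scope,
        // even if a global $url exists outside the function.
        'url' => isset($url),
    ];
}

$url = "http://www.nytimes.com/";
$scope = describe_scope($url);
```

So a function body that reads `$url` while the parameter is declared as `$Url` would be reading an undefined variable. If the body consistently uses the parameter's own spelling, the differing name outside the function is harmless.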
Asaph
  • The two variables are used in different contexts, inside and outside the function. Plus, the edited question shows that the URL is correctly read. – Alvaro Flaño Larrondo Jun 25 '15 at 22:56
  • @AlvaroFlañoLarrondo This answer was posted prior to the question edit, at a time when the variable names *did not align within the function*. I was keenly aware that there are two variables in two different contexts. – Asaph Jun 25 '15 at 23:44