Below is my code. I am trying to scrape the following URL, but the HTML source is not returned at all. Why does scraping fail for this URL?

I also tried file_get_contents() and the Simple HTML DOM library, but neither returned the page.

URL: http://www.zazzle.com/protoceratops_t_shirt-235065458404753105

function get_data($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

echo get_data('http://www.zazzle.com/protoceratops_t_shirt-235065458404753105');
Michael Armes
Hamza

1 Answer

You could try this:

function get_data($url) {
    $ch = curl_init();
    $timeout = 5;

    try {
        // curl_init() returns false when the handle cannot be created
        if (false === $ch) {
            throw new Exception('failed to initialize');
        }

        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);

        $content = curl_exec($ch);

        // curl_exec() returns false on failure; surface the cURL error
        if (false === $content) {
            throw new Exception(curl_error($ch), curl_errno($ch));
        }

        // ...process $content now
        return $content;
    } catch (Exception $e) {
        // Report the failure as a warning so the caller can still check for false
        trigger_error(sprintf(
            'Curl failed with error #%d: %s',
            $e->getCode(), $e->getMessage()),
            E_USER_WARNING);
        return false;
    }
}

echo get_data('http://www.zazzle.com/protoceratops_t_shirt-235065458404753105');

This also reports the cURL error number and message if the request fails.

Credit goes to the answers on the question "curl_exec() always returns false".
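
If no cURL error is reported but the body is still empty, it may help to look at the HTTP status code as well. The sketch below is only one way to do that. It assumes, without confirmation from this thread, that the site might redirect or might reject requests that carry no User-Agent header. The function names build_fetch_options() and fetch_page() and the User-Agent string are hypothetical placeholders, not anything from the original code.

```php
<?php
// Hypothetical helper: collects the cURL options for one page fetch.
function build_fetch_options($url, $timeout = 5) {
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_FOLLOWLOCATION => true,      // follow 3xx redirects (assumed to be relevant)
        CURLOPT_MAXREDIRS      => 5,         // give up after five redirects
        // Placeholder User-Agent; whether the site requires one is an assumption.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ];
}

// Hypothetical helper: fetches a page and returns body, HTTP status and cURL error.
function fetch_page($url, $timeout = 5) {
    $ch = curl_init();
    if ($ch === false) {
        throw new RuntimeException('Failed to initialize cURL');
    }

    curl_setopt_array($ch, build_fetch_options($url, $timeout));
    $body = curl_exec($ch);

    // Gather diagnostics while the handle is still available.
    return [
        'body'   => $body,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'error'  => ($body === false) ? curl_error($ch) : null,
    ];
}
```

Calling fetch_page() with the URL from the question and printing the 'status' and 'error' entries should show whether the request failed at the connection level or whether the server answered with a redirect or an error status.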

Community
Margus Kevin