26

How can I get the html source code of http://www.example-webpage.com/file.html without using file_get_contents()?

I need to know this because on some webhosts allow_url_fopen is disabled so you can't use file_get_contents(). Is it possible to get the html file's source with cURL (if cURL support is enabled)? If so, how? Thanks.

Czar Pino
John Paneth

4 Answers

44

Try the following:

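// Initialize a cURL handle for the target URL.
$ch = curl_init("http://www.example-webpage.com/file.html");
// Return the response from curl_exec() as a string instead of printing it.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Has had no effect since PHP 5.1.3; only relevant on very old installations.
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
$content = curl_exec($ch); // false on failure
curl_close($ch);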

I would only recommend this for small files. Large files are read into memory in full and can exceed PHP's memory limit.
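
One way to guard against the memory problem is to cap how much data is accepted. This is a minimal sketch, not part of the original answer: the function name fetch_with_limit and the $maxBytes parameter are hypothetical. It relies on CURLOPT_WRITEFUNCTION, whose callback has to return the number of bytes it handled; returning any other value makes cURL abort the transfer.

```php
<?php
// Hypothetical helper: fetch $url, but give up once the body exceeds $maxBytes.
function fetch_with_limit(string $url, int $maxBytes): ?string
{
    $body = '';
    $ch = curl_init($url);
    // cURL passes each received chunk to this callback.
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, string $chunk) use (&$body, $maxBytes): int {
        if (strlen($body) + strlen($chunk) > $maxBytes) {
            return -1; // a value other than strlen($chunk) aborts the transfer
        }
        $body .= $chunk;
        return strlen($chunk);
    });
    $ok = curl_exec($ch); // true on success, false if failed or aborted
    curl_close($ch);
    return $ok === false ? null : $body;
}
```

A call such as fetch_with_limit($url, 1048576) would return null instead of exhausting memory when the response is larger than roughly 1 MB.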


EDIT: After some discussion in the comments, we found that the server could not resolve the host name and that the page was also an HTTPS resource. Here is a temporary solution until your server admin fixes name resolution.

I pinged graph.facebook.com to find its IP address, replaced the host name in the URL with that IP address, and set the Host header manually. This, however, makes the SSL certificate check fail, so peer verification has to be suppressed.

// Original URL, which the server could not resolve:
//$url = "https://graph.facebook.com/19165649929?fields=name";
$url = "https://66.220.146.224/19165649929?fields=name";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
// Insecure: skips certificate verification because the URL uses the IP address.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// Send the real host name so the server selects the right site.
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: graph.facebook.com'));
$output = curl_exec($ch);
curl_close($ch);
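
A possible alternative to rewriting the URL, which the original answer did not use, is cURL's CURLOPT_RESOLVE option (PHP 5.5 or later, built against a recent enough libcurl). It pins a host name to an IP address for one handle. Because the URL keeps the real host name, certificate verification can stay enabled. The following is only a sketch: the function names pinned_host_options and fetch_pinned are hypothetical, and the IP address would be the one found by pinging the host, which may well be outdated.

```php
<?php
// Hypothetical helper: build cURL options that pin $host:$port to $ip.
function pinned_host_options(string $host, int $port, string $ip): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        // Each entry has the form "host:port:address".
        CURLOPT_RESOLVE => [sprintf('%s:%d:%s', $host, $port, $ip)],
    ];
}

// Hypothetical helper: fetch $url while bypassing DNS for the pinned host.
function fetch_pinned(string $url, string $host, int $port, string $ip)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, pinned_host_options($host, $port, $ip));
    $output = curl_exec($ch); // response body, or false on failure
    curl_close($ch);
    return $output;
}
```

With the values from this answer, the call would be fetch_pinned('https://graph.facebook.com/19165649929?fields=name', 'graph.facebook.com', 443, '66.220.146.224').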

Keep in mind that the IP address might change, which makes this approach fragile. You should also add error handling using curl_error().
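
The error handling could look roughly like this. It is a sketch under assumptions: the function name fetch_or_error, its return shape, and the default timeout are illustrative choices, not something from the answer. The key point is that curl_errno() and curl_error() have to be read before the handle is closed.

```php
<?php
// Hypothetical helper: returns [body, error]; exactly one of the two is null.
function fetch_or_error(string $url, int $timeoutSeconds = 10): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
    $body = curl_exec($ch);
    if ($body === false) {
        // Read the error details while the handle is still open.
        $error = sprintf('cURL error %d: %s', curl_errno($ch), curl_error($ch));
        curl_close($ch);
        return [null, $error];
    }
    curl_close($ch);
    return [$body, null];
}
```

A caller would destructure the result with [$body, $error] = fetch_or_error($url); and check whether $error is null before using $body.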

ashleedawg
The Surrican
  • John, if this isn't working then check your URL. Also, don't forget the curl_close($ch) at the end. – Brad Aug 28 '10 at 20:29
  • Does it work with a plain text file instead of a html file? I tested it with a plain text file - and I get a blank page. – John Paneth Aug 28 '10 at 20:41
  • You're right, closing cURL is not a bad idea. I'll investigate the use case with the text file. Maybe you have a URL for me (there's practically no difference, but there may be another error)? – The Surrican Aug 28 '10 at 21:02
  • Okay, downloading http://www.facebook.com/robots.txt worked fine. Can you give me the URL that doesn't work? – The Surrican Aug 28 '10 at 21:05
  • Try this, please: https://graph.facebook.com/19165649929?fields=name That does not work for me. Obviously it's also accessible via "http". – John Paneth Aug 28 '10 at 22:33
  • It's HTTPS, not HTTP. It works here with the example above, but the SSL settings may be version specific! Please try var_dump(curl_error($ch)); before curl_close() and tell me what it outputs. – The Surrican Aug 28 '10 at 22:46
  • So your server can't resolve the IP address. You should contact your server administrator, who should set up correct DNS resolution. There's nothing wrong with your code. The only solution I know of without correcting this server issue is to get the data directly from the IP address and send the Host header, but you will have to deal with SSL warnings. By the way, it would be nice if you upvoted this :) – The Surrican Aug 28 '10 at 23:04
  • Don't give up just yet; I modified the answer and posted a temporary solution for you! – The Surrican Aug 28 '10 at 23:20
  • `CURLOPT_BINARYTRANSFER` is no longer required (since version 5.1.3, which I really hope no one is using). – Leo Galleguillos Dec 12 '18 at 13:53
3
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($curl);
curl_close($curl);

Source: http://www.christianschenk.org/blog/php-curl-allow-url-fopen/

Brad
3

Try http://php.net/manual/en/curl.examples-basic.php :)

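<?php

$ch = curl_init("http://www.example.com/");
$fp = fopen("example_homepage.txt", "w");

// Write the response body straight to the file instead of returning it.
curl_setopt($ch, CURLOPT_FILE, $fp);
// Do not include the response headers in the output.
curl_setopt($ch, CURLOPT_HEADER, 0);

// With CURLOPT_FILE set, curl_exec() returns true or false, not the body.
$output = curl_exec($ch);
curl_close($ch);
fclose($fp);
?>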

As the documentation says:

The basic idea behind the cURL functions is that you initialize a cURL session using the curl_init(), then you can set all your options for the transfer via the curl_setopt(), then you can execute the session with the curl_exec() and then you finish off your session using the curl_close().
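
Tying that lifecycle back to the question, here is a minimal sketch of a helper that uses file_get_contents() when allow_url_fopen is enabled and falls back to cURL otherwise. The function name fetch_source is hypothetical, and returning false when neither mechanism is available is just one possible convention.

```php
<?php
// Hypothetical helper: returns the page source as a string, or false on failure.
function fetch_source(string $url)
{
    // allow_url_fopen controls whether file_get_contents() may open URLs.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        return file_get_contents($url);
    }
    if (function_exists('curl_init')) {
        $ch = curl_init($url);                          // initialize the session
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // set options
        $content = curl_exec($ch);                      // execute
        curl_close($ch);                                // finish the session
        return $content;
    }
    return false; // neither mechanism is available
}
```

Checking function_exists('curl_init') keeps the script from failing with a fatal error on hosts where the cURL extension is not loaded.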

phidah
1

I found a tool on GitHub that might be a solution to this question (https://incarnate.github.io/curl-to-php/). I hope it is useful.

Ahmet Sina Ustem