Download HTML of any website and echo it in PHP

Question

I edited this post because I think I explained my problem badly:

I want my PHP page to download the HTML code of any page. It works for every site except one, "http://www.lolnexus.com/". This is the code I wrote:

$nexus= file_get_contents('http://www.lolnexus.com/');

$myFile = "nexus.txt";
$fh = fopen($myFile, 'w') or die("can't open file");
fwrite($fh, $nexus);
fclose($fh);

And this is the result:

<html><head><title>Object moved</title></head><body>

Object moved to here.

If I change the URL to another web page, it works fine.

Thanks for reading.

I suspect you are getting a redirect screen, but you've not asked cURL to follow redirects. See [this answer](http://stackoverflow.com/questions/3519939/make-curl-follow-redirects). — halfer, Nov 17 '13 at 13:20
Thanks for replying :) http://www.lolnexus.com/ is just an example; it would be nice to have all of its HTML in a text file or in a variable. – Vítor Campeã, Nov 17 '13 at 13:45

score 5 · Answer 1 · answered Nov 17 '13 at 13:20

You must use the option CURLOPT_FOLLOWLOCATION:

curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);

This lets cURL follow any redirects issued by the web server.

Olaf Dietsche


Thanks for replying. I just want to get the HTML code of that page. I don't know what that line does, but I don't want to be redirected to that page. I also got an error: Warning: curl_setopt() [function.curl-setopt]: CURLOPT_FOLLOWLOCATION cannot be activated when safe_mode is enabled or an open_basedir is set in /home/a8466706/public_html/index.php on line 11 – Vítor Campeã Nov 17 '13 at 13:47
Please see this question http://stackoverflow.com/q/6918623/1741542 – Olaf Dietsche Nov 17 '13 at 14:50
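For hosts where CURLOPT_FOLLOWLOCATION cannot be enabled, as in the warning quoted above, one workaround is to follow the redirects by hand. The following is only a sketch and not necessarily what the linked question recommends. The function names fetch_once and fetch_following_redirects are made up for illustration, the code assumes PHP 7.1 or later with the cURL extension, and it relies on CURLINFO_REDIRECT_URL, which reports the target of a redirect that cURL did not follow itself.

```php
<?php
/**
 * Hypothetical helper: perform one request without following redirects.
 * Returns [body, nextUrl]; nextUrl is null when the response is not a redirect.
 */
function fetch_once(string $url): array
{
    $ch = curl_init($url);
    // Return the response as a string instead of echoing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // CURLOPT_FOLLOWLOCATION is deliberately left at its default (off)
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    // Target of a redirect response that was not followed, if there was one
    $next = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
    return [$body, $next ? $next : null];
}

/**
 * Hypothetical helper: keep requesting until a non-redirect response arrives.
 * $fetchOnce can be swapped out, which makes the loop easy to try offline.
 */
function fetch_following_redirects(string $url, int $maxRedirects = 5, ?callable $fetchOnce = null): string
{
    $fetchOnce = $fetchOnce ?? 'fetch_once';
    for ($hops = 0; $hops <= $maxRedirects; $hops++) {
        [$body, $next] = $fetchOnce($url);
        if ($next === null) {
            return $body; // final page reached
        }
        $url = $next; // request the redirect target on the next pass
    }
    throw new RuntimeException("Gave up after $maxRedirects redirects");
}
```

Calling fetch_following_redirects('http://www.lolnexus.com/') should then return the HTML of the final page as a string, provided the host allows outgoing cURL requests at all. The redirect limit is there so that a redirect loop cannot hang the script.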

Bunny Huggan · Accepted Answer · 2013-11-17T19:34:16.210

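Not sure if this will help, but if you do not need to POST any data to the page you are requesting the HTML from, you could also use the file_get_contents function. Note that the URL must include the scheme (http://), otherwise PHP looks for a local file with that name.

$html = file_get_contents('http://www.google.com');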

Then the HTML for that site will be stored as text in the $html variable.

Addendum:

I have tested the following code and both methods are working:

<?php
## Define url to grab
$url = 'http://www.lolnexus.com';

## Method 1: simple file_get_contents
$html = file_get_contents($url);

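## Method 2: using cURL
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
## Keep response headers out of the output so that only the HTML is saved
curl_setopt($ch, CURLOPT_HEADER, false);
## Follow any redirects issued by the server
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
## Return the response as a string instead of echoing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($ch);
if ($html === false) {
    die('cURL error: ' . curl_error($ch));
}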

## Write output to file.
$fh = fopen('html_output.txt', 'w') or die("can't open file");
fwrite($fh, $html);
fclose($fh);

## Output something to the page.
echo 'Done!';
?>

Note that when using the cURL method, omitting CURLOPT_FOLLOWLOCATION or setting it to FALSE results in getting the redirect message. Also, comment out method 1 or method 2 depending on which one you want to test.

edited Nov 17 '13 at 19:34

answered Nov 17 '13 at 14:11

Bunny Huggan


Well, I tried that too. Code: $html = file_get_contents('http://www.lolnexus.com/'); echo $html; Output: "Object moved to here." – Vítor Campeã Nov 17 '13 at 14:18
It returns this: "Object moved. Object moved to here." – Vítor Campeã Nov 17 '13 at 14:36
cURL may be the better option for you then, with the options set as described by other users to instruct cURL to follow any redirects. Interestingly though, when I do $html = file_get_contents('http://www.lolnexus.com'); I can write the page source to file with no problems and interrogate it. Is this one of the sites where you are having issues with redirects? – Bunny Huggan Nov 17 '13 at 17:43
Yes it is, only on this website. – Vítor Campeã Nov 17 '13 at 18:53
It's odd that I am not having the same problems. However, the problem is that the page is redirecting you and for whatever reason your scripts are not following the redirect (cURL should handle this with no problems). Could you not simply work out where the page is redirecting you to and use that URL instead, to bypass the redirect? I don't know the full scope of your project, and this is not guaranteed to work, but if you are always querying the same page(s) it could be a very simple solution. – Bunny Huggan Nov 17 '13 at 19:16
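One way to work out where the page redirects to is to look at the Location headers of the response. The sketch below is only an illustration: last_location is a made-up helper name, and it assumes the headers arrive as a flat list of lines, which is the format PHP's get_headers() returns when called with just a URL.

```php
<?php
/**
 * Hypothetical helper: return the value of the last Location header found
 * in a list of raw header lines, or null if there is none.
 */
function last_location(array $headerLines): ?string
{
    $location = null;
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, hence the "i" modifier
        if (preg_match('/^Location:\s*(.+)$/i', $line, $m)) {
            $location = trim($m[1]);
        }
    }
    return $location;
}

// Example input shaped like the output of get_headers() for a redirected request
$sample = [
    'HTTP/1.1 302 Found',
    'Location: http://example.com/real-page',
    'HTTP/1.1 200 OK',
    'Content-Type: text/html',
];
echo last_location($sample), "\n";
```

On a machine where redirects are followed, passing the result of get_headers('http://www.lolnexus.com/') to last_location should give the URL to use instead. A Location value can also be a relative path, in which case it still has to be combined with the original host.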
Really? You have no problem? I am running it from a server... maybe that is the reason? That's stupid. – Vítor Campeã Nov 17 '13 at 19:45
You know what? I just installed XAMPP on my laptop to act as a server, and both my old code and your code work. The only difference is that I was running it from a hosted server before. What could the cause be, and how can I fix it? I don't want my laptop to be a server xD – Vítor Campeã Nov 17 '13 at 20:05
Is the server that you are having issues on a shared server run by some kind of hosting company, perhaps a reseller? They often have restrictions in place to mitigate security issues, which could be the problem you are having. Unfortunately, if this is the problem there might not be a lot you can do besides purchasing a dedicated server or asking nicely whether they can change the php.ini file (this is doubtful). Anyway, going off topic now. You said that putting in the destination URL solved the problem, so great! :) – Bunny Huggan Nov 17 '13 at 20:48
Yes, it is a free hosting service. Dang... I need to find a new server. Thanks for the help. By the way, lolnexus.com shows a "loading" state for 1-2 seconds, and if I fetch the HTML, the info is not there yet; it only appears after loading ends. Is there any way to wait until loading ends and then fetch the data I want? – Vítor Campeã Nov 17 '13 at 21:51
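To check whether the hosting restrictions discussed in the comments above apply on a given server, the relevant php.ini settings can be read at runtime. This is only a rough sketch: describe_fetch_restrictions is a made-up helper, and the three settings it checks (open_basedir, safe_mode, allow_url_fopen) are the ones named in the warning quoted earlier plus the one that controls whether file_get_contents may open URLs. A host may impose other limits that this does not detect.

```php
<?php
/**
 * Hypothetical helper: turn a few php.ini values into human-readable notes.
 * $ini maps setting names to the values returned by ini_get().
 */
function describe_fetch_restrictions(array $ini): array
{
    $notes = [];
    if ((string) ($ini['open_basedir'] ?? '') !== '') {
        $notes[] = 'open_basedir is set, so CURLOPT_FOLLOWLOCATION may be refused';
    }
    // safe_mode was removed in PHP 5.4; ini_get() then returns false
    if (filter_var($ini['safe_mode'] ?? false, FILTER_VALIDATE_BOOLEAN)) {
        $notes[] = 'safe_mode is on, so CURLOPT_FOLLOWLOCATION may be refused';
    }
    if (!filter_var($ini['allow_url_fopen'] ?? false, FILTER_VALIDATE_BOOLEAN)) {
        $notes[] = 'allow_url_fopen is off, so file_get_contents() cannot open URLs';
    }
    return $notes;
}

// Report the settings of the server this script runs on
$current = [
    'open_basedir'    => ini_get('open_basedir'),
    'safe_mode'       => ini_get('safe_mode'),
    'allow_url_fopen' => ini_get('allow_url_fopen'),
];
foreach (describe_fetch_restrictions($current) as $note) {
    echo $note, "\n";
}
```

If nothing is printed, none of these three restrictions is active and the cause is probably somewhere else.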

score 0 · Answer 3 · answered Nov 17 '13 at 14:16

To have curl_exec return the output instead of echoing it, add this option:

curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);

matpop

