
I am trying to parse the home page of a site, but it can only be reached through a redirect from another page, so all I can fetch is the HTML of the redirecting page.

How can I get the HTML of the "redirected to" page?

For example: I can fetch a.html, which redirects me to b.html when I open it in a browser. I want to parse b.html, but opening b.html directly does not work, because it requires POST parameters that a.html sends to b.html during the redirect.

Edit: for the record, the "redirected to" page has a relative path, so I do the following:

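$needle = 'window.location = "';
$pos = strpos($result, $needle);
// Insert the host right after the opening quote so that "/home" becomes an absolute URL.
$res = ($pos === false) ? $result : substr_replace($result, "https://thecompletepath", $pos + strlen($needle), 0);
echo $res;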

The redirect is done with JavaScript code, as follows:

<script type="text/javascript" charset="utf-8">
    escapeIfModal();
    LoadingScreen.start();
    window.location = "/home";
</script>
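
A minimal sketch of the idea in the edit above: instead of patching the HTML, pull the redirect target out of the script and resolve it against the URL of the redirecting page. The function name `extractJsRedirect` and the `example.com` URL are hypothetical, and the pattern only covers a quoted string literal assigned to `window.location` (or `window.location.href`), as in the snippet shown. Every non-absolute target is treated as root-relative, which matches `/home` here but is an assumption in general.

```php
<?php
/**
 * Returns the URL assigned to window.location in $html, resolved against
 * $baseUrl, or null if no such assignment is found.
 * Hypothetical helper: it only understands a quoted string literal.
 */
function extractJsRedirect(string $html, string $baseUrl): ?string
{
    // Matches: window.location = "/home";  or  window.location.href = '/home';
    if (!preg_match('/window\.location(?:\.href)?\s*=\s*(["\'])(.*?)\1/', $html, $m)) {
        return null;
    }
    $target = $m[2];

    // Already absolute: use it as is.
    if (preg_match('#^https?://#i', $target)) {
        return $target;
    }

    // Assumption: anything else is root-relative, so prepend scheme and host.
    $parts = parse_url($baseUrl);
    $origin = $parts['scheme'] . '://' . $parts['host']
        . (isset($parts['port']) ? ':' . $parts['port'] : '');

    return $origin . '/' . ltrim($target, '/');
}
```

The returned URL can then be requested with cURL or file_get_contents like any other page.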
Mostafa Alayesh

1 Answer

You can use cURL to follow redirects as the browser would.

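$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/a.html");
curl_setopt($ch, CURLOPT_HEADER, true);          // include response headers in the output
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow HTTP "Location:" redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the response instead of printing it
$a = curl_exec($ch); // $a holds the response headers followed by the HTML of the final page (b.html)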

Or, using file_get_contents:

$context = stream_context_create(
    array(
        'http' => array(
            'follow_location' => true
        )
    )
);

$html = file_get_contents('http://www.example.com/a.html', false, $context);
Volkan Ulukut
  • The redirect is done through JavaScript code (I have updated the question), and this method didn't work. – Mostafa Alayesh Oct 10 '16 at 12:25
  • Then I would suggest you check exactly which parameters were sent to the "redirected to" page and imitate the request exactly using curl or file_get_contents. – Volkan Ulukut Oct 10 '16 at 12:27
  • So following the redirect is impossible without running the JS code, right? How can I check those parameters? – Mostafa Alayesh Oct 10 '16 at 12:30
  • PHP can't execute JS code. You can check which parameters were posted using a modern browser: in Chrome, press F12 and find b.html in the Network tab. Click it and you'll see the headers, where you can find the POST parameters. – Volkan Ulukut Oct 10 '16 at 12:31
  • By the way, if the redirect is done using window.location, there can't be any POST parameters in the request. – Volkan Ulukut Oct 10 '16 at 12:34
  • You are right, there are no POST parameters; instead it is a cookie set by the page. How can I get it and use it? Can you help me? – Mostafa Alayesh Oct 10 '16 at 12:39
  • Check the Cookies tab in the developer console and send it like this: http://stackoverflow.com/questions/3431160/php-send-cookie-with-file-get-contents – Volkan Ulukut Oct 10 '16 at 12:41
  • Thank you so much. I have also found this: http://stackoverflow.com/questions/1797510/file-get-contents-receive-cookies – Mostafa Alayesh Oct 10 '16 at 12:42
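
The cookie approach the comments arrive at could look roughly like the sketch below. It assumes the cookie reaches the client in a `Set-Cookie` response header; if the page sets it from JavaScript instead, it would have to be copied by hand from the browser's developer tools. The helper name `buildCookieHeader` and the `example.com` URLs are hypothetical, and cookie attributes such as `Path` or `Expires` are ignored.

```php
<?php
/**
 * Builds a "Cookie: name=value; ..." request header from the Set-Cookie
 * lines of a response header list (such as $http_response_header).
 * Hypothetical helper: cookie attributes (Path, Expires, ...) are ignored.
 */
function buildCookieHeader(array $responseHeaders): string
{
    $pairs = [];
    foreach ($responseHeaders as $line) {
        // Keep only the leading name=value part of each Set-Cookie line.
        if (preg_match('/^Set-Cookie:\s*([^=;\s]+)=([^;]*)/i', $line, $m)) {
            $pairs[$m[1]] = $m[1] . '=' . $m[2]; // a later value replaces an earlier one
        }
    }
    return $pairs ? 'Cookie: ' . implode('; ', $pairs) : '';
}

// Usage (placeholder URLs):
// $first   = file_get_contents('https://example.com/a.html');
// $cookie  = buildCookieHeader($http_response_header);
// $context = stream_context_create(['http' => ['header' => $cookie]]);
// $home    = file_get_contents('https://example.com/home', false, $context);
```

With cURL, setting `CURLOPT_COOKIEJAR` and `CURLOPT_COOKIEFILE` to the same file on both requests should achieve a similar effect without parsing headers by hand.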