
I've previously tried to log into a website with curl and then fetch the contents of another page, but that didn't work: the host website keeps redirecting me whenever I request the new page. This is the HTML response I get when trying to access a page after successfully logging in:

As you can see, they tried to redirect me instead of returning the content of the page I wanted to access.

If that can't work, I was wondering whether there is some way, with PHP or anything else, to get the content of a webpage once I'm already logged in through THEIR website. That is, not logging in with curl: once I have already logged in to their website in a separate tab, is there some easier (or harder, as long as it works) way to get the content of a page that is only visible when logged in? Ideally I would just rely on the fact that the user is already logged in to the other service and, based on that, fetch the new content.

Is there any way to do this?

EDIT:

I have searched Google a lot and tried this code:

    <?php

    ?>
    <html>
    <head>
    </head>
    <body><?php

    $loginUrl = 'https://www.chabadone.org/platform/login/login.asp';
    $remotePageUrl = 'http://www.chabadone.org/platform/sitecontrol/admin/#page=/platform/sitecontrol/admin/calendar/month.asp';
    // These are the POST data, including username and password
    $post_data = 'action=login&cookieexists=true&redirect=1&page=&partner=&email=something@gmail.com&password=something123&userid_to_cookie=1&saveID=yes';

    // Init curl
    $ch = curl_init();
    $USER_AGENT = $_SERVER['HTTP_USER_AGENT'];
    // Set the user agent, URL and referer to work with
    curl_setopt($ch, CURLOPT_USERAGENT, $USER_AGENT);
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_REFERER, $loginUrl);
    // Enable HTTP POST
    curl_setopt($ch, CURLOPT_POST, 1);

    // Set the POST parameters
    curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);

    // Handle cookies for the login
    curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');

    // Setting CURLOPT_RETURNTRANSFER to 1 makes curl_exec() return the
    // response as a string instead of printing it and returning true/false.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    // Execute the request (the login)
    $store = curl_exec($ch);

    // Set the URL to the protected file
    curl_setopt($ch, CURLOPT_USERAGENT, $USER_AGENT);
    curl_setopt($ch, CURLOPT_URL, $remotePageUrl);
    curl_setopt($ch, CURLOPT_POST, false);
    curl_setopt($ch, CURLOPT_POSTFIELDS, "");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // Execute the request
    $content = curl_exec($ch);
    file_put_contents('./download.html', $content);
    echo htmlentities($content);
    curl_close($ch);

    ?>
    <script>

    </script>
    </body>
    </html>

But it still doesn't work. The result still contains the redirect script:

    <!DOCTYPE html>
    <script language="JavaScript" type="text/JavaScript">
    <!--
    window.top.location = "/platform/sitecontrol/sitecontrol.asp?page=%2Fplatform%2Fsitecontrol%2Fadmin%2FDefault%2Easp%3F";
    // -->
    </script>

This is attempting to redirect me, but I need the actual content of the page!
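One thing worth checking here: the part of `$remotePageUrl` after `#` is a URL fragment, which HTTP clients (curl included) do not send to the server. The request therefore only reaches `/platform/sitecontrol/admin/`, and the `page=...` value is normally interpreted by JavaScript in the browser. A minimal sketch of pulling the page path out of the fragment and building a direct URL follows. The helper name `directUrlFromFragment` is hypothetical, and whether the site serves that page directly to a logged-in curl session is an assumption that needs testing.

```php
<?php
// Hypothetical helper (not part of the original code): turn a hash-routed URL
// such as ".../admin/#page=/path/to/page.asp" into the direct URL of that page.
// It ignores ports and credentials in the URL to keep the sketch short.
function directUrlFromFragment(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'], $parts['fragment'])) {
        return null; // no usable scheme/host/fragment
    }

    // The fragment is shaped like a query string: "page=/platform/...".
    parse_str($parts['fragment'], $fragmentParams);
    $page = $fragmentParams['page'] ?? null;
    if (!is_string($page) || $page === '' || $page[0] !== '/') {
        return null; // only absolute paths are handled in this sketch
    }

    return $parts['scheme'] . '://' . $parts['host'] . $page;
}
```

Called with the `$remotePageUrl` from the code above, this returns the URL of `month.asp` itself, which can then be requested instead of the fragment URL.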

  • Use a cookie jar for the follow-up requests. Please search Google for `php curl session`; it leads to an answer for this. – num8er Oct 07 '18 at 23:33
  • @num8er that doesn't address my issue, since I'm not receiving any errors; I only get redirected – B''H Bi'ezras -- Boruch Hashem Oct 07 '18 at 23:37
  • Then put your code in your question – num8er Oct 07 '18 at 23:39
  • Of course you cannot do that: as you can see, it uses `/#page=/platform/sitecontrol/admin/calendar/month.asp`, which means they use a frontend app. So the only way is to make the requests in a browser and trace the API URLs that the frontend app requests. Or you have to use a headless browser: http://phantomjs.org/ It works like a real browser and can simulate frontend actions – num8er Oct 07 '18 at 23:42
  • @num8er thanks, but I don't quite understand. Can you please explain what I have to do a little more simply? (BTW the page it's redirecting me to is the one I want to access, in case that wasn't obvious) – B''H Bi'ezras -- Boruch Hashem Oct 07 '18 at 23:44
  • SIMPLE ANSWER: you cannot achieve it with PHP; you need browser simulation. Simply use http://phantomjs.org – num8er Oct 07 '18 at 23:46
  • @num8er thanks, I'll look into this. I had never heard of it, but will it allow me to get the content of user-protected pages? (BTW I didn't notice your link in the first comment) – B''H Bi'ezras -- Boruch Hashem Oct 07 '18 at 23:47
  • Yes, it will. As I said, the page you're trying to get is rendered by a frontend application, which a browser can understand. PhantomJS is a headless browser, meaning it works as a browser but in the terminal. – num8er Oct 07 '18 at 23:49
  • BTW you'll have to get used to the Linux console and learn a bit about nodejs :D – num8er Oct 07 '18 at 23:52
  • @num8er thanks, I have used nodejs before and it's working now in the terminal, but I still don't know exactly how this can help me. I'm able to save a screenshot of the webpage, but how can I use this to 1) log in and 2) get the DOM content of the page? – B''H Bi'ezras -- Boruch Hashem Oct 08 '18 at 00:43
  • @num8er also, I don't fully understand the original problem: if I copy and paste the /platform etc. part to the beginning of the URL, it shows the same page anyway – B''H Bi'ezras -- Boruch Hashem Oct 08 '18 at 00:44
  • It seems you don't know nodejs well enough, and you don't know how to automate browser behavior using phantomjs (a headless browser). I'm telling you again: `the page you're trying to get and save is rendered by a frontend application (because navigation uses the #page attribute, JavaScript there handles the route and loads the necessary page dynamically). PHP with curl cannot simulate browser behavior and cannot render frontend code`. Sorry, but I cannot explain in more detail; I'm not a teacher who can guide you through the whole technology. – num8er Oct 08 '18 at 07:35
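
For reference, the cookie-jar suggestion from the first comment can be sketched as follows. This is only an illustration under stated assumptions, not a confirmed solution for this site. It assumes the protected page is served at its own URL (as the observation about pasting the `/platform/...` path into the address bar suggests), and the function names `loginOptions`, `pageOptions` and `fetchProtectedPage` are hypothetical. It also uses `CURLOPT_HTTPGET` for the second request, because setting `CURLOPT_POSTFIELDS`, even to an empty string, generally switches the handle back to POST.

```php
<?php
// Sketch only: the CURLOPT_* constants are real cURL options, but the helper
// functions and the way they are combined are assumptions for illustration.

// Options for the login request; cookies are kept in $cookieFile.
function loginOptions(string $loginUrl, array $fields, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields), // URL-encodes values such as "@"
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here when the handle is freed
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are read from here on later runs
        CURLOPT_FOLLOWLOCATION => true,        // follow HTTP redirects (not JavaScript ones)
        CURLOPT_RETURNTRANSFER => true,
    ];
}

// Options for the follow-up request on the same handle.
function pageOptions(string $pageUrl): array
{
    return [
        CURLOPT_URL     => $pageUrl,
        CURLOPT_HTTPGET => true, // switch the handle back from POST to GET
    ];
}

// Log in, then fetch the page with the same handle so the session cookies are reused.
// Returns the page body as a string, or false if the request failed.
function fetchProtectedPage(string $loginUrl, array $fields, string $pageUrl, string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, loginOptions($loginUrl, $fields, $cookieFile));
    curl_exec($ch); // the login response itself is not needed here

    curl_setopt_array($ch, pageOptions($pageUrl));
    return curl_exec($ch); // the handle is released when $ch goes out of scope
}
```

If the response is still only the `window.top.location` script, the content is most likely assembled by JavaScript in the browser, and a headless browser as suggested above would be the remaining option.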
