I have been trying to solve this for the past 3 days and can't figure out why it happens. I'm trying to access the HTML of a site that requires login first.

I tried every way I could think of, and they all run into the same problem.

Here is what I tried:

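import requests

# Attempt 1: request the page directly (headers, params, and cookies are defined elsewhere)
response = requests.get('https://de-legalization.tlscontact.com/eg/CAI/myapp.php', headers=headers, params=params, cookies=cookies)
print(response.content)

# Attempt 2: post the login form through a session (login_url is defined elsewhere)
payload = {
    '_token': 'TOKEN HERE',
    'email': 'EMAIL HERE',
    'pwd': 'PASSWORDHERE',
    'client_token': 'CLIENT_TOKEN HERE'
}

with requests.Session() as s:
    r = s.post(login_url, data=payload)
    print(r.text)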

I also tried using urllib, but every attempt returns this:

<script>window.location="https://de-legalization.tlscontact.com/eg/CAI/index.php";</script>

Does anyone know why this is happening? Here is the URL of the page I want the HTML of: https://de-legalization.tlscontact.com/eg/CAI/myapp.php

  • All that page seems to do is redirect you to another page using javascript. Browsers will execute the js code and go to the page referenced by the code. – Shadow Jul 26 '21 at 07:41
  • That wouldn't matter, would it? Because if I write ```print(response.url)``` I get the URL I want – Mohamad Ahmed Jul 26 '21 at 07:48
  • I'm sorry, but I do not understand your comment! What url are you talking about? – Shadow Jul 26 '21 at 07:51
  • this url: https://de-legalization.tlscontact.com/eg/CAI/myapp.php?fg_id=29812 which is the one I want the html of – Mohamad Ahmed Jul 26 '21 at 08:12
  • You are already getting its content, which is a JavaScript redirect. If this is not what you are expecting, you need to understand how to log in to the website, or retrieve the authentication data (session and other cookies) from your web browser. – Shadow Jul 26 '21 at 08:18

1 Answer

You see this particular output because it is in fact the content of the page you are downloading.

You can test it in Chrome by opening the following URL:

view-source:https://de-legalization.tlscontact.com/eg/CAI/myapp.php

This is what it looks like in Chrome:

[Image: the page source as displayed in Chrome]

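This happens because the JavaScript code on the page redirects you. A browser executes that script and moves on to the target page, whereas an HTTP client such as requests only downloads the script itself.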
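To check from a script whether a response is only such a redirect stub, one option is to look for the `window.location` assignment in the body. The following is a minimal sketch using only the standard library; the helper name `find_js_redirect` is made up for this illustration, and the pattern only covers simple assignments like the one shown in the question.

```python
import re

# Matches simple assignments such as window.location="..." or
# window.location.href = '...'. Other redirect styles are not covered.
_JS_REDIRECT = re.compile(
    r"""window\.location(?:\.href)?\s*=\s*["']([^"']+)["']"""
)


def find_js_redirect(html):
    """Return the target URL of a simple JavaScript redirect, or None."""
    match = _JS_REDIRECT.search(html)
    return match.group(1) if match else None


# The response body quoted in the question.
body = (
    '<script>window.location='
    '"https://de-legalization.tlscontact.com/eg/CAI/index.php";</script>'
)
print(find_js_redirect(body))
```

If this returns a URL that points back to the index or login page, the request was most likely not authenticated.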

Since the page you are trying to access requires login, you cannot access it just by sending an HTTP request to the internal page.

You need to either extract all the cookies from a logged-in browser session and add them to the Python script, or use a tool like Selenium that lets you control a browser from your Python code.
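If you would rather log in from the script itself, note that the payload in the question contains `_token` and `client_token` fields. Such values are often issued per session as hidden form inputs, so hard-coded copies may be rejected. That is an assumption about this site, not something confirmed here. The sketch below shows how hidden inputs could be read from a login page with the standard library; the sample form and the helper name `extract_hidden_fields` are hypothetical.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of all hidden form fields found in html."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Hypothetical login form, for illustration only; the real page may differ.
sample_form = """
<form method="post" action="login.php">
  <input type="hidden" name="_token" value="abc123">
  <input type="text" name="email">
  <input type="password" name="pwd">
</form>
"""
print(extract_hidden_fields(sample_form))  # {'_token': 'abc123'}
```

In a real script, the login page would be fetched with the same `requests.Session` that later posts the form, and the extracted fields would be merged with the email and password before posting.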

Here you can find how to extract all the cookies from the browser session:

How to copy cookies in Google Chrome?
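Once the cookies are copied, they usually arrive as a single `Cookie` header string. A minimal sketch for turning that string into the dict that the `cookies=` argument expects follows; the cookie names and values are placeholders, and `cookie_header_to_dict` is a hypothetical helper name.

```python
from http.cookies import SimpleCookie


def cookie_header_to_dict(header):
    """Parse a 'name=value; name2=value2' string into a plain dict."""
    jar = SimpleCookie()
    jar.load(header)
    return {name: morsel.value for name, morsel in jar.items()}


# Placeholder header; paste the value copied from the browser instead.
copied = "PHPSESSID=abc123; lang=en"
cookies = cookie_header_to_dict(copied)
print(cookies)  # {'PHPSESSID': 'abc123', 'lang': 'en'}
```

`SimpleCookie` is strict about unusual characters, so it is worth checking that every cookie from the browser appears in the resulting dict.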

Here is how to add cookies to the HTTP request in Python:

import requests

cookies = {'enwiki_session': '17ab96bd8ffbe8ca58a78657a918558'}

r = requests.post('http://wikipedia.org', cookies=cookies)
fshabashev
  • Even when I use the cookies, I get pretty much the same error with ```r = requests.post('https://de-legalization.tlscontact.com/eg/CAI/myapp.php', cookies=cookies) print(r.text)```. I also can't use Selenium because I am checking for a change continuously, so I don't think Selenium is efficient. If there is any way to check for a change in the website, please tell me. Thanks – Mohamad Ahmed Jul 26 '21 at 08:07
  • @MohamadAhmed Selenium was designed for automating interactions with a web page. There is no general rule for how to interact with a website that requires login. You need to tailor your script to the requirements of the specific page. – Shadow Jul 26 '21 at 08:20
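
On checking for changes continuously: once an authenticated request returns the real page, one simple approach is to hash each downloaded body and compare it with the previous hash. This is only a sketch under that assumption; `ChangeDetector` is a hypothetical helper, and pages with dynamic content (timestamps, tokens) would need that content stripped before hashing.

```python
import hashlib


def fingerprint(html):
    """Return a SHA-256 hex digest of the page body."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


class ChangeDetector:
    """Remember the last fingerprint and report when it differs."""

    def __init__(self):
        self._last = None

    def has_changed(self, html):
        current = fingerprint(html)
        # The first observation only sets the baseline.
        changed = self._last is not None and current != self._last
        self._last = current
        return changed


detector = ChangeDetector()
print(detector.has_changed("<p>version 1</p>"))  # False (baseline)
print(detector.has_changed("<p>version 1</p>"))  # False (unchanged)
print(detector.has_changed("<p>version 2</p>"))  # True
```

Each call would receive `r.text` from a periodic request, with a pause between requests to avoid overloading the site.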