I am trying to scrape a site whose first page asks for a simple arithmetic captcha before letting you through to the main page. That part is solved. However, when I then request anything else, I get the following response:

<script>
window.location.reload();
</script>

I learned this method a while ago, but I am only now trying it for the first time:

import re
import requests


def login_tokyo(s):
    r = s.get('https://apcis.tmou.org/public/')
    str_number = re.findall("<span[^>]+(.*?)</span>", r.text)[0]
    numbers = re.findall('[0-9]+', str_number)
    captcha = int(numbers[0]) + int(numbers[1])
    payload = {'captcha': captcha}
    r = s.post('https://apcis.tmou.org/public/?action=login', data=payload)
    check_text = re.findall('<b>(.*?)</b>', r.text)[0]
    print(check_text)
    payload1 = {'Param': 0, 'Value': 5797164, 'imo': '', 'callsign': '', 'name': '', 'compimo': 5797164,
                'compname': '', 'From': '01.06.2020', 'Till': '31.08.2020', 'authority': 0, 'flag': 0, 'class': 0,
                'ro': 0, 'type': 0, 'result': 0, 'insptype': -1, 'sort1': 0, 'sort2': 'DESC', 'sort3': 0,
                'sort4': 'DESC'
                }
    r = s.post('https://apcis.tmou.org/public/?action=getcompanies', data=payload1)
    perf_tm = re.findall("<p class=[^>]+(.*?)</p>", r.text)
    print(r.text)
    print(perf_tm)

if __name__ == '__main__':
    with requests.Session() as s:
        login_tokyo(s)

print(check_text) tells me that I am on the main page, but after that I get nothing useful. From this specific request, I expected print(perf_tm) to give me Medium. I appreciate any help!
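As a debugging aid, the reload stub shown above can be detected programmatically, so the script fails loudly instead of running a regex over an empty page. This is only a minimal sketch: the helper name is_reload_stub and the sample strings are hypothetical, and the pattern assumes the server returns nothing but the script tag shown in the question.

```python
import re

# Matches a body that consists solely of the JavaScript reload stub.
RELOAD_STUB = re.compile(
    r"<script>\s*window\.location\.reload\(\);?\s*</script>",
    re.IGNORECASE,
)


def is_reload_stub(html):
    """Return True when the response body is only the reload stub."""
    return RELOAD_STUB.fullmatch(html.strip()) is not None


sample = "<script>\nwindow.location.reload();\n</script>"
print(is_reload_stub(sample))
print(is_reload_stub("<p class='x'>Medium</p>"))
```

Calling is_reload_stub(r.text) right after each s.post(...) would show which request the server refused.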

logi-kal
Eduardo

1 Answer

EDIT:

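Never mind, I was wrong: the session should already be handling all cookies. It seems the website rejects the request based on its User-Agent header (requests sends its own default, python-requests/<version>). Overriding that header on the session fixes it: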


def login_tokyo(s):
    # Replace the default python-requests User-Agent for every request made with this session
    header = {'User-Agent': ''}
    s.headers.update(header)
    r = s.get('https://apcis.tmou.org/public/')
    str_number = re.findall("<span[^>]+(.*?)</span>", r.text)[0]
    ...
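One way to check which User-Agent a session will actually send, without touching the network, is to prepare a request and inspect its headers. This is a minimal sketch assuming only the requests library: the helper names build_session and outgoing_user_agent are hypothetical, and Mozilla/5.0 is a placeholder value, not something the site is known to require.

```python
import requests


def build_session(user_agent=""):
    """Return a Session whose User-Agent replaces the python-requests default."""
    s = requests.Session()
    s.headers.update({"User-Agent": user_agent})
    return s


def outgoing_user_agent(session, url="https://apcis.tmou.org/public/"):
    """Prepare (but do not send) a GET and report the User-Agent it would carry."""
    prepared = session.prepare_request(requests.Request("GET", url))
    return prepared.headers.get("User-Agent")


# A fresh Session advertises itself as python-requests/<version>.
default_ua = outgoing_user_agent(requests.Session())
# After the override, the session-level header is used instead.
custom_ua = outgoing_user_agent(build_session("Mozilla/5.0"))
print(default_ua)
print(custom_ua)
```

Because the header is set on the session rather than on a single call, every later s.get and s.post in login_tokyo carries it.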

Original Answer:

You are not handling the PHPSESSID cookie. Many websites use it to track logins server-side. Try doing this:


def login_tokyo(s):
    # The HTTP request header is named 'Cookie' (singular), not 'Cookies'
    header = {
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.',
        'Cookie': 'PHPSESSID=xxxxxxxxxxxxxxxxxxxxxxxxxxxx',
    }
    s.headers.update(header)
    r = s.get('https://apcis.tmou.org/public/')
    str_number = re.findall("<span[^>]+(.*?)</span>", r.text)[0]
    ...

You can also use cookielib (http.cookiejar in Python 3; more info in this question) if you do not want to handle cookies manually. That said, most servers do not check whether the session ID you send is already in their database and will accept any random session ID.
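For illustration, here is roughly what a cookie jar does on your behalf, using only the standard library (cookielib is named http.cookiejar in Python 3). This is a sketch under assumptions: the helper make_session_cookie and the value example-id are hypothetical, and in real use the jar is filled from the server's Set-Cookie response rather than by hand.

```python
import urllib.request
from http.cookiejar import Cookie, CookieJar


def make_session_cookie(name, value, domain):
    """Build a plain (version 0) session cookie for the given domain."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


jar = CookieJar()
# Placeholder value; a real PHPSESSID would come from the server.
jar.set_cookie(make_session_cookie("PHPSESSID", "example-id", "apcis.tmou.org"))

# The jar attaches matching cookies to an outgoing request automatically.
req = urllib.request.Request("https://apcis.tmou.org/public/")
jar.add_cookie_header(req)
cookie_header = req.get_header("Cookie")
print(cookie_header)
```

requests.Session keeps a jar of this kind internally, which is why the hand-written Cookie header above turned out to be unnecessary.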

Anunay
  • This worked like a charm. Thank you for identifying the issue. I will certainly learn more about it to understand it better. – Eduardo Sep 18 '20 at 12:32