import requests
from bs4 import BeautifulSoup

def get_Auth():
    # User and Pass are Tkinter Text widgets created elsewhere.
    # Text.get() appends a trailing newline, so strip it.
    USERNAME = User.get("1.0", END).strip()
    PASSWORD = Pass.get("1.0", END).strip()

    url = 'https://ps.lphs.net/public/home.html'

    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.92 Safari/537.36 Vivaldi/1.6.689.34'}

    # A session keeps cookies between the GET and the POST.
    session = requests.Session()
    g = session.get(url, headers=headers)

    soup = BeautifulSoup(g.content, 'html.parser')

    # Find the values: the first two <input> elements hold the tokens.
    inputs = soup.find_all('input')
    PSTOKEN = inputs[0].get('value') if len(inputs) > 0 else None
    CONTEXTDATA = inputs[1].get('value') if len(inputs) > 1 else None
    print(PSTOKEN)
    print(CONTEXTDATA)

    payload = {
              'pstoken': PSTOKEN,
              'contextData': CONTEXTDATA,
              'dbpw': '',
              'translator_username': '',
              'translator_password': '',
              'translator_ldappassword': '',
              'returnUrl': 'https://ps.lphs.net/guardian/home.html',
              'serviceName': 'PS Parent Portal',
              'serviceTicket': '',
              'pcasServerUrl': r'\ /',
              'credentialType': 'User Id and Password Credential',
              'account': USERNAME,
              'pw': PASSWORD,
              'translatorpw': ''
              }

    # Submit the form (adjust the URL if the form's action points elsewhere).
    r = session.post(url, data=payload, headers=headers)
    print(r)

I am trying to log on to PowerSchool and scrape my grades from a page that requires a login. I have watched video after video and cannot figure out why it won't work. I have a Tkinter window that asks for my username and password and then uses them to log on to the website, but when I run it, all I get back is the source code of the login page. Here are screenshots of the Network tab from the browser's developer tools.

Request Headers / Form Data

I'm not sure what's wrong here, and I've been looking into this for a while now. Thanks in advance!

Sam
  • Web pages are more complicated than you think. First: every `post()` may need different values in `pstoken`, `contextData`, `account` and `pw`. You have to `get()` the page with the login form and find the correct values in the HTML. Second: the server may check other things such as cookies (so it is better to use `requests.Session()`) or headers (mostly `User-Agent`). – furas Jan 02 '17 at 04:18
  • And it can use JavaScript to hash or calculate some of the form data, e.g. it can send a hashed password in the `pw` field. – furas Jan 02 '17 at 04:28

1 Answer


I don't have an account to test, but multiple things are wrong in your current approach:

  • the password (pw field) is hashed via the following function (defined here):

    function doPCASLogin(form)
    {
       var originalpw = form.pw.value;
       var b64pw = b64_md5(originalpw);
       var hmac_md5pw = hex_hmac_md5(pskey, b64pw)
       form.pw.value = hmac_md5pw;
       form.dbpw.value = hex_hmac_md5(pskey, originalpw.toLowerCase())
       if (form.ldappassword!=null) {
         // LDAP is enabled, so send the clear-text password
         // Customers should have SSL enabled if they are using LDAP
         form.ldappassword.value = originalpw; // Send the unmangled password
       }
    
       // Translator Login
       var translatorpw = form.translatorpw.value;
       var i = translatorpw.indexOf(";");
        if (i < 0) {
            form.translator_username.value = translatorpw;
            form.translator_password.value = "";
        }
        else {
            form.translator_username.value = translatorpw.substring(0,i);
            translatorpw = translatorpw.substring(i+1); // Get the password
            translatorpw2 = translatorpw;
            translatorpw = b64_md5(translatorpw);                   // Added in move to pcas
            form.translator_password.value = hex_hmac_md5(pskey, translatorpw);
            if (form.translator_ldappassword!=null) {
                // LDAP is enabled, so send the clear-text password
                // Customers should have SSL enabled if they are using LDAP
                form.translator_ldappassword.value = translatorpw2; // Send the pw for LDAP
            }
        }
    
        return true;
    }
    
  • you cannot send the same token values with every request. You have to get the token values from the actual form, which means that you first need to "GET" home.html, extract the token values, and then use them in your "POST" request.
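The hashing done by `doPCASLogin` for the `pw` and `dbpw` fields could be reproduced in Python roughly as follows. This is only a sketch under assumptions: it assumes `b64_md5` returns the Base64-encoded MD5 digest without `=` padding and `hex_hmac_md5` returns a hex HMAC-MD5, as the commonly used JavaScript MD5 helpers do, and it assumes ASCII passwords. `pskey` is a variable defined by the page's JavaScript; how to obtain it is not shown here, and `hash_login_fields` and the sample values are hypothetical.

```python
import base64
import hashlib
import hmac


def b64_md5(text):
    """Base64 of the MD5 digest, without '=' padding (assumed JS behaviour)."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def hex_hmac_md5(key, data):
    """Hex-encoded HMAC-MD5 of data under key."""
    return hmac.new(key.encode("utf-8"), data.encode("utf-8"), hashlib.md5).hexdigest()


def hash_login_fields(pskey, password):
    """Mirror the pw and dbpw assignments in doPCASLogin (hypothetical helper)."""
    return {
        "pw": hex_hmac_md5(pskey, b64_md5(password)),
        "dbpw": hex_hmac_md5(pskey, password.lower()),
    }


# "example-pskey" and "Secret" are placeholder values, not real credentials.
fields = hash_login_fields("example-pskey", "Secret")
print(fields)
```

The resulting `pw` and `dbpw` values would replace the plain-text ones in the payload. Since there is no account to test against, whether the server accepts them is unverified.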

For the second problem, you might want to try libraries like mechanize or MechanicalSoup, which "auto-populate" the rest of the form fields for you. They cannot execute JavaScript, though, which is quite important in this particular case.
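To illustrate what "auto-populating" the form fields means, here is a minimal sketch that collects every named `<input>` from a page into a payload dictionary instead of copying values by hand. It uses the standard library's `html.parser` so that it runs without extra packages; BeautifulSoup would work just as well. The sample markup, `FormFieldCollector`, and `collect_form_fields` are illustrative assumptions, not the real login page or any library's API.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs from every <input> that has a name."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Inputs without a value attribute are stored as empty strings.
            self.fields[name] = attributes.get("value") or ""


def collect_form_fields(html):
    """Return a dict of all named input fields found in the HTML."""
    parser = FormFieldCollector()
    parser.feed(html)
    return parser.fields


# Hypothetical markup standing in for the real login page.
SAMPLE_HTML = """
<form action="/guardian/home.html" method="post">
  <input type="hidden" name="pstoken" value="abc123">
  <input type="hidden" name="contextData" value="def456">
  <input type="text" name="account">
  <input type="password" name="pw">
</form>
"""

payload = collect_form_fields(SAMPLE_HTML)
payload["account"] = "my_username"  # placeholder value
print(payload)
```

In practice the HTML would come from a `GET` made through a `requests.Session()`, and the same session would then `POST` the completed payload so that cookies carry over.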

If you want to avoid dealing with all of these problems, look into browser automation and the selenium package.

alecxe
  • Thanks for the reply. I took some time to fix some of the code and was able to extract the pstoken and the contextData. I noticed that these were the only two values that changed with each post, so I now read them from the page and use them in the payload. I still have to figure out the first problem, the hashed pw, but I wanted to know whether this is the correct way of doing it and a step in the right direction. The updated code is above. Thanks. – Sam Jan 04 '17 at 21:55