Logging in with Python to a subpage (from another post)

Question

I came across this question: How to use Python to login to a webpage and retrieve cookies for later usage?

So I'm trying to log into a page using the requests approach from the second answer there.

When I print the HTML using

    print(request.text)

it prints the HTML of the login page, not the subpage I actually requested.

Is there a problem with the code from that answer (which I doubt), or is the mistake on my end?

The code is similar to the one on that question, with different pages and usernames.

Thanks!

    from requests import session

    USERNAME = 'myuser'
    PASSWORD = 'mypwd'

    payload = {
        'action': 'login',
        'username': USERNAME,
        'password': PASSWORD
    }

    with session() as c:
        c.post('https://www.bricklink.com/login.asp', data=payload)  # Login page
        request = c.get('http://www.bricklink.com/orderExcelFinal.asp?')  # Page I want to access
        print(request.headers)
        print(request.text)

Output

HTML code for the Login page, but not the page I want to access

It would be helpful if you showed us your code (you can make up pages and usernames). — Gerrat, Dec 17 '13 at 17:51
I didn't post it as it is exactly the same as the one on the second answer on the other question. I'm just testing it out. However, I will re-post it. thanks! — Brick Top, Dec 17 '13 at 17:52
It's possible your 2nd request may be an invalid page, and it's just sending you back to the login page. The 2nd link looks slightly fishy...usually when a url ends in a question mark, it has arguments after it. If you were to log in by hand, and then request this page (exactly), does it come up? — Gerrat, Dec 17 '13 at 18:01
Indeed, but that's another question I need to ask after this one. The page has its arguments, but you select them through checkboxes on a form. The URL shows up like that (with only ? after the asp) no matter which arguments are chosen. I want to find a way to pass the parameters to the asp page. I thought about putting them after the ?, but it doesn't work that way here. Anyhow, cheers! — Brick Top, Dec 17 '13 at 19:10

score 2 · Accepted Answer · answered Dec 17 '13 at 18:43

Your code isn't sending the correct data on the login request.

Each web page is different, and sends different data in order to log in. Yours should be structured like this:

    from requests import session

    USERNAME = 'myuser'
    PASSWORD = 'mypwd'

    # Sent in the URL query string of the login request
    query = {
        'logInTo': '',
        'logFolder': 'p',
        'logSub': 'w',
    }

    # Sent as the form-encoded body of the login request
    payload = {
        'a': 'a',
        'logFrmFlag': 'Y',
        'frmUsername': USERNAME,
        'frmPassword': PASSWORD,
    }

    with session() as c:
        c.post('https://www.bricklink.com/login.asp', params=query, data=payload)  # Login page
        request = c.get('http://www.bricklink.com/orderExcelFinal.asp')  # Page I want to access
        print(request.headers)
        print(request.text)
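
To tell whether the login actually worked, it can help to check where the second request ended up instead of reading the HTML by eye. The sketch below is written for Python 3 and assumes the site redirects unauthenticated requests to login.asp; that is an assumption, since a site could also serve the login form without redirecting. `bounced_to_login` is a hypothetical helper name, not part of requests.

```python
from urllib.parse import urlsplit


def bounced_to_login(final_url, login_path="/login.asp"):
    """Return True if final_url points at the login page.

    Hypothetical helper: it assumes the site answers unauthenticated
    requests by redirecting to login_path.
    """
    path = urlsplit(final_url).path
    return path.lower() == login_path.lower()


# With requests, response.url holds the final URL after any redirects:
#     request = c.get('http://www.bricklink.com/orderExcelFinal.asp')
#     if bounced_to_login(request.url):
#         print('Still on the login page; check the login payload.')
```

If the helper returns True for `request.url`, the session was most likely never authenticated, which points back at the data sent in the login request.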

In the future, when you need to work out what data has to be sent to submit a form, use the developer tools in Chrome or Firefox. Record your login attempt in the Network tab, then structure your query parameters and form data to match what the browser sent. Getting started with the developer tools is a bit beyond the scope of this answer, but there are plenty of good resources on the web that explain how to find this information.
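
As a complement to the developer tools, the field names can often be read straight from the login form's HTML. Here is a minimal Python 3 sketch using only the standard library. The markup in `SAMPLE_HTML` is made up for illustration and is not copied from the real site, and `FormFieldParser` is a hypothetical class name.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the name/value pairs of every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Inputs without a value attribute are stored as an empty string.
            self.fields[name] = attributes.get("value") or ""


# Made-up markup for illustration; NOT the real login page.
SAMPLE_HTML = """
<form action="login.asp" method="post">
  <input type="hidden" name="a" value="a">
  <input type="hidden" name="logFrmFlag" value="Y">
  <input type="text" name="frmUsername">
  <input type="password" name="frmPassword">
</form>
"""

parser = FormFieldParser()
parser.feed(SAMPLE_HTML)

# Start from the fields the form declares, then fill in the credentials.
payload = dict(parser.fields)
payload["frmUsername"] = "myuser"
payload["frmPassword"] = "mypwd"
print(sorted(payload))
```

Hidden inputs are the ones most easily missed when writing a payload by hand. Fields added by JavaScript at submit time will not appear in the static HTML, so the recorded request in the developer tools remains the more reliable reference.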

Excellent answer! Special thanks for the last tip about Chrome's dev tools, I didn't know where to find things like this. Cheers! — Brick Top, Dec 17 '13 at 19:11
