
I have been working on this for a couple of days now and cannot figure out how to make it work. I am fairly new to ASPX websites and to fetching information from them.

I am trying to log in/authenticate on a website that uses ASPX pages. I followed this thread (the last answer), which really helped me get started.

Following those directions, I wrote:

from robobrowser import RoboBrowser

url = "http://samplewebsite/Main/Index.aspx" # Logon page
username = "user"
password = "password"

browser = RoboBrowser(history=True)
# This retrieves __VIEWSTATE and friends
browser.open(url)
signin = browser.get_form(id='form1')
print(signin)

This is the outcome of that print statement:

<RoboForm __VIEWSTATE=/wEPDwULLTE5ODM2NTU1MzJkGAEFHl9fQ29udHJvbHNSZXF1aXJlUG9zdEJhY2tLZXlfXxYBBQlidG5TdWJtaXRriD1xvrfrHuJ/0xbQM08yEjyoUg==, __VIEWSTATEGENERATOR=E78488FE, adminid=, btnSubmit=, pswd=>

So I am clearly retrieving the form correctly. It has three input fields:

adminid
btnSubmit
pswd

which I can use in the following manner:

signin["adminid"].value = username
signin["pswd"].value = password
signin["btnSubmit"].value = "btnSubmit.x=29&btnSubmit.y=22"

My only problem is the last field, btnSubmit: I do not know how to give it a value, since it is an input of the following type:

<input type="image" name="btnSubmit" id="btnSubmit" tabindex="3" src="../image/login_btn.gif" style="height:41px;width:57px;border-width:0px;" />

When I submit the form on the website, the Chrome developer tools show the following form data:

__VIEWSTATE:/wEPDwULLTE5ODM2NTU1MzJkGAEFHl9fQ29udHJvbHNSZXF1aXJlUG9zdEJhY2tLZXlfXxYBBQlidG5TdWJtaXRriD1xvrfrHuJ/0xbQM08yEjyoUg==
__VIEWSTATEGENERATOR:E78488FE
adminid:user
btnSubmit.x:23
btnSubmit.y:15
pswd:password

The x and y values are basically the position where I clicked on the button. I really do not know how to make this request through Python. I tried this, to no avail.

Alexander

1 Answer


When you click on an input element of type image, two form values are set: the button name plus .x for the first, and the button name plus .y for the second.
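To make the field naming concrete, here is a minimal standard-library sketch of the form body a browser would send for the login form in the question. The helper name `image_click_fields` is hypothetical (it is not part of robobrowser), the coordinates are the ones from the Chrome capture, and the `__VIEWSTATE` value is a placeholder for whatever the login page actually serves.

```python
from urllib.parse import parse_qs, urlencode


def image_click_fields(name, x=0, y=0):
    """Return the two fields a browser adds when an image input is clicked."""
    return {name + ".x": str(x), name + ".y": str(y)}


# Field names copied from the question. The __VIEWSTATE value is a
# placeholder; the real one has to be scraped from the login page.
payload = {
    "__VIEWSTATE": "<value scraped from the page>",
    "__VIEWSTATEGENERATOR": "E78488FE",
    "adminid": "user",
    "pswd": "password",
}
payload.update(image_click_fields("btnSubmit", x=23, y=15))

# URL-encode the fields the way a form POST body is encoded.
body = urlencode(payload)
print(body)

# Decoding the body again shows that no plain "btnSubmit" key is sent,
# only the two coordinate keys.
decoded = parse_qs(body)
print(sorted(key for key in decoded if key.startswith("btnSubmit")))
```

This is why assigning a single string such as "btnSubmit.x=29&btnSubmit.y=22" to the btnSubmit field does not reproduce what the browser sends: the coordinates are two separate fields, not one value.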

However, pressing Enter in a regular text input field will also submit a form, so you don't have to click on a submit button. I'd just leave the value empty altogether.

There is not much flexibility in the way robobrowser handles form submits; to avoid using the submit button you'd have to delete it from the form outright:

del signin.fields['btnSubmit']

before submitting.

If you must submit using the image button, then you'll have to teach robobrowser how to handle that type; it currently has no handling for it. The following adds that:

from functools import wraps
from robobrowser.forms import form
from robobrowser.forms.fields import Submit, Input

class ImageSubmit(Submit):
    def serialize(self):
        return {self.name + '.x': '0', self.name + '.y': '0'}

def include_image_submit(parse_field):
    @wraps(parse_field)
    def wrapper(tag, tags):
        field = parse_field(tag, tags)
        if type(field) is Input:  # not a subclass, exactly this class
            if field._parsed.get('type') == 'image':
                field = ImageSubmit(field._parsed)
        return field
    return wrapper

form._parse_field = include_image_submit(form._parse_field)

at which point you can use browser.submit_form(signin, signin['btnSubmit']) to submit the form and the correct fields will be included.
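The patch works by wrapping a module-level function and rebinding its name, so it only affects forms parsed after the rebinding. Here is a minimal standalone sketch of that pattern. The `Input` and `ImageSubmit` classes and the `parse_field` function below are hypothetical stand-ins that take a plain dict of attributes; they are not robobrowser's real classes or signatures.

```python
from functools import wraps


class Input:
    """Stand-in for a generic parsed form field (not robobrowser's class)."""

    def __init__(self, attrs):
        self.attrs = attrs
        self.name = attrs.get("name")

    def serialize(self):
        return {self.name: self.attrs.get("value", "")}


class ImageSubmit(Input):
    """Stand-in for an image button: serializes to name.x and name.y."""

    def serialize(self):
        return {self.name + ".x": "0", self.name + ".y": "0"}


def parse_field(attrs):
    """Stand-in parser that knows nothing about image buttons."""
    return Input(attrs)


def include_image_submit(original):
    @wraps(original)
    def wrapper(attrs):
        field = original(attrs)
        # Only upgrade plain Input results that describe an image button.
        if type(field) is Input and attrs.get("type") == "image":
            field = ImageSubmit(attrs)
        return field
    return wrapper


button = {"type": "image", "name": "btnSubmit"}

# Parsed before the patch: the generic class is used.
before = parse_field(button)

# Rebind the module-level name, as the answer does with form._parse_field.
parse_field = include_image_submit(parse_field)

# Parsed after the patch: the new class is used.
after = parse_field(button)

print(type(before).__name__, before.serialize())
print(type(after).__name__, after.serialize())
```

With robobrowser itself, the same ordering applies: the `form._parse_field = include_image_submit(form._parse_field)` line has to run before the form is fetched and parsed, for example before the browser object is created.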

I've submitted a pull request to the robobrowser project to add image submit support.

Martijn Pieters
  • @Alexander: I don't know, does your browser end up on a different URL after submission? You didn't really give us more to work with to help you test. – Martijn Pieters Mar 09 '18 at 19:15
  • Okay, so I have done everything above. In the web browser, I start at this page: ```http://pagel.com/Login/VerifyAdmLogin.aspx?url=http%3a%2f%2fpage.com%2fMain%2fIndex.aspx``` and once I log in I end up at ```http://page.com/Main/Index.aspx```. But I do not know how to check this within Python. – Alexander Mar 09 '18 at 19:23
  • @Alexander: so what is `browser.url` after you submit the form? – Martijn Pieters Mar 09 '18 at 19:24
  • The ```browser.url``` is the initial one: ```http://page.com/Login/VerifyAdmLogin.aspx?url=http%3a%2f%2fpage.com%2fMain%2fIndex.aspx``` which tells me that something did not go right and it did not log in. Is that right? – Alexander Mar 09 '18 at 20:58
  • @Alexander: I don't think it logged in in that case. Was that with or without the image submit? I suspect that ASP forms may be using some Javascript to add something to the form. Perhaps you need to switch to heavier artillery here and use [`requests-html`](https://github.com/kennethreitz/requests-html) to run the HTML of a page through a real browser. Note that that project doesn't have such nice form handling, however. – Martijn Pieters Mar 09 '18 at 21:27
  • Indeed, after using the [code snippet](https://imgur.com/px7dK47) as you gave it to me, it still redirects me to the login page. I checked the requests in the Chrome developer tools and do not see any JS input. Both JS requests that show up after pressing login are GET requests, and neither is related to any form input. I will now read about ```requests-html``` to see if I can make this happen. – Alexander Mar 09 '18 at 22:21
  • @Alexander you didn’t apply my supplied solution correctly. You didn’t use the `include_image_submit` decorator until after the form had been parsed, and it is the crucial link that makes the new input type class I defined actually get used. Use it before creating the browser object. – Martijn Pieters Mar 09 '18 at 23:15
  • It worked!! I have no idea what this code does; I will have to go read about it to fully understand it. I appreciate all your help. – Alexander Mar 09 '18 at 23:26