
I am new to HTTP and am using Python to send and receive HTTP requests. My problem is that the website I am dealing with needs more than just headers and POST data: when I click a button on the page, JavaScript code runs that tells the server to expect my upcoming request.

So if I simply request the page with the same headers and POST data, the server treats it as a normal GET and does not read any of the data I supplied in the POST.

My code:

import cookielib
import urllib
import urllib2




# Store the cookies and create an opener that will hold them
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

# Add our headers
opener.addheaders = [('User-agent', 'RedditTesting'),
                     ('Cookie', '')
                     ]


urllib2.install_opener(opener)

# The action/ target from the form
authentication_url = 'http://plapal.pla/Search.aspx'

# Input parameters we are going to send
payload = {
  '_EVENTTARGET': 'btnSearch',
  '_VIEWSTATE': 'plapla',
  'ctl04%24ddNavigate': 'plapla',
  'chkDate': 'on',
  '_EVENTARGUMENT': '',
  '_LASTFOCUS': '',
  'txtResvCode': '',
  'txtCustName': '',
  'txtFromDate': '27%2F01%2F2013',
  'txtToDate': '27%2F09%2F2013',
  'ddSearchType': '1',
  'ddChannel': '-1',
  'ddNetGross': 'NET'
  }

# Use urllib to encode the payload
data = urllib.urlencode(payload)

# Build our Request object (supplying 'data' makes it a POST)
req = urllib2.Request(authentication_url, data)

# Make the request and read the response
resp = urllib2.urlopen(req)
contents = resp.read()

print contents

But it doesn't work. When I hover over the Search button on the page, I get:

javascript:WebForm_DoPostBackWithOptions(new%20WebForm_PostBackOptions("btnSearch",%20"",%20true,%20"",%20"",%20false,%20true))

So how can I execute this JavaScript so that the server actually accepts my POST data?

Skyliquid

1 Answer

Using JavaScript with a headless browser

If you need to execute JavaScript, it will be easier if you have a JavaScript engine available.

Instead of Python, I would consider using a headless browser such as PhantomJS. Then you can scrape the page and execute any JavaScript you need to, either running your own JS or code from the page itself.

The examples page for PhantomJS has several examples in the Page Automation section that may be similar to what you need.

It looks like you want to run an existing function in the page: WebForm_DoPostBackWithOptions(). So I would take a look at the injectme.js example which injects a script into the page. That script could then in turn call any function in the page you want to.

Or depending on what you're doing, there may be an even simpler way to do it with PhantomJS. They have a lot of good examples and docs to look through.

Using Python and a JavaScript debugger

Of course, you may not actually need to directly execute the WebForm_DoPostBackWithOptions() function. From the name, that sounds like it's likely to be a fairly simple JavaScript function. Have you looked at its code and traced through it while interacting with the page manually in a web browser? Tracing through the code should make it easy to see what the function really does. (If you do that in the Chrome developer tools and find that the code is unreadable because it's been "minified", use the {} button to pretty-print it.)
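
If tracing confirms that the function only fills in hidden form fields and submits the form (which is how ASP.NET WebForms postbacks usually work, but treat that as an assumption to check against this page), the same fields can be collected in Python. The sketch below pulls every hidden input out of a page and sets the event target. The sample HTML, the `__VIEWSTATE` value, and the helper names are hypothetical; in practice the HTML would come from a prior GET of the search page, since a view state copied by hand may no longer be valid.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2, as used in the question


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> element."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attrs = dict(attrs)
        if (attrs.get('type') or '').lower() == 'hidden' and attrs.get('name'):
            self.fields[attrs['name']] = attrs.get('value') or ''


def postback_fields(html, event_target, event_argument=''):
    """Return the hidden fields of the page with the postback target filled in."""
    parser = HiddenFieldParser()
    parser.feed(html)
    fields = dict(parser.fields)
    # Assumption: the page's JavaScript sets these two fields before submitting.
    fields['__EVENTTARGET'] = event_target
    fields['__EVENTARGUMENT'] = event_argument
    return fields


# Hypothetical page fragment, for illustration only.
SAMPLE_HTML = '''
<form method="post" action="Search.aspx">
  <input type="hidden" name="__EVENTTARGET" value="" />
  <input type="hidden" name="__EVENTARGUMENT" value="" />
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5Ozs+" />
  <input type="text" name="txtCustName" value="" />
</form>
'''

fields = postback_fields(SAMPLE_HTML, 'btnSearch')
```

The visible inputs (dates, search type and so on) would then be added to `fields` before posting it back to the same URL.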

Or to cut to the chase, the Network tab or equivalent in any of the browser debuggers should let you see exactly what POST request is generated by that function. Then you can do the same in your Python code.
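
One thing to watch when copying the request from the Network tab: the form data may be displayed already percent-encoded, and `urlencode` will encode it a second time. The payload in the question has this problem (`'27%2F01%2F2013'`, `'ctl04%24ddNavigate'`). A minimal sketch with raw values follows; it builds the request without sending it. The field names with two leading underscores are the usual WebForms names and are an assumption here, so compare them with what the captured request really contains.

```python
try:
    from urllib.parse import urlencode   # Python 3
    from urllib.request import Request
except ImportError:
    from urllib import urlencode         # Python 2, as used in the question
    from urllib2 import Request

# Raw (decoded) names and values: urlencode does the percent-encoding itself.
payload = [
    ('__EVENTTARGET', 'btnSearch'),      # assumed field name, verify in the Network tab
    ('__EVENTARGUMENT', ''),
    ('ctl04$ddNavigate', 'plapla'),
    ('txtFromDate', '27/01/2013'),
    ('txtToDate', '27/09/2013'),
]
body = urlencode(payload)

# What happens if an already-encoded value is passed in: '%' becomes '%25'.
double_encoded = urlencode([('txtFromDate', '27%2F01%2F2013')])

# Supplying data makes this a POST; bytes are required on Python 3.
req = Request('http://plapal.pla/Search.aspx', body.encode('ascii'))
method = req.get_method()
```

Comparing `body` with the raw request body shown by the browser is a quick way to spot a missing or misnamed field.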

If you're not familiar with the developer tools in the current browsers, you are in for a treat: they are really good these days. I like the one built into Chrome, but Firebug (for Firefox) and the Internet Explorer tools are also excellent.

Michael Geary
  • Thank you for your post; I now know some things I was not aware of. Unfortunately, I need to use Python. I don't mind adding other libraries to Python to do that, but I need a pure .py file that can automate a job on a webpage, as it's part of a project, and I can't use other tools at the same time. I already traced the JS code and I see what it does. As far as I can see, the only way is to execute the JS code so that I can send the POST on the upcoming request. – Skyliquid Sep 29 '13 at 22:27
  • So let's take it step by step. Can you use the developer tools in the browser to find out what the actual POST request is? If so, you should be able to duplicate that request in Python. – Michael Geary Sep 30 '13 at 01:22
  • You can use PhantomJS from the Python Selenium driver. Just install PhantomJS and use webdriver.PhantomJS: http://selenium-python.readthedocs.org/en/latest/api.html – Lucas Wiman Sep 30 '13 at 04:13
  • OK, so I installed the Selenium driver. How do I configure the path to PhantomJS for Selenium on Windows, and what should I do next? Can someone give me an example of executing JS code through PhantomJS using Python Selenium? – Skyliquid Sep 30 '13 at 10:52
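
For the Selenium route raised in the comments, a minimal sketch might look like the following. It takes an already-created driver and runs the same call the Search link makes (copied from the question and URL-decoded). The function name `run_search` is made up for illustration. The commented-out setup assumes a Selenium release that still includes `webdriver.PhantomJS` (newer releases have dropped it), and the Windows path is a placeholder.

```python
# The call behind the Search button, as shown in the question (URL-decoded).
SEARCH_JS = ('WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions('
             '"btnSearch", "", true, "", "", false, true));')


def run_search(driver, url, script=SEARCH_JS):
    """Load the page, run the page's own postback call, return the page HTML."""
    driver.get(url)
    # execute_script runs JavaScript inside the loaded page, so functions
    # defined by the page are available to it.
    driver.execute_script(script)
    # Note: a real script may need to wait for the postback to finish loading
    # before reading page_source.
    return driver.page_source


# Possible setup (not run here; requires Selenium and PhantomJS to be installed):
# from selenium import webdriver
# driver = webdriver.PhantomJS(executable_path=r'C:\path\to\phantomjs.exe')  # placeholder path
# html = run_search(driver, 'http://plapal.pla/Search.aspx')
# driver.quit()
```

Form inputs such as the date fields would need to be filled in before the script runs, for example with `execute_script` calls that set each element's value.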