In addition to the options alecxe mentioned, another alternative is to use a browser tool such as Firefox's Web Console to
inspect the POST request that is sent when you click the submit button. Sometimes you
can find the POST data and simply spoof it. For example, using the
URL you posted, if you
- Use Firefox to go to http://apply.ovoenergycareers.co.uk/vacancies/#results
- Click Tools > Web Developer > Web Console
- Click Net > Log Request and Response Bodies
- Fill in the form, click Search
- Left-click the (first) POST in the Web Console
- Right-click the (first) POST, select COPY POST Data
- Paste the POST data in a text editor
you will obtain something like
all
field_36[]=73
field_37[]=76
field_32[]=82
submit=Search
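If you would rather not retype the copied data by hand, it can be converted into a parameter list programmatically. This is only a minimal sketch: it assumes the data was copied as one key=value pair per line, as shown above, and the helper name parse_post_data is made up for illustration.

```python
import urllib.parse


def parse_post_data(text):
    """Turn copied 'key=value' lines into a list of (key, value) pairs.

    Hypothetical helper: a line without '=' (such as 'all') is treated
    as a key with an empty value.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        pairs.append((key, value))
    return pairs


copied = """all
field_36[]=73
field_37[]=76
field_32[]=82
submit=Search"""

params = parse_post_data(copied)
# URL-encode the pairs into a form body (brackets are percent-encoded)
encoded = urllib.parse.urlencode(params)
```

If your browser copies a single URL-encoded string instead (pairs joined with &), urllib.parse.parse_qsl with keep_blank_values=True is probably the better tool.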
(Note that the Web Console menus vary a bit depending on your version of Firefox, so YMMV.) Then you can spoof the POST using code such as:
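# Python 2 only: urllib2 does not exist in Python 3
import urllib2
import urllib
import lxml.html as LH

url = "http://apply.ovoenergycareers.co.uk/vacancies/#results"
# Form fields taken from the POST data copied out of the Web Console
params = urllib.urlencode([('field_36[]', 73), ('field_37[]', 76), ('field_32[]', 82)])
# Passing a data argument makes urlopen send a POST instead of a GET
response = urllib2.urlopen(url, params)
content = response.read()
# Parse the returned HTML and print the text of every <dl> element
root = LH.fromstring(content)
print('\n'.join([tag.text_content() for tag in root.xpath('//dl')]))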
which yields
Regulatory Data Analyst
Contract Type
Permanent
Contract Hours
Full-time
Location
Bristol
Department
Business Intelligence
Full description
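The code above is Python 2 (urllib2 was split into urllib.request and urllib.parse in Python 3). A rough Python 3 sketch of the request-building step might look like the following. The function name build_search_request is hypothetical, and the network call is left commented out because it assumes the site still accepts this form.

```python
import urllib.parse
import urllib.request

URL = "http://apply.ovoenergycareers.co.uk/vacancies/#results"


def build_search_request(url, fields):
    """Build (but do not send) a POST request for the given form fields."""
    # urlencode returns str; urllib.request expects the POST body as bytes
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying data makes urllib issue a POST instead of a GET
    return urllib.request.Request(url, data=body)


request = build_search_request(
    URL, [("field_36[]", 73), ("field_37[]", 76), ("field_32[]", 82)]
)

# To actually send it (requires network access):
# with urllib.request.urlopen(request) as response:
#     content = response.read()
```

The returned bytes can then be passed to lxml.html.fromstring exactly as in the Python 2 version.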
If you inspect the HTML and search for field_36[], you'll find
<div class="multiwrapper">
<p class="sidenote multi">(Hold the ctrl (pc) or cmd (Mac) keys for multi-selects) </p>
<select class="select-long" multiple size="5" name="field_36[]" id="field_36"><option value="0">- select all -</option>
<option selected value="73" title="Permanent">Permanent</option>
<option value="74" title="Temporary">Temporary</option>
<option value="75" title="Fixed-term">Fixed-term</option>
<option value="81" title="Intern">Intern</option></select>
</div>
from which it is easy to surmise that field_36[] controls the Contract Type, and that value 73 corresponds to "Permanent", 74 corresponds to "Temporary", etc. Similarly, you can figure out the options for field_37[], field_32[] and all (which can be any search term string). If you have a good understanding of HTML, you may not even need the browser tool to construct the POST.
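Reading the option values out of the HTML can also be automated. The sketch below uses the standard library's html.parser (not lxml, purely to avoid a dependency) on the fragment shown above. The class and function names are invented for this example, and it assumes each option sits inside a select that has a name attribute.

```python
from html.parser import HTMLParser


class SelectOptionParser(HTMLParser):
    """Collect {select name: {option value: label}} from form HTML."""

    def __init__(self):
        super().__init__()
        self.options = {}
        self._select = None  # name of the <select> currently open
        self._value = None   # value of the <option> currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            self._select = attrs.get("name")
            if self._select is not None:
                self.options.setdefault(self._select, {})
        elif tag == "option" and self._select is not None:
            self._value = attrs.get("value")
            self.options[self._select][self._value] = ""

    def handle_data(self, data):
        # Only text inside an open <option> is part of a label
        if self._select is not None and self._value is not None:
            self.options[self._select][self._value] += data

    def handle_endtag(self, tag):
        if tag == "option":
            self._value = None
        elif tag == "select":
            self._select = None
            self._value = None


def select_options(html):
    """Return the option value -> label mapping for every named select."""
    parser = SelectOptionParser()
    parser.feed(html)
    parser.close()
    return {
        name: {value: label.strip() for value, label in opts.items()}
        for name, opts in parser.options.items()
    }


SAMPLE_HTML = """<div class="multiwrapper">
<p class="sidenote multi">(Hold the ctrl (pc) or cmd (Mac) keys for multi-selects) </p>
<select class="select-long" multiple size="5" name="field_36[]" id="field_36"><option value="0">- select all -</option>
<option selected value="73" title="Permanent">Permanent</option>
<option value="74" title="Temporary">Temporary</option>
<option value="75" title="Fixed-term">Fixed-term</option>
<option value="81" title="Intern">Intern</option></select>
</div>"""

contract_types = select_options(SAMPLE_HTML)["field_36[]"]
```

Here contract_types maps "73" to "Permanent", "74" to "Temporary", and so on, which is the same mapping you would otherwise work out by eye.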