Getting form "action" from BeautifulSoup result

Question

I'm writing a Python parser for a website to automate a task, but I'm not very familiar with the "re" (regex) module and can't get it to work.

import urllib2
import BeautifulSoup

req = urllib2.Request(tl2)
req.add_unredirected_header('User-Agent', ua)
try:
    # urlopen() is the call that can raise URLError, so it belongs inside the try
    response = urllib2.urlopen(req)
    html = response.read()
except urllib2.URLError, e:
    print "Error while reading data. Are you connected to the interwebz?!", e
    raise  # without a page there is nothing to parse

soup = BeautifulSoup.BeautifulSoup(html)
form = soup.find('form', id='form_product_page')
pret = form.prettify()

print pret

Result:

<form id="form_product_page" name="form_1362737440" action="/download/791055/164084/" method="get">
<input id="nojssubmit" type="submit" value="Download" />
</form>

That code works and is all I need to start with. Now I'm wondering how to extract the "action" attribute from the "form" tag. That is the only thing I need from the BeautifulSoup result.

I've tried form = soup.find('form', id='form_product_page').parent.get('action'), but the result was None. What I want to extract is, for example, "/download/791055/164084/". This value is different for every URL.

Variables (example):
tl2 = 'http://example.com'
ua = 'Mozilla Firefox / 14.04'

score 12 · Accepted Answer · answered May 04 '14 at 23:53

You can do it in one step:

action = soup.find('form', id='form_product_page').get('action')
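As a minimal, self-contained sketch of the one-liner above, the snippet below parses the form markup from the question directly instead of fetching a page. It assumes Python 3 and the bs4 package (BeautifulSoup 4) rather than the Python 2 / BeautifulSoup 3 setup in the question. The sample_html variable and the wrapping div are hypothetical and exist only for illustration. It also shows why the .parent attempt returned None: .parent moves up to the element that encloses the form, and that element has no "action" attribute.

```python
from bs4 import BeautifulSoup

# Form markup from the question, wrapped in a hypothetical <div>
# so that the form has an enclosing parent tag.
sample_html = """
<div class="wrapper">
<form id="form_product_page" name="form_1362737440" action="/download/791055/164084/" method="get">
<input id="nojssubmit" type="submit" value="Download" />
</form>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
form = soup.find("form", id="form_product_page")

# find() returns None when nothing matches, so guard before calling .get().
action = form.get("action") if form is not None else None
print(action)  # /download/791055/164084/

# .parent is the enclosing <div>, which has no "action" attribute,
# so .get() falls back to its default of None.
parent_action = form.parent.get("action")
print(parent_action)  # None
```

Tag.get() behaves like dict.get(): it returns None for a missing attribute instead of raising, which is why the .parent version failed silently.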

Casimir et Hippolyte

Whoops, it seems I'll have to read the BS documentation a bit more. That is exactly what I need. Thanks! Answer accepted. – sensation May 04 '14 at 23:55
In my case, if the `action` contains arguments, this doesn't work, i.e. with `action="https://site.tld/file?arg1=test"`, `.get('action')` will only retrieve `https://site.tld/file` – Pedro Lobito Apr 25 '20 at 17:39
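
A quick check, sketched under stated assumptions (Python 3, the bs4 package, and the standard "html.parser" backend), suggests that .get('action') returns the attribute value as written, query string included. The form id "dl" is made up for illustration. If the arguments seem to disappear, it may be worth checking what happens to the value after it is extracted.

```python
from urllib.parse import urlsplit, parse_qs
from bs4 import BeautifulSoup

# Hypothetical form whose action carries a query string.
html = '<form id="dl" action="https://site.tld/file?arg1=test" method="get"></form>'
soup = BeautifulSoup(html, "html.parser")

action = soup.find("form", id="dl").get("action")
print(action)  # https://site.tld/file?arg1=test

# The path and the arguments can be separated afterwards if needed.
parts = urlsplit(action)
print(parts.path)             # /file
print(parse_qs(parts.query))  # {'arg1': ['test']}
```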