https://stackoverflow.com/a/64983/468251 - Hello, I have a question about the code in this answer: how can I make it work with a remote website URL, and how can I get value = fooId['value'] from all matching inputs, not only the first?
Viewed 2,021 times
1
-
You can post your request for information on that answer. Don't post a new question here. Add a comment to the existing answer. – S.Lott Jan 12 '12 at 15:26
2 Answers
2
To parse a page on the internet, you first need to download its HTML. Libraries such as requests, a widely used HTTP library for Python, make this easy. Say you want to parse https://stackoverflow.com/:
import requests
response = requests.get("https://stackoverflow.com/")
page_html = response.text
page_html now holds the page's HTML as a Python string; you can treat it like the contents of a local HTML file and perform any kind of parsing on it.
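As for getting every occurrence of a pattern, you can call soup.findAll('input', attrs={'name': 'fooId', 'type': 'hidden'}) instead of soup.find(). Pass name through attrs: name is also the name of findAll's first parameter (the tag name), so writing name='fooId' as a keyword alongside 'input' raises a TypeError. soup.findAll returns a list of all matching tags, so you can loop over it and read tag['value'] from each one.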
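To make the loop concrete, here is a minimal sketch, assuming Python 3 and the bs4 package (BeautifulSoup 4, where findAll is also available as find_all). The inline HTML and the helper name hidden_values are made up for illustration; for a remote page you would pass the downloaded string (for example response.text) instead of SAMPLE_HTML.

```python
from bs4 import BeautifulSoup

# Stand-in for downloaded HTML; with a real site this would be response.text.
SAMPLE_HTML = """
<form>
  <input type="hidden" name="fooId" value="12-3456789-1111111111">
  <input type="hidden" name="fooId" value="98-7654321-2222222222">
  <input type="text" name="other" value="ignored">
</form>
"""

def hidden_values(page_html, field_name):
    """Return the value attribute of every hidden <input> named field_name."""
    soup = BeautifulSoup(page_html, "html.parser")
    # "name" goes through attrs because find_all's first parameter is also called name.
    inputs = soup.find_all("input", attrs={"name": field_name, "type": "hidden"})
    # Skip inputs that have no value attribute instead of raising KeyError.
    return [tag["value"] for tag in inputs if tag.has_attr("value")]

values = hidden_values(SAMPLE_HTML, "fooId")
print(values)
```

Returning a plain list keeps the parsing step separate from the download step, so the same helper works on a local file or a fetched page.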

Shawn
- 571
- 7
- 8
1
The example uses a local file. To work with a remote site, you need to download the page from the server and then parse the HTML.
You can look at requests or urllib2 for this.
I hope it helps.
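As a rough sketch of the download step with only the standard library: urllib2 is Python 2 only, and its Python 3 counterpart is urllib.request. The helper name fetch_html and the UTF-8 fallback are assumptions for illustration, not part of the original answer.

```python
from urllib.request import urlopen

def fetch_html(url, default_charset="utf-8"):
    """Download url and return the response body decoded to a str."""
    with urlopen(url) as response:
        raw = response.read()
        # Use the charset declared in the Content-Type header, if any.
        charset = response.headers.get_content_charset() or default_charset
    # Replace undecodable bytes rather than failing on a mislabelled page.
    return raw.decode(charset, errors="replace")
```

Calling page_html = fetch_html("https://stackoverflow.com/") would give a string that can be handed to BeautifulSoup exactly like the contents of a local file.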

luc
- 41,928
- 25
- 127
- 172
-
import urllib2; urllib2.urlopen('http://...').read() works, but how do I get the elements from soup.findAll (the example only uses soup.find)? :) – Rambo Jan 12 '12 at 15:37
-
from doc: The find method is almost exactly like findAll, except that instead of finding all the matching objects, it only finds the first one. – luc Jan 12 '12 at 15:52