1

https://stackoverflow.com/a/64983/468251 - Hello, I have a question about this code: how can I make it work with a remote website URL, and how can I get value = fooId['value'] from all matching inputs, not only the first?

Community
Rambo
  • You can post your request for information on that answer. Don't post a new question here. Add a comment to the existing answer. – S.Lott Jan 12 '12 at 15:26

2 Answers

2

To parse a page from the internet, you first need to download its HTML. There are good libraries for this, such as requests, which is often recommended as the best choice for Python. Say you want to parse https://stackoverflow.com/:

import requests
response = requests.get("https://stackoverflow.com/")
page_html = response.text

page_html now holds the page's HTML as a Python string, so you can treat it like the contents of a local HTML file and perform any kind of parsing on it.

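To get every occurrence of a pattern, call soup.findAll('input', attrs={'name': 'fooId', 'type': 'hidden'}) instead of soup.find(). The attribute filters go in attrs because name is already the keyword for the tag-name argument, so passing name='fooId' alongside 'input' raises a TypeError. soup.findAll returns a list of all matches (in BeautifulSoup 4 the method is also spelled find_all).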
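A minimal sketch of collecting the value of every matching input, assuming BeautifulSoup 4 (the bs4 package) is installed. The markup and its values are made up for illustration and stand in for a downloaded page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page_html.
page_html = """
<form>
  <input type="hidden" name="fooId" value="first-id">
  <input type="hidden" name="fooId" value="second-id">
  <input type="text" name="other" value="ignored">
</form>
"""

soup = BeautifulSoup(page_html, "html.parser")

# find_all returns every match; attrs filters on the HTML attributes.
inputs = soup.find_all("input", attrs={"name": "fooId", "type": "hidden"})

# Each result is a Tag, so tag["value"] works just like fooId["value"] did with find().
values = [tag["value"] for tag in inputs]
print(values)
```

Looping over the result, or using a list comprehension as here, is the only change needed compared with the single-element soup.find() version.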

Shawn
1

The example uses a local file. To use a remote site, you need to download the page from the server and then parse the HTML.

You can look at requests or urllib2 for this.
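urllib2 exists only in Python 2; in Python 3 the same functionality lives in urllib.request. A minimal sketch of the download step using only the standard library, assuming Python 3. fetch_html is a hypothetical helper name, not part of any library:

```python
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download a page and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset the server declares; fall back to UTF-8 if none is given.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

The returned string can then be handed to the parser in place of the local file's contents.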

I hope it helps

luc
  • import urllib2; urllib2.urlopen('http://...').read() works, but how do I take elements from soup.findAll (the example only uses soup.find)? :) – Rambo Jan 12 '12 at 15:37
  • From the docs: "The find method is almost exactly like findAll, except that instead of finding all the matching objects, it only finds the first one." – luc Jan 12 '12 at 15:52