0

I have the URLs of some web pages that contain hyperlinks with the text "Download".

How can I get these hyperlinks from the pages using Python or IronPython?

And can I download the files behind these hyperlinks with Python or IronPython? How would I do that?

Are there any C# tools?

I am not a native English speaker, so sorry for my English.

Begtostudy

2 Answers

2

You should be able to use the BeautifulSoup library with both CPython (standard Python) and IronPython. Check out the findAll() method (spelled find_all() in Beautiful Soup 4), which returns a list of all matching tags. Passing 'a' should pull out a list of all the links:

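soup = BeautifulSoup(html)  # html: the page source as a string
links = [a['href'] for a in soup.findAll('a', href=True)]  # href of every link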
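
The question also asks about downloading the files once the links are known. A minimal sketch of that step, assuming Python 3 and only the standard library (under IronPython 2.x the equivalent functions live in urllib2 and urlparse): the function name download_links, the dest_dir parameter, and the "download.bin" fallback file name are hypothetical choices for illustration, not part of any library.

```python
import os
import shutil
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


def download_links(page_url, hrefs, dest_dir="."):
    """Download every href (resolved against page_url) into dest_dir."""
    saved = []
    for href in hrefs:
        # Turn relative links such as "files/a.zip" into absolute URLs.
        file_url = urljoin(page_url, href)
        # Assumption: the last path segment is a usable local file name.
        name = os.path.basename(urlparse(file_url).path) or "download.bin"
        target = os.path.join(dest_dir, name)
        with urlopen(file_url) as response, open(target, "wb") as out:
            shutil.copyfileobj(response, out)  # stream to disk in chunks
        saved.append(target)
    return saved
```

It could be called as download_links(page_url, links), where links is the list of href values extracted from the page. Two links that end in the same file name would overwrite each other, so a real script may want to make the names unique.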
Brian Lyttle
  • 1
    Beautiful Soup documentation in Chinese: http://www.crummy.com/software/BeautifulSoup/documentation.zh.html – jcao219 Jul 16 '10 at 01:06
1

The easiest way would be to pass the HTML page into an XML/HTML parser, and then call getElementsByTagName("a") on the root node. Once you have that list, iterate through it and pull out the href attribute of each element.
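
The same idea can be sketched with Python 3's built-in html.parser module, which is event-based rather than DOM-based, so the code below collects the a elements itself instead of calling getElementsByTagName. The names LinkCollector and find_download_links are hypothetical, and matching on the exact link text "Download" is an assumption taken from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (href, text) pairs
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_download_links(html, base_url, label="Download"):
    """Return absolute URLs of the links whose text equals label."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href)
            for href, text in parser.links if text == label]
```

Here html is the page source as a string and base_url is the address the page was fetched from, which is needed to resolve relative links.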

bluesmoon