I need to gather the PDF files from this page: http://www.anp.gov.br/?id=532.

How can I do this in Python when I can't find the links in the HTML source code? Previously I have found links to such files using BeautifulSoup and pandas.

Thanks for any answers!

  • Can you explain why you can't find the links in the HTML source code? I'm not sure I'm clear on the goal here. – Alex W Jul 07 '15 at 17:15
  • Hi, Alex W! The developers who made the page have not written the links directly in the HTML source code; they are generated when an item is clicked. I want these links so I can collect all the data and merge it into one Excel sheet. Thanks for the response, btw! – Mathias Lia Carlsen Jul 07 '15 at 17:18

1 Answer

It looks like all of the PDF links are in <a> tags, so you can use BeautifulSoup to grab them. If you need further advice, I recommend you reference this discussion to see how to accomplish that task.
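
As a rough sketch of that approach, something like the following could collect the links. The helper names (extract_pdf_links, fetch_pdf_links) and the sample markup are made up for illustration and are not taken from the page itself. It assumes the hrefs are present in the HTML that the server returns; if they are only created by JavaScript on click, they will not show up this way.

```python
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

from bs4 import BeautifulSoup


def extract_pdf_links(html, base_url):
    """Return absolute URLs of every <a href> whose path ends in .pdf."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.find_all("a", href=True):
        # Resolve relative hrefs against the page URL.
        url = urljoin(base_url, a["href"])
        # Compare the path only, case-insensitively, and skip duplicates.
        if urlparse(url).path.lower().endswith(".pdf") and url not in links:
            links.append(url)
    return links


def fetch_pdf_links(page_url):
    """Download page_url and extract its PDF links (makes a network request)."""
    with urlopen(page_url) as response:
        html = response.read()
    return extract_pdf_links(html, page_url)


# Hypothetical markup for illustration only; the real page may look different.
SAMPLE_HTML = """
<a href="/docs/report_2014.pdf">2014</a>
<a href="files/Report_2015.PDF">2015</a>
<a href="/?id=533">Other page</a>
"""

sample_links = extract_pdf_links(SAMPLE_HTML, "http://www.anp.gov.br/?id=532")
print(sample_links)
```

For the real page you would call fetch_pdf_links("http://www.anp.gov.br/?id=532") and then download each URL in the returned list.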


gffbss