I am trying to scrape the table of all recorded events from the website http://southasiaterrorism.trfetzer.com/districts/17497-IND-Nandurbar.html. I am using a Scrapy spider for it, but I cannot get the table because it seems to be loaded dynamically. I also tried Selenium, with no success: I got the same static HTML page without the table loaded. Any help would be greatly appreciated.

  • No, it's not loaded dynamically. Check the page source: inside a `script` tag there is a list of all those table elements, so just extract that. There's no need for Selenium here. – Stack Oct 25 '17 at 17:55
  • But I don't see why I got a downvote. Maybe it's simple for some people, but I'm a newbie at all of this. – Sirak Ghazaryan Oct 25 '17 at 19:00
  • It doesn't matter, just keep learning : ) @Sirak Ghazaryan – Stack Oct 26 '17 at 11:11

1 Answer

As @Stack mentioned, the content is not loaded dynamically; it is already in the page, inside the <script> tags. You can try something like this:

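from urllib.request import urlopen  # Python 3; on Python 2 use urllib2.urlopen
from bs4 import BeautifulSoup

url = "http://southasiaterrorism.trfetzer.com/districts/17497-IND-Nandurbar.html"
page = urlopen(url).read()
soup = BeautifulSoup(page, "html.parser")
# Skip the first two rows, then print the cells of each remaining row
for tr in soup.find_all('tr')[2:]:
    tds = tr.find_all('td')
    print(tds)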

From this question.

Note: this code is untested.
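
If the rows really do sit in a JavaScript list inside a <script> tag, as the comment suggests, a rough sketch of pulling them out could look like the following. Everything specific here is an assumption: the variable name `events`, the JSON-compatible array syntax, and the sample rows are made-up placeholders, so check the actual page source and adjust. It uses only the standard library.

```python
import json
import re

# Hypothetical stand-in for the downloaded page. The variable name "events"
# and the row layout are assumptions, not taken from the real site.
SAMPLE_HTML = """
<html><body>
<script>
var events = [["2010-01-05", "Nandurbar", "placeholder event A"],
              ["2011-03-12", "Nandurbar", "placeholder event B"]];
</script>
</body></html>
"""


def extract_rows(html, var_name="events"):
    """Return the list assigned to `var_name` in the first <script> that has it."""
    # Collect the body of every <script> element.
    scripts = re.findall(r"<script[^>]*>(.*?)</script>", html,
                         re.DOTALL | re.IGNORECASE)
    # Match `var <name> = [ ... ];` and capture the array literal.
    pattern = re.compile(
        r"var\s+" + re.escape(var_name) + r"\s*=\s*(\[.*?\])\s*;",
        re.DOTALL,
    )
    for text in scripts:
        match = pattern.search(text)
        if match:
            # Works only if the literal is valid JSON (double-quoted strings).
            return json.loads(match.group(1))
    return []


if __name__ == "__main__":
    for row in extract_rows(SAMPLE_HTML):
        print(row)
```

If the list in the real page is not valid JSON (single quotes, trailing commas, and so on), `json.loads` will raise an error and the literal would need cleaning up first.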

jdoe