I'm trying to use Scrapy to parse a relatively simple set of webpages. The main page has a bunch of links that look like:

<a name='LINK1$17' id='LINK1$17' tabindex='145' href="javascript:hAction_win0(document.win0,'LINK1$17', 0, 0, 'International Relations', false, true);"  class='SSSAZLINK'>International Relations</a>

Clicking that link loads the second page, which contains some of the details I'm scraping. I need to start on the first page because it serves as an index of everything I'm scraping. How do I use Selenium to run that JavaScript action? I've tried:

from selenium import webdriver
driver = webdriver.Firefox()
driver.execute_script("javascript:hAction_win0(document.win0,'LINK1$17', 0, 0, 'International Relations', false, true);")

That did not work. Is there an easy way to "click" the link and get what appears?

Leo Mizuhara
    You're trying to use selenium here for just clicking the link, right? Then, if the second page is loaded by ajax XHR request - take a look at [this thread](http://stackoverflow.com/questions/8550114/can-scrapy-be-used-to-scrape-dynamic-content-from-websites-that-are-using-ajax?lq=1). – alecxe Apr 22 '13 at 07:09
    So, basically you should use browser developer tools to see what request is going to the server when you're clicking the link. Then, you should simulate it in your crawler with the help of Scrapy's [Request](http://doc.scrapy.org/en/latest/topics/request-response.html). – alecxe Apr 22 '13 at 07:11

1 Answer

Turns out I was using the right function. The following call works:

driver.execute_script("hAction_win0(document.win0,'LINK1$17', 0, 0, 'International Relations', false, true);")

I just had to remove the "javascript:" at the beginning.
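For anyone who wants to do this for every link on the index page, here is a minimal sketch of the same fix as a reusable helper. The names strip_js_scheme and click_js_link are hypothetical (they are not part of Selenium), and the helper only assumes it is handed an object with an execute_script method, such as a Selenium WebDriver that has already loaded the index page.

```python
def strip_js_scheme(href):
    """Return the script part of a 'javascript:' href, or the href unchanged.

    Hypothetical helper: it only removes the leading 'javascript:' scheme,
    which is the fix described above.
    """
    prefix = "javascript:"
    cleaned = href.strip()
    if cleaned.lower().startswith(prefix):
        return cleaned[len(prefix):]
    return cleaned


def click_js_link(driver, href):
    """Run the script stored in a link's href through the given driver.

    `driver` is assumed to expose execute_script(script), as a Selenium
    WebDriver does. The page containing the link must already be loaded.
    """
    return driver.execute_script(strip_js_scheme(href))


# Possible usage with a real browser (not executed here; the URL is a placeholder):
#   from selenium import webdriver
#   driver = webdriver.Firefox()
#   driver.get(index_url)            # index_url: the first page in the question
#   click_js_link(driver, href)      # href: the link's href attribute value
#   html = driver.page_source        # markup of whatever the click loaded
```

Once the call returns, driver.page_source can be passed to whatever parsing code you already have, although the second page may need a moment to finish loading first.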

Leo Mizuhara