
I want to disable JavaScript while scraping with Scrapy and Selenium. The motive is to increase scraping speed. I found the preference for the Firefox driver but not for PhantomJS.

from selenium import webdriver

firefox_profile = webdriver.FirefoxProfile()
firefox_profile.set_preference("javascript.enabled", False)

driver = webdriver.Firefox(firefox_profile=firefox_profile)
driver.get('http://www.quora.com/')

How can this be done for PhantomJS webdriver?

Artjom B.
aman

2 Answers


PhantomJS implements the WebDriver protocol in pure JavaScript, in a component known as GhostDriver. It makes heavy use of page.evaluate() to access the DOM, and there is really no other way to access the DOM, interact with the page, or do anything meaningful with PhantomJS. With JavaScript disabled, most WebDriver commands are therefore likely to stop working. You shouldn't do this.

If you still want to go through with it, this should work:

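from selenium import webdriver

# Copy the defaults so the shared class-level dict is not mutated
cap = webdriver.DesiredCapabilities.PHANTOMJS.copy()
cap["phantomjs.page.settings.javascriptEnabled"] = False
driver = webdriver.PhantomJS(desired_capabilities=cap)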
Artjom B.
  • It looks like you're passing `desired_capabilities` into the driver's constructor. What if you want javascript on for some pages and off for others? How can you set `desired_capabilities` between calls? – speedplane Oct 14 '16 at 12:53
  • @speedplane Here's a hint: you can change it through [PhantomJS' API of `page.settings.javascriptEnabled`](http://phantomjs.org/api/webpage/property/settings.html). For that, you need to be able to execute scripts in the context of PhantomJS and not in the context of the page (see my answer [here](http://stackoverflow.com/a/32192382/1816580)). – Artjom B. Oct 16 '16 at 14:29
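The hint in the last comment can be sketched roughly as follows. This is an untested sketch, assuming a Selenium version that still ships the PhantomJS driver and assuming GhostDriver's non-standard `/phantom/execute` endpoint, where `this` inside the script refers to the current PhantomJS page object. The helper names `build_toggle_payload` and `set_page_javascript` and the command label `executePhantomScript` are made up for illustration, and `_commands` is a private attribute of Selenium's remote connection that may change between versions.

```python
# Sketch only: the endpoint is GhostDriver-specific; the command name is an arbitrary label.
PHANTOM_COMMAND = "executePhantomScript"
PHANTOM_ENDPOINT = ("POST", "/session/$sessionId/phantom/execute")


def build_toggle_payload(enabled):
    """Build the request body that sets page.settings.javascriptEnabled."""
    flag = "true" if enabled else "false"
    # Assumption: inside a GhostDriver phantom script, `this` is the current page.
    script = "this.settings.javascriptEnabled = %s;" % flag
    return {"script": script, "args": []}


def set_page_javascript(driver, enabled):
    """Ask PhantomJS itself (not the loaded page) to switch JavaScript on or off."""
    # Register the non-standard command on the remote connection (private attribute).
    driver.command_executor._commands[PHANTOM_COMMAND] = PHANTOM_ENDPOINT
    return driver.execute(PHANTOM_COMMAND, build_toggle_payload(enabled))
```

One would presumably call `set_page_javascript(driver, False)` before `driver.get(...)` for pages that do not need scripts and `set_page_javascript(driver, True)` before the others. Whether the setting takes effect mid-session, and whether WebDriver commands keep working while it is off, is not verified here.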

If the site does not require JavaScript, just use Scrapy alone. There is no need for Selenium. Scrapy is extremely fast for non-JavaScript pages.

Eric Valente
  • I need the Selenium webdriver for retrieving CSS-style elements. Can you suggest other tools in place of Selenium? I can't get CSS properties by just using Scrapy. I don't think I require JavaScript. – aman Aug 20 '15 at 18:00