
My scenario: I'm running web automation with Selenium WebDriver to collect data and navigate a website dynamically. Sometimes a JavaScript file containing irrelevant code takes more than a minute to load, which slows down all of my code, because when Selenium WebDriver loads a page it waits for every file to finish loading.

Since I can't change or modify the page's source code, I haven't found a solution. The closest workaround I've found is to install an extension (such as AdBlock) in my ChromeDriver instance to block the file.

Here is what I have done so far with the AdBlock extension:

ChromeOptions option = new ChromeOptions();
option.AddExtension("/adblock.crx");
Driver = new ChromeDriver(option);
// At this point I have to add the block rule for the file manually
// once the Chrome window opens (that part is not a problem)

Driver.Manage().Window.Maximize();
Driver.Navigate().GoToUrl(myUrl);
// myUrl is any page that references the .js file I don't want to download

Neither the Selenium WebDriver documentation nor the Capabilities list mentions a method or function that can ignore or block a specific file from loading, the way AdBlock or a similar extension does. So I would like to know whether it is possible to do this without using external extensions.

Striter Alfa
  • Is it better now? – Striter Alfa Jan 22 '18 at 16:27
  • try a blackhole proxy – Corey Goldberg Jan 22 '18 at 19:08
  • Possible duplicate of [How to make Selenium not wait till full page load, which has a slow script?](https://stackoverflow.com/questions/44770796/how-to-make-selenium-not-wait-till-full-page-load-which-has-a-slow-script) – undetected Selenium Jan 22 '18 at 19:39
  • You can have a look at this Q&A: https://stackoverflow.com/questions/43734797/page-load-strategy-for-chrome-driver/43737358#43737358 – undetected Selenium Jan 22 '18 at 19:53
  • @DebanjanB Your link about `pageLoadStrategy` probably fits what I need (I also upvoted the answer). I am going to adapt my code to implement it (changing all the wait events, for example), and if it works, I will close my question. If not, I will update it – Striter Alfa Jan 22 '18 at 21:12
  • @CoreyGoldberg I will also read more about the blackhole proxy pattern; it looks like another good solution and may solve the problem – Striter Alfa Jan 22 '18 at 21:16

1 Answer


Simple answer: no.

WebDriver is designed to emulate a browser with default settings, and by default a browser loads everything asked of it, runs all JavaScript, and renders all CSS.

There is one possible option, depending on where the JavaScript is located. If the JS file is on a unique server (not the one hosting the site you're trying to scrape), you can edit your hosts file on your computer to null out attempts to reach that server.

You can find more/better documentation elsewhere but the gist is to add a line to your hosts file like so:

127.0.0.1    problem_server.com

This will not work if:

  • The .js file is on the same server as the rest of the site
  • The .js file is on the same server as other files you need to work properly

If that is the case, you need to stick with something more granular like AdBlock.
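
Whether the hosts-file trick can apply at all depends on the first condition above, and that part can be checked up front. The following is only a minimal sketch: the class name `HostsFileCheck`, the method `CanBlockViaHostsFile`, and the example URLs are hypothetical and not part of Selenium. It compares the host of the page with the host of the slow script. It cannot tell whether that host also serves other files the page needs, so the second condition still has to be checked by hand.

```csharp
using System;

public static class HostsFileCheck
{
    // Hypothetical helper: returns true when the script is served from a
    // different host than the page. Only in that case can a hosts-file entry
    // block the script without also blocking the site itself.
    public static bool CanBlockViaHostsFile(string pageUrl, string scriptUrl)
    {
        string pageHost = new Uri(pageUrl).Host;
        string scriptHost = new Uri(scriptUrl).Host;
        return !string.Equals(pageHost, scriptHost, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // Example URLs are placeholders, not taken from the question.
        Console.WriteLine(CanBlockViaHostsFile(
            "https://www.example.com/index.html",
            "https://cdn.example.net/slow.js"));   // prints True

        Console.WriteLine(CanBlockViaHostsFile(
            "https://www.example.com/index.html",
            "https://www.example.com/js/slow.js")); // prints False
    }
}
```

If the check returns false, the hosts file is not an option and a URL-level blocker such as AdBlock is still needed.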

MivaScott