I am using Selenium with Java to scrape web pages. The app creates a single WebDriver and reuses it for every page it needs: every 1 or 2 seconds it makes a get() call for a new page and extracts the relevant content.
I am running Firefox in headless mode like this:
// Path to the geckodriver executable, read from the app config
String driverPath = this.config.getString("browser.firefox.driverPath");
FirefoxBinary firefoxBinary = new FirefoxBinary();
if (useHeadlessMode) {
    firefoxBinary.addCommandLineOptions("--headless");
}
System.setProperty("webdriver.gecko.driver", driverPath);
FirefoxOptions firefoxOptions = new FirefoxOptions();
firefoxOptions.setBinary(firefoxBinary);
// This single driver instance is reused for every get() call
webDriver = new FirefoxDriver(firefoxOptions);
I have noticed that after the app has been running for about 2 hours, it uses up to 8 GB of memory, and each get() call becomes extremely slow (around 10 seconds).
My question: am I missing some configuration when creating the WebDriver? Or is there another way to keep memory usage low? This matters because I am planning to launch many drivers (~100 WebDrivers) once the app is deployed to the cloud.
The solution I am considering is to call driver.quit() on the current driver after a certain number of operations and initialize a new one. Does this sound reasonable?
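To make the idea concrete, here is a rough sketch of the recycling I have in mind. The class name RecyclingHolder and the maxUses threshold are made up for illustration and are not part of Selenium. In my app the factory would be something like () -> new FirefoxDriver(firefoxOptions) and the disposer would be WebDriver::quit; the sketch is generic so that it does not depend on the Selenium classes.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical helper: holds one long-lived resource (e.g. a WebDriver)
 * and replaces it with a fresh instance after a fixed number of uses.
 */
class RecyclingHolder<T> {
    private final Supplier<T> factory;  // assumed: () -> new FirefoxDriver(firefoxOptions)
    private final Consumer<T> disposer; // assumed: WebDriver::quit
    private final int maxUses;          // illustrative threshold, to be tuned by measuring
    private T current;
    private int uses;

    RecyclingHolder(Supplier<T> factory, Consumer<T> disposer, int maxUses) {
        if (maxUses < 1) {
            throw new IllegalArgumentException("maxUses must be >= 1");
        }
        this.factory = factory;
        this.disposer = disposer;
        this.maxUses = maxUses;
    }

    /** Returns the current instance, disposing and recreating it once it has been used maxUses times. */
    synchronized T acquire() {
        if (current != null && uses >= maxUses) {
            disposer.accept(current); // release the old instance before creating a new one
            current = null;
        }
        if (current == null) {
            current = factory.get();
            uses = 0;
        }
        uses++;
        return current;
    }

    /** Disposes the current instance, if any; call this on shutdown. */
    synchronized void close() {
        if (current != null) {
            disposer.accept(current);
            current = null;
        }
    }
}
```

Each page fetch would then go through holder.acquire().get(url) instead of using the shared webDriver field directly, so the browser is restarted every maxUses pages.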