3

I want to use Selenium for web scraping from R.

  • My Windows version: Windows 11, 21H2
  • I have the latest Java update (1.8.0_351). I mention it because I've seen that updating Java can be a fix in these cases.

However, when defining the driver object I get the following error:

Could not open chrome browser.
Client error message:
Undefined error in httr call. httr output: Failed to connect to localhost port 14415: Connection refused
Check server log for further details.
Warning message:
In rsDriver(browser = "chrome", chromever = "109.0.5414.74", verbose = FALSE,  :
  Could not determine server status. 

When checking the server log for further details I get:

Could not find or load main class c(-Dwebdriver.chrome.driver=\"C:\\\\Users\\\\xherr\\\\AppData\\\\Local\\\\binman\\\\binman_chromedriver\\\\win32\\\\109.0.5414.74.chromedriver.exe\","

Here's my code:

library(tidyverse)
library(RSelenium)
library(netstat)
library(Rcpp)
library(wdman)

binman::list_versions("chromedriver")


rdriver <- rsDriver(browser = "chrome",
                    chromever = "109.0.5414.74",
                    verbose = TRUE,
                    port = free_port())

rdriver$server$log()

Does anyone know how to fix this? Thank you very much.

Etxeberri

5 Answers

2

The chromedriver version "109.0.5414.74" now includes an additional file that confuses wdman/binman. Using an earlier version such as chromever = "108.0.5359.71" will work. Alternatively, you can keep using the newer drivers by finding your chromedriver path with selenium(retcommand = T) and deleting the LICENSE.chromedriver files.

  • Great answer. Although, could you elaborate on "..finding your chrome driver path using selenium(retcommand = T)"? I already knew where the drivers were, but am curious what the exact syntax you were using to find them? I am unfamiliar with this "retcommand" you mentioned. – PaulStock Jan 23 '23 at 19:31
  • Yes, sorry. selenium() is a function that is part of the wdman package. It will output the shell command that wdman uses to find the driver. You will see that the issue is that the function output contains an R vector "c()" of two files (caused by a regex ambiguity in read.files() and both the driver and the license file ending in "CHROMEDRIVER"). I found information about retcommand in other issues on SO and in the vignette for wdman. https://cran.r-project.org/web/packages/wdman/vignettes/basics.html – bingbongtelecom Jan 25 '23 at 22:49
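
The ambiguity described in the comment above can be illustrated with a small sketch. The loose pattern below is an assumption made for illustration only, not a quote from the wdman source; it simply shows how a pattern that accepts any name ending in "chromedriver" would pick up both the driver and the license file, while an anchored pattern would not.

```r
# File names as they might appear in a chromedriver 109 folder on Windows.
files <- c("chromedriver.exe", "LICENSE.chromedriver")

# Hypothetical loose pattern (an assumption, not taken from wdman):
# matches any name that ends in "chromedriver" or "chromedriver.exe".
loose_pattern <- "chromedriver($|\\.exe$)"
loose_hits <- files[grepl(loose_pattern, files)]

# Anchoring the pattern at the start of the name excludes the license file.
strict_pattern <- "^chromedriver(\\.exe)?$"
strict_hits <- files[grepl(strict_pattern, files)]

print(loose_hits)   # both file names
print(strict_hits)  # only the driver executable
```

When two paths come back instead of one, they end up in the Java command as an R vector, which is consistent with the "c(-Dwebdriver.chrome.driver=..." text in the server log from the question.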
0

I have experienced exactly the same issue since I upgraded to chromedriver 109.0.5414.74. It might be a problem with the latest chromedriver version.

The only way I got rsDriver to run again was by replacing 109.0.5414.74 with 108.0.5359.71 in your code.

Try:

library(tidyverse)
library(RSelenium)
library(netstat)
library(Rcpp)
library(wdman)
    
binman::list_versions("chromedriver")
    
    
rdriver <- rsDriver(browser = "chrome",
chromever = "108.0.5359.71",
verbose = TRUE,
port = free_port())
    
rdriver$server$log()
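
If you would rather not hard-code the version, here is a minimal sketch that picks the newest installed version below 109 from a vector of version strings. The helper name pick_driver_version and the sample vector are hypothetical; in practice the vector would come from binman::list_versions("chromedriver"), assuming its result can be flattened with unlist().

```r
# Hypothetical helper: return the newest version strictly below `below`,
# or NA if no installed version qualifies.
pick_driver_version <- function(versions, below = "109") {
  parsed <- numeric_version(versions)
  candidates <- parsed[parsed < numeric_version(below)]
  if (length(candidates) == 0) {
    return(NA_character_)
  }
  as.character(max(candidates))
}

# Example values only; replace with your own installed versions, e.g.
# installed <- unlist(binman::list_versions("chromedriver"))  # assumed shape
installed <- c("107.0.5304.62", "108.0.5359.71", "109.0.5414.74")

chosen <- pick_driver_version(installed)
print(chosen)
```

The result can then be passed as chromever to rsDriver().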
jffj
0

Execute this code in a separate script:

library(wdman) 
selenium(retcommand=T)

In the console output, find the path to the LICENSE.chromedriver file on your system. This file is present in the chromedriver folders for version 109 and above.

Delete LICENSE.chromedriver.

Your code should now work again when you run rsDriver().
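
The steps above can also be scripted. This is only a sketch: remove_license_files is a hypothetical helper, and it runs here against a throwaway folder that imitates the binman layout seen in the question's log, so nothing real is deleted. It defaults to a dry run that only lists the files it finds.

```r
# Hypothetical helper: find LICENSE.chromedriver files below `driver_root`
# and delete them unless `dry_run` is TRUE. Returns the paths it found.
remove_license_files <- function(driver_root, dry_run = TRUE) {
  licenses <- list.files(driver_root,
                         pattern = "^LICENSE\\.chromedriver$",
                         recursive = TRUE, full.names = TRUE)
  if (!dry_run && length(licenses) > 0) {
    file.remove(licenses)
  }
  licenses
}

# Demonstration on a temporary folder that mimics the assumed layout
# <root>/win32/<version>/ (illustrative only).
demo_root <- file.path(tempdir(), "binman_chromedriver_demo")
version_dir <- file.path(demo_root, "win32", "109.0.5414.74")
dir.create(version_dir, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(version_dir, c("chromedriver.exe", "LICENSE.chromedriver")))

found <- remove_license_files(demo_root, dry_run = FALSE)
remaining <- list.files(version_dir)
print(remaining)
```

On a real system you would pass the chromedriver folder reported by selenium(retcommand=T) instead of demo_root, and it is worth running once with dry_run = TRUE first to check what would be removed.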

0

You can consider the following two approaches:

library(RSelenium)
shell('docker run -d -p 4445:4444 selenium/standalone-firefox')
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4445L, browserName = "firefox")
remDr$open()
remDr$navigate("https://www.google.com/")

and

library(RSelenium)
library(wdman)

url <- "https://www.google.com/"
port <- as.integer(4444L + rpois(lambda = 1000, 1))
pJS <- wdman::phantomjs(port = port)
remDrPJS <- remoteDriver(browserName = "phantomjs", port = port)
remDrPJS$open()
remDrPJS$navigate(url)

You need to install Docker for the first approach. It is easy to install.

Emmanuel Hamel
-1

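I recommend using 'remoteDriver' instead of 'rsDriver':

library(RSelenium)

# List the chromedriver versions that binman has downloaded
binman::list_versions(appname = 'chromedriver')

# Start chromedriver directly on a fixed port, using a pre-109 version
driver <- wdman::chrome(port = 4576L, version = '108.0.5359.71')

# Connect to the running chromedriver and open a browser session
remote <- remoteDriver(port = 4576L, browserName = 'chrome')
remote$open()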
jaechang