I am working with the R programming language.
I am trying to follow this tutorial to learn how to use Selenium from R: https://thatdatatho.com/tutorial-web-scraping-rselenium/
I first tried to get everything set up:
library(RSelenium)
library(tidyverse)
driver <- RSelenium::rsDriver(browser = "chrome",
                              chromever =
                                system2(command = "wmic",
                                        args = 'datafile where name="C:\\\\Program Files (x86)\\\\Google\\\\Chrome\\\\Application\\\\chrome.exe" get Version /value',
                                        stdout = TRUE,
                                        stderr = TRUE) %>%
                                stringr::str_extract(pattern = "(?<=Version=)\\d+\\.\\d+\\.\\d+\\.") %>%
                                magrittr::extract(!is.na(.)) %>%
                                stringr::str_replace_all(pattern = "\\.",
                                                         replacement = "\\\\.") %>%
                                paste0("^", .) %>%
                                stringr::str_subset(string =
                                                      binman::list_versions(appname = "chromedriver") %>%
                                                      dplyr::last()) %>%
                                as.numeric_version() %>%
                                max() %>%
                                as.character())
remote_driver <- driver[["client"]]
remote_driver$navigate("https://www.latlong.net/convert-address-to-lat-long.html")
But this gave the following error:
[1] "Connecting to remote server"
Could not open chrome browser.
Client error message:
Undefined error in httr call. httr output: Failed to connect to localhost port 4567: Connection refused
Check server log for further details.
Warning message:
In RSelenium::rsDriver(browser = "chrome", chromever = system2(command = "wmic", :
Could not determine server status.
What I tried: To troubleshoot this error, I consulted several references (e.g. "can't execute rsDriver (connection refused)") and then tried a Docker-based suggestion:
shell('docker pull selenium/standalone-firefox')
shell('docker run -d -p 4445:4444 selenium/standalone-firefox')
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4445L, browserName = "firefox")
remDr$open()
remDr$navigate("http://www.google.com/ncr")
remDr$getTitle()
But I get the following errors:
> shell('docker pull selenium/standalone-firefox')
Using default tag: latest
latest: Pulling from selenium/standalone-firefox
Digest: sha256:b6d8279268b3183d0d33e667e82fec1824298902f77718764076de763673124f
Status: Image is up to date for selenium/standalone-firefox:latest
docker.io/selenium/standalone-firefox:latest
What's Next?
View summary of image vulnerabilities and recommendations → docker scout quickview selenium/standalone-firefox
> shell('docker run -d -p 4445:4444 selenium/standalone-firefox')
57c117ceca095eb96113804cb8db3c64c197499af26c387b96d141d7d5a445df
> remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4445L, browserName = "firefox")
> remDr$open()
[1] "Connecting to remote server"
Error in checkError(res) :
Undefined error in httr call. httr output: Empty reply from server
> remDr$navigate("http://www.google.com/ncr")
Error in checkError(res) :
Undefined error in httr call. httr output: length(url) == 1 is not TRUE
> remDr$getTitle()
Error in checkError(res) :
Undefined error in httr call. httr output: length(url) == 1 is not TRUE
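One thing I could check is whether the Selenium server inside the container is actually ready before I call `remDr$open()`. Below is a minimal sketch of such a check in base R. It rests on two assumptions of mine: that the container may need a few seconds after `docker run` before it answers, and that the server exposes a status page at `/status` on the mapped port (older Selenium 3 images may use `/wd/hub/status` instead). The helper names `selenium_status_url` and `wait_for_selenium` are hypothetical and not part of RSelenium.

```r
# Build the status URL for a Selenium server.
# The default path "/status" is an assumption; Selenium 3 images may need
# "/wd/hub/status" instead.
selenium_status_url <- function(host = "localhost", port = 4445L, path = "/status") {
  sprintf("http://%s:%d%s", host, as.integer(port), path)
}

# Poll the status URL until it returns some content or the attempts run out.
# Returns TRUE if the server answered, FALSE otherwise.
wait_for_selenium <- function(status_url, attempts = 10, delay = 2) {
  for (i in seq_len(attempts)) {
    ok <- tryCatch(
      length(readLines(status_url, warn = FALSE)) > 0,
      warning = function(w) FALSE,  # e.g. "cannot open URL"
      error = function(e) FALSE     # e.g. connection refused, empty reply
    )
    if (isTRUE(ok)) return(TRUE)
    Sys.sleep(delay)
  }
  FALSE
}
```

I would call `wait_for_selenium(selenium_status_url(port = 4445L))` right after the `docker run` line and only call `remDr$open()` if it returns `TRUE`. If it returns `FALSE`, the container log (`docker logs` followed by the container ID printed by `docker run`) might show why the server is not answering.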
Can someone please suggest what to do from here? In general, is there some other way I can do this?
Thanks!
Notes:
Screenshots from Docker:
Here is the R session info:
> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252 LC_NUMERIC=C LC_TIME=English_Canada.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_1.0.9
- Additional References: How to set up rselenium for R?, https://www.youtube.com/watch?v=GnpJujF9dBw