
I've been trying to download an Excel file (.xls) from a particular website. My full R code is below (run after setting up a Docker container).

ePrefs = makeFirefoxProfile(
  list(
    browser.download.dir = "/home/seluser/Downloads",
    "browser.download.folderList" = 2L,
    "browser.download.manager.showWhenStarting" = FALSE,
    "browser.helperApps.neverAsk.saveToDisk" = "application/vnd.ms-excel, 
    application/xls, application/x-xls, application/vnd-xls, 
    application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
  ))

remDr = remoteDriver(extraCapabilities = ePrefs, port = 4445)
remDr$open()
remDr$navigate("https://www.aeaweb.org/joe/listings?")

webelem1 = remDr$findElement(using = 'id', "published-date")
webelem1$clickElement()

webelem2 = remDr$findElement("css", "[value = 'week']")
webelem2$clickElement()

webelem3 = remDr$findElement("css", "[value = 'Apply Filter']")
webelem3$clickElement()
Sys.sleep(10)

webelem4 = remDr$findElement("css", "[feature = 'download']")
webelem4$clickElement()

webelem5 = remDr$findElement("xpath",
                             "/html/body/main/div/section/div/div[2]/div[2]/div/ul/li[3]/a")
webelem5$clickElement()

Everything works fine until the last step: on the final click, the Selenium browser still opens the usual dialog asking whether I want to save or open the file, even though the ePrefs profile in the code sets the preferences that should override it.

I manually downloaded the file that the last click should download directly and verified that its content type is application/vnd.ms-excel. Is there something I am doing wrong? Any help is appreciated.
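One way to double-check this is to read the Content-Type header the server sends for the link itself, since the type inferred from a saved file can differ from the type the server declares. The following is only a minimal sketch: it assumes the httr package is installed, that the link can be fetched without the browser session's cookies, and that the server answers a HEAD request the same way as a GET (none of which is confirmed here). `mime_from_header` and `declared_mime` are hypothetical helper names.

```r
# Hypothetical helper: strip parameters such as "; charset=utf-8" from a
# Content-Type header value and normalise case and surrounding whitespace.
mime_from_header <- function(content_type) {
  tolower(trimws(strsplit(content_type, ";", fixed = TRUE)[[1]][1]))
}

# Hypothetical helper: request only the headers of `url` and return the MIME
# type the server declares. Needs the httr package (an assumption).
declared_mime <- function(url) {
  resp <- httr::HEAD(url)
  mime_from_header(httr::headers(resp)[["content-type"]])
}

# Example with a literal header value (no network access needed):
mime_from_header("Application/Force-Download; charset=UTF-8")
```

With the session from the code above, the link target should be available as `webelem5$getElementAttribute("href")[[1]]`; whatever `declared_mime()` returns for it is the type that has to appear in browser.helperApps.neverAsk.saveToDisk.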

  • When you started your Docker container, did you map the download locations between HOST and container? See https://stackoverflow.com/questions/42293193/rselenium-on-docker-where-are-files-downloaded and https://stackoverflow.com/questions/42607389/download-file-with-rselenium-docker-toolbox – jdharrison Jul 09 '17 at 06:03

1 Answer


The MIME type the server returns is application/force-download. Add it to your browser.helperApps.neverAsk.saveToDisk list and make sure the HOST and container download locations are mapped. With both in place, the following works for me:

# initiate docker container mapping download locations
# here HOST is linux
# docker run -d -p 4445:4444 -p 5901:5900 -v /home/john/test:/home/seluser/Downloads selenium/standalone-firefox-debug:2.53.1

library(RSelenium)
ePrefs <- makeFirefoxProfile(
  list(
    browser.download.dir = "/home/seluser/Downloads",
    "browser.download.folderList" = 2L,
    "browser.download.manager.showWhenStarting" = FALSE,
    "browser.helperApps.neverAsk.saveToDisk" = "application/vnd.ms-excel, 
    application/xls, application/x-xls, application/vnd-xls, 
    application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,
    application/force-download"
  ))

remDr <- remoteDriver(extraCapabilities = ePrefs, port = 4445)
remDr$open()
remDr$navigate("https://www.aeaweb.org/joe/listings?")

webelem1 <- remDr$findElement(using = 'id', "published-date")
webelem1$clickElement()

webelem2 <- remDr$findElement("css", "[value = 'week']")
webelem2$clickElement()

webelem3 <- remDr$findElement("css", "[value = 'Apply Filter']")
webelem3$clickElement()
Sys.sleep(10)

webelem4 <- remDr$findElement("css", "[feature = 'download']")
webelem4$clickElement()

webelem5 <- remDr$findElement("xpath",
                              "/html/body/main/div/section/div/div[2]/div[2]/div/ul/li[3]/a")
webelem5$clickElement()

list.files("/home/john/test/")

> list.files("/home/john/test/")
[1] "joe_resultset.xls"
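
The preference string above spans several source lines, so the value R sends contains the line breaks and indentation between entries. That works here, but building the list from a character vector sidesteps any question about how Firefox treats that whitespace. A minimal sketch; `build_mime_pref` is a hypothetical helper name, not part of RSelenium:

```r
# Hypothetical helper: join MIME types into one comma-separated string
# with no embedded newlines or padding, dropping exact duplicates.
build_mime_pref <- function(types) {
  paste(unique(trimws(types)), collapse = ",")
}

mime_types <- c(
  "application/vnd.ms-excel",
  "application/xls",
  "application/x-xls",
  "application/vnd-xls",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  "application/force-download"  # the type the server actually returns
)

never_ask <- build_mime_pref(mime_types)
never_ask
```

The resulting string can be passed as the value of "browser.helperApps.neverAsk.saveToDisk" in the list given to makeFirefoxProfile.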
jdharrison