
(warning, Newbie, slowly learning R)

Hi There,

I'm trying to download data automatically from a website using R. The website uses SharePoint, and after I asked an earlier question (R download from aspx in https getting website instead of CSV), someone pointed me to RSelenium.

What I need is to download csv files from addresses like this: https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCountyExport.aspx?UNCONVENTIONAL_ONLY=false&INC_HOME_USE_WELLS=true&INC_NON_PRODUCING_WELLS=true&PERIOD=15AUGU&COUNTY=ALLEGHENY

But first I need to accept an agreement (a "click" that I do with RSelenium). Code here:

# Using RSelenium to save a file
## Install the package if needed
install.packages("RSelenium")
## Load the package and start the Selenium server
library("RSelenium")
checkForServer()
startServer()
# I had to start the server manually!
remDr <- remoteDriver()
remDr
remDr$open()
# Open the website and accept the conditions
remDr$navigate("https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Welcome/Agreement.aspx")
AgreeButton<-remDr$findElement(using = 'id', value="MainContent_AgreeButton")
AgreeButton$highlightElement()
AgreeButton$clickElement()

remDr$navigate("https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCountyExport.aspx?UNCONVENTIONAL_ONLY=false&INC_HOME_USE_WELLS=true&INC_NON_PRODUCING_WELLS=true&PERIOD=15AUGU&COUNTY=ALLEGHENY")

My problem is that I cannot find the RSelenium command for "save link as".

I figured that I need something like this:

CSVurl <- remDr$navigate("https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCountyExport.aspx?UNCONVENTIONAL_ONLY=false&INC_HOME_USE_WELLS=true&INC_NON_PRODUCING_WELLS=true&PERIOD=15AUGU&COUNTY=ALLEGHENY")

CSVurl$saveLinkAs(fileName)

Does this exist? Are there better ways to do this in R?

Thanks in advance

Pladiona
Hi @pladiona, this can be done by setting Firefox options; see http://stackoverflow.com/questions/21944016/download-file-from-internet-via-r-despite-the-popup/21958555#21958555 – jdharrison Nov 25 '15 at 13:21

1 Answer

`# Using RSelenium to save a file
## Load the package and start the Selenium server
library(RSelenium)
checkForServer()
startServer()
# I had to start the server manually!

# Firefox profile that saves these MIME types to disk without showing a dialog
cprof <- makeFirefoxProfile(list(
  "browser.helperApps.neverAsk.saveToDisk" = 'text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream',
  "browser.helperApps.neverAsk.openFile" = 'text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream'
))
remDr <- remoteDriver(extraCapabilities=cprof)
remDr$open()
#open website and accepting conditions
remDr$navigate("https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Welcome/Agreement.aspx")
AgreeButton<-remDr$findElement(using = 'id', value="MainContent_AgreeButton")
AgreeButton$highlightElement()
AgreeButton$clickElement()

remDr$navigate("https://www.paoilandgasreporting.state.pa.us/publicreports/Modules/Production/ProductionByCountyExport.aspx?UNCONVENTIONAL_ONLY=false&INC_HOME_USE_WELLS=true&INC_NON_PRODUCING_WELLS=true&PERIOD=15AUGU&COUNTY=ALLEGHENY")`

To access the file, you will have to look in Firefox's default download folder.
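
Picking the downloaded file up from R could look like the sketch below. It is only an illustration: `latest_csv` is a hypothetical helper, `~/Downloads` is an assumed location for the download folder, and it assumes the export is saved with a `.csv` extension, so adjust all three to your setup.

```r
# Return the path of the most recently modified CSV in a folder,
# or NA if the folder contains no CSV files.
# `download_dir` is an assumption: point it at wherever Firefox saves files.
latest_csv <- function(download_dir = "~/Downloads") {
  files <- list.files(download_dir, pattern = "\\.csv$",
                      full.names = TRUE, ignore.case = TRUE)
  if (length(files) == 0) return(NA_character_)
  # Compare modification times as plain numbers and keep the newest file
  files[which.max(as.numeric(file.info(files)$mtime))]
}

# Usage (run once the browser has finished downloading):
# production <- read.csv(latest_csv())
```

Sorting by modification time avoids having to guess the file name the site assigns to the export.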

If you get an error saying that R is not able to create cprof or not able to zip the contents, then you probably need to install Rtools.

Download it from here, and make sure to pick the release that matches the exact version of R that you have installed.
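
A quick way to see whether this is the problem is to check for a zip utility before building the profile. This is a rough sketch, assuming (as far as I know) that `makeFirefoxProfile` relies on a command-line `zip` program being on the PATH; `has_zip` is a hypothetical helper name.

```r
# TRUE if a `zip` executable is visible on the PATH of this R session
has_zip <- function() {
  unname(nzchar(Sys.which("zip")))
}

if (!has_zip()) {
  message("No zip utility found on the PATH; on Windows, installing Rtools usually provides one.")
}
```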

Hope this helps.

Bharath