I'm trying to download a file in R on a remote server that sits behind a number of proxies. Something (I can't figure out what) causes a cached copy of the file to be returned whenever I try to access it on that server, whether I do so through R or through a web browser.

I've tried using cacheOK=FALSE in my download.file call and this has had no effect.

Following the question "Is there a way to force browsers to refresh/download images?", I have tried adding a random suffix to the end of the URL:

download.file(url = paste("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip?",
                          format(Sys.time(), "%d%m%Y"),sep=""), 
              destfile = "F-F_Research_Data_Factors_daily.zip", cacheOK=FALSE)

This produces, e.g., the following URL:

http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip?17092012

When accessed from a web browser on the remote server, this URL does return the latest version of the file. However, when accessed using download.file in R, it returns a corrupted zip archive. Both WinRAR and R's unzip function complain that the zip file is corrupt.

unzip("F-F_Research_Data_Factors_daily.zip")
1: In unzip("F-F_Research_Data_Factors_daily.zip") :
internal error in unz code

I can't see why downloading this file via R would return a corrupted file, whereas downloading it via a web browser causes no problem.

Can anyone suggest either a way to beat the cache from R (about which I'm not hopeful), or a reason why download.file doesn't like my URL with ?someRandomString tacked onto the end of it?

Ina

1 Answer

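It will work if you use mode="wb". On Windows, download.file defaults to a text-mode transfer (mode="w"), which rewrites line endings and so corrupts binary files such as zip archives. R normally switches to a binary transfer by itself when mode is not supplied and the URL ends in a known binary extension such as .zip, which is probably why the plain URL worked. With the ?suffix appended, the URL no longer ends in .zip, so the mode has to be set explicitly: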

download.file(url = paste("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip?",
                          format(Sys.time(), "%d%m%Y"), sep=""),
              destfile = "F-F_Research_Data_Factors_daily.zip", mode="wb", cacheOK=FALSE)
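To make the two pieces of the fix easier to reuse, here is a minimal sketch that wraps them in helpers. The function names cache_busted_url and fetch_zip are hypothetical, not part of R, and the second-resolution timestamp token is an assumption (the question used a daily "%d%m%Y" token, which only changes the URL once per day). The sketch appends the throwaway query string, forces a binary transfer, and then reads the archive index as a quick sanity check.

```r
# Hypothetical helper: append a throwaway token as a query string so that
# intermediate proxies treat each request as a different resource.
cache_busted_url <- function(url, token = format(Sys.time(), "%Y%m%d%H%M%S")) {
  # Use "&" if the URL already carries a query string, "?" otherwise
  sep <- if (grepl("?", url, fixed = TRUE)) "&" else "?"
  paste0(url, sep, token)
}

# Hypothetical helper: download in binary mode, then list the archive contents.
fetch_zip <- function(url, destfile) {
  status <- download.file(url = cache_busted_url(url), destfile = destfile,
                          mode = "wb",      # binary transfer, needed on Windows
                          cacheOK = FALSE)
  # download.file returns 0 (invisibly) on success
  if (status != 0) stop("download failed with status ", status)
  # list = TRUE reads the archive index without extracting any files
  unzip(destfile, list = TRUE)
}
```

Calling fetch_zip("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip", "F-F_Research_Data_Factors_daily.zip") should then return a data frame describing the files in the archive if the download is intact.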
GSee
    @Ina, [here is a blog](http://timelyportfolio.blogspot.com/search/label/french) that uses data from Kenneth French -- may be useful to see how he downloads the files. – GSee Sep 18 '12 at 14:09