6

I have a URL and want to download the file it points to using R. I noticed that download.file should be helpful, but my problem seems different:

url <- "http://journal.gucas.ac.cn/CN/article/downloadArticleFile.do?attachType=PDF&id=11771"
destfile <- "myfile.pdf"
download.file(url, destfile)

It doesn't work! I noticed that if my URL ends in xxx.pdf, the code above works without a problem; otherwise the downloaded file is corrupt.

Does anyone know how to solve this problem?

nico
PepsiCo
  • Please define `it doesn't work`. I can download the file using `download.file` and open it using a PDF reader, so I cannot reproduce your problem. My first guess would be you are behind a web proxy... – Paul Hiemstra Nov 17 '13 at 07:03
  • I ran the code and got the file "myfile.pdf", but when I click on it, I cannot open it, so I think I may not have downloaded it correctly. – PepsiCo Nov 17 '13 at 07:09
  • The download works fine for me too, and I can open the PDF. The PDF is in Chinese, though, maybe you are simply missing Chinese fonts? – nico Nov 17 '13 at 12:12

2 Answers

9

Setting the mode might be required to treat the file as binary data while saving it. If I leave that argument out, I get a blank file, but this way works for me:

url <- "http://journal.gucas.ac.cn/CN/article/downloadArticleFile.do?attachType=PDF&id=11771"
destfile <- "myfile.pdf"
download.file(url, destfile, mode="wb")
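
A quick way to tell whether a saved file is really a PDF is to look at its first bytes: a valid PDF starts with the characters `%PDF`. The sketch below is only an illustration of that check. The helper name `looks_like_pdf` is made up for this example (it is not part of base R), and the demonstration uses temporary files rather than a real download.

```r
# Hypothetical helper (not part of base R): report whether a file
# starts with the PDF signature "%PDF".
looks_like_pdf <- function(path) {
  if (!file.exists(path)) return(FALSE)
  header <- readBin(path, what = "raw", n = 4L)
  identical(header, charToRaw("%PDF"))
}

# Stand-in for a correctly saved PDF: the file begins with "%PDF".
good <- tempfile(fileext = ".pdf")
writeBin(charToRaw("%PDF-1.4\n"), good)

# Stand-in for a failed download, e.g. an HTML error page saved as .pdf.
bad <- tempfile(fileext = ".pdf")
writeLines("<html>Not a PDF</html>", bad)

looks_like_pdf(good)  # TRUE
looks_like_pdf(bad)   # FALSE
```

After `download.file(url, destfile, mode="wb")`, calling `looks_like_pdf(destfile)` gives a rough hint only: a `FALSE` suggests the server sent something other than a PDF or the file was cut short, while a `TRUE` does not guarantee the rest of the file is intact.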
Gregor Thomas
Troy
  • Can you specify your operating system and version of R? The `mode` parameter is only used when the `internal` method is used. The default method is `auto` so I suspect that it may depend on the OS. I have no problem in downloading the file using R 3.0.2 under FC18 64bit. – nico Nov 17 '13 at 15:06
  • platform x86_64-w64-mingw32 arch x86_64 os mingw32 system x86_64, mingw32 status major 3 minor 0.2 year 2013 month 09 day 25 svn rev 63987 – Troy Nov 18 '13 at 11:11
  • I think probably it's to do with how the ISP transmits http binary streams, which would explain why it works for some people without enforcement of mode and not for others. – Troy Nov 18 '13 at 11:12
  • I also needed to add method='curl' to avoid getting a corrupt PDF. RStudio Version 0.98.1103, R 3.1.2 on Windows. – mfrellum Apr 08 '15 at 08:07
  • I'm using all these methods and it's still not creating a workable file for me. It fails when I try to open it. – SqueakyBeak Jun 08 '20 at 14:27
-2

I am trying to download an nc file with R. It downloads well but I get this error when trying to open it:

Error in R_nc4_open: NetCDF: Unknown file format
Error in nc_open("SM_D2010323_Map_SATSSS_data_1day.nc") :
  Error in nc_open trying to open file SM_D2010323_Map_SATSSS_data_1day.nc (return_on_error= FALSE )

url <- "https://www.star.nesdis.noaa.gov/data/socd1/coastwatch/products/miras/nc/SM_D2010323_Map_SATSSS_data_1day.nc"
destfile <- "***/SM_D2010323_Map_SATSSS_data_1day.nc"
download.file(url, destfile)
nc_data <- nc_open('SM_D2010323_Map_SATSSS_data_1day.nc')

But when I download the file from the same URL in my web browser, I can open it in R without any problems.
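
One possible explanation, in line with the `mode="wb"` answer above, is that the file was written in text mode and its bytes were altered on the way to disk. This has not been confirmed for this particular file. The sketch below shows one way to inspect the leading bytes: netCDF classic files start with `CDF`, and netCDF-4 files use the HDF5 signature. The helper name `nc_signature` is hypothetical, and the newline translation is simulated on temporary files rather than taken from a real download.

```r
# HDF5 signature used by netCDF-4 files: \x89 H D F \r \n \x1a \n
hdf5_magic <- as.raw(c(0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a))

# Hypothetical helper: guess the container format from the leading bytes.
nc_signature <- function(path) {
  bytes <- readBin(path, what = "raw", n = 8L)
  if (length(bytes) >= 8L && identical(bytes[1:8], hdf5_magic)) {
    "netCDF-4 (HDF5)"
  } else if (length(bytes) >= 3L && identical(bytes[1:3], charToRaw("CDF"))) {
    "netCDF classic"
  } else {
    "unknown"
  }
}

# An intact file: the signature is written byte for byte.
intact <- tempfile(fileext = ".nc")
writeBin(hdf5_magic, intact)

# Simulated text-mode damage: every 0x0a (LF) becomes 0x0d 0x0a (CR LF).
mangled_bytes <- raw(0)
for (b in as.integer(hdf5_magic)) {
  mangled_bytes <- c(mangled_bytes,
                     if (b == 0x0a) as.raw(c(0x0d, 0x0a)) else as.raw(b))
}
mangled <- tempfile(fileext = ".nc")
writeBin(mangled_bytes, mangled)

nc_signature(intact)   # "netCDF-4 (HDF5)"
nc_signature(mangled)  # "unknown"
```

If `nc_signature(destfile)` returns `"unknown"` for the downloaded file, it may be worth repeating the download with `download.file(url, destfile, mode = "wb")` and checking again.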