I am working on my company's RStudio Server and would like to access data published on the ONS website.

I wrote a few lines that build the correct URL, but I'm blocked when trying to read the file from that URL.

Here's a simplified example (i.e. with the URL hardcoded):

```r
library(gdata)
currUrl <- "http://www.ons.gov.uk/file?uri=/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/weeklyprovisionalfiguresondeathsregisteredinenglandandwales/2017/publishedweek302017.xls"
data <- read.xls(currUrl)
```

I get the following error:

> ERROR: The certificate of www.ons.gov.uk' is not trusted.
> ERROR: The certificate of www.ons.gov.uk' hasn't got a known issuer.
> Error parsing file '/tmp/RtmpXou27y/file386f520067bd.xls'.
> Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method,  :
>   Intermediate file '/tmp/RtmpXou27y/file386f4e7dd580.csv' missing!
> In addition: Warning messages:
> 1: In download.file(xls, tf, mode = "wb") :
>   download had nonzero exit status
> 2: running command ''/usr/bin/perl' '/home/nr/R/x86_64-pc-linux-gnu-library/3.2/gdata/perl/xls2csv.pl' '/tmp/RtmpXou27y/file386f520067bd.xls' '/tmp/RtmpXou27y/file386f4e7dd580.csv' '1'' had status 255
> Error in file.exists(tfn) : invalid 'file' argument

After some research on the website I tried a few different things, such as changing `http` to `https` and downloading the file before reading it, but nothing works. Downloading the file first fails with a similar certificate error.

  • Mmm... maybe something on your side. With R 3.3 I can read the file. Try on another computer. – Fabio Marroni Aug 11 '17 at 13:44
  • And `read.xls` is from which package? – Axeman Aug 11 '17 at 13:50
  • `read.xls` is from the gdata package. The idea comes from here: https://stackoverflow.com/questions/21738463/importing-excel-file-using-url-using-read-xls Does anybody know another way to import this file? I have no preference on the method. – naro Aug 11 '17 at 13:51
  • [related](https://stackoverflow.com/questions/41368628/read-excel-file-from-a-url-using-the-readxl-package) – Axeman Aug 11 '17 at 13:57
  • Thanks. That seems to work if I add `set_config(config(ssl_verifypeer = 0L))`. Without it I get "Error in curl::curl_fetch_disk(url, x$path, handle = handle) : Peer certificate cannot be authenticated with given CA certificates". If I understand correctly, I'm essentially bypassing a security check. Is there another way to do this? – naro Aug 11 '17 at 14:03
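
For reference, the download-then-read approach from the related question can be sketched without switching certificate verification off. This is a minimal sketch, not a confirmed fix for this server: it assumes the httr and readxl packages are installed, and that the server's administrator can point to a PEM file of trusted CA certificates. The helper name `read_xls_from_url`, its `ca_bundle` argument, and the bundle path in the usage comment are all hypothetical. `cainfo` is the libcurl option that names a CA bundle file, passed here through `httr::config()`.

```r
# Hypothetical helper: download an Excel file over HTTPS to a temporary
# file, then read it with readxl.
#
# url       : address of the .xls file
# ca_bundle : assumed path to a PEM file of trusted CA certificates;
#             when NULL, curl falls back to its default certificate store
# ...       : further arguments passed on to readxl::read_excel()
read_xls_from_url <- function(url, ca_bundle = NULL, ...) {
  tmp <- tempfile(fileext = ".xls")
  on.exit(unlink(tmp), add = TRUE)  # remove the temporary file on exit

  resp <- if (is.null(ca_bundle)) {
    httr::GET(url, httr::write_disk(tmp, overwrite = TRUE))
  } else {
    # Verification stays enabled; only the set of trusted CAs changes.
    httr::GET(url,
              httr::config(cainfo = ca_bundle),
              httr::write_disk(tmp, overwrite = TRUE))
  }

  # Turn HTTP errors (404, 500, ...) into R errors instead of parsing junk.
  httr::stop_for_status(resp)

  readxl::read_excel(tmp, ...)
}

# Example call, using the https form of the URL from the question.
# The bundle path below is an assumption (a common Debian/Ubuntu location):
# deaths <- read_xls_from_url(
#   sub("^http:", "https:", currUrl),
#   ca_bundle = "/etc/ssl/certs/ca-certificates.crt"
# )
```

If the certificate error persists even with a current CA bundle, the cause may be an intercepting corporate proxy; in that case the proxy's own root certificate would have to be in the bundle, which is something to confirm with whoever runs the server.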
