0

I am having trouble opening a .xls file using openxlsx in R.

The code is quite simple:

library(rvest)
library(openxlsx)

lnk = "https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/internationalmigration/datasets/populationoftheunitedkingdombycountryofbirthandnationalityunderlyingdatasheets"

pg   = read_html(lnk)
files  = pg %>% html_nodes(".js-show-hide__content") %>% 
         html_nodes("a") %>% html_attr('href')

files = files[grepl("\\.xls",files)]
files = paste0("https://www.ons.gov.uk",files)

test =  read.xlsx(files[1],sheet=3,startRow=6)

but I get the following error message:

Error in file(con, "r") : invalid 'description' argument
In addition: Warning message:
In unzip(xlsxFile, exdir = xmlDir) : error 1 in extracting from zip file

If I copy and paste the link into the browser it downloads fine. It also works fine with download.file(files[1],"C:/etc...",mode='wb') and opening in Excel, but I'd rather read straight into R. When I have searched the error message in this context, the answer has either been that it can't open because the sheet is open in Excel already (don't think that's it as I'm opening direct from url), or that there are foreign characters in the file name (also not it).

I imagine it is something to do with the worksheet being protected, but I don't remember that being an issue before. Any help much appreciated.

Euan Ritchie
  • 362
  • 4
  • 15
  • 1
    Are you sure the package supports that extension? It seems to be for xls when using as you are. Perhaps use a method from here: https://stackoverflow.com/questions/41368628/read-excel-file-from-a-url-using-the-readxl-package - note there is a method to use a temp file which you can later unlink. – QHarr Mar 26 '21 at 04:13
  • arghh how stupid, tried opening straight from R and you're right I get the message "openxlsx can not read .xls" Thanks for pointing this out, and for the link – Euan Ritchie Mar 26 '21 at 10:33
  • Oops I missed the all important x off the end but you got what I meant :-) – QHarr Mar 26 '21 at 10:34

0 Answers0