Can someone point me to a way of unzipping and opening .7z files through R?

Here is an example of a file that I want to download:

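 utils::download.file(
   url = "ftp://ftp.mtps.gov.br/pdet/microdados/RAIS/AC2008.7z",
   destfile = "AC2008.7z",  # destfile is a required argument
   mode = "wb")             # binary mode, so the archive is not corrupted on Windows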

All files that I want are .txt once unzipped.

If I try unzip("./AC2008.7z"), I get the message:

In unzip(fileName, exdir = mainDir, subDir) : error 1 in extracting from zip file

Any help?

I don't necessarily need to unzip the files. If R could somehow open the underlying .txt files directly, that would be fine.

The solution should be implementable within a function in a package.

  • I don't know of any package implementing 7zip for R - would it be acceptable to run utils available in your OS? If so, see https://stackoverflow.com/questions/16096192/how-to-programmatically-extract-unzip-a-7z-7-zip-file-with-r – Sirius Mar 29 '21 at 22:21
  • @Sirius my issue is that I need this to be done in a function within a package – Arthur Carvalho Brito Mar 29 '21 at 22:22
  • Well, that doesn't really rule it out. Do you have control or knowledge of the computers of the users of the package? You could advise them or give instructions on how to install 7zip in their OS for example. – Sirius Mar 29 '21 at 22:24
  • See if this doesn't help: https://stackoverflow.com/questions/16096192/how-to-programmatically-extract-unzip-a-7z-7-zip-file-with-r – Kat Mar 29 '21 at 22:27
  • [The archive package](https://github.com/jimhester/archive) will open 7zip format. It was removed from CRAN, so is only available via Github and requires Rtools (on Windows) for installation. By the way I am unable to access the FTP server with your example file. – neilfws Mar 29 '21 at 22:50
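
The suggestion in the comments to rely on a 7-Zip utility installed in the user's OS could look roughly like the following from inside a package function. This is a minimal sketch under stated assumptions, not a tested solution: the helper names `find_7z()` and `extract_7z()` are hypothetical, and it assumes a 7-Zip command-line binary named `7z`, `7za`, or `7zr` is on the user's `PATH`.

```r
# Hypothetical helper: locate a 7-Zip command-line binary on the PATH.
# The candidate names are assumptions; adjust them for your users' systems.
find_7z <- function(candidates = c("7z", "7za", "7zr")) {
  paths <- Sys.which(candidates)
  paths <- paths[nzchar(paths)]
  if (length(paths) == 0) {
    stop("No 7-Zip binary found on the PATH; please install 7-Zip.", call. = FALSE)
  }
  unname(paths[1])
}

# Hypothetical helper: extract a .7z archive into exdir and return the file paths.
extract_7z <- function(file, exdir = tempfile("unz7_")) {
  if (!file.exists(file)) {
    stop("File not found: ", file, call. = FALSE)
  }
  bin <- find_7z()
  dir.create(exdir, recursive = TRUE, showWarnings = FALSE)
  # "x" extracts with full paths, "-o<dir>" sets the output directory
  # (no space after -o), and "-y" answers yes to all prompts.
  status <- system2(
    bin,
    args = c("x", shQuote(file), paste0("-o", shQuote(exdir)), "-y"),
    stdout = FALSE, stderr = FALSE
  )
  if (status != 0) {
    stop("7-Zip exited with status ", status, call. = FALSE)
  }
  list.files(exdir, recursive = TRUE, full.names = TRUE)
}
```

A call such as `txt_files <- extract_7z("AC2008.7z")` would then return the paths of the extracted files, which can be read with any of the usual text readers. Failing early with a clear message when no binary is found lets the package tell users to install 7-Zip themselves.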

1 Answer

The archive package can read the 7zip format.

You will need the devtools package to install it from GitHub:

devtools::install_github("jimhester/archive")

I'm unable to access your example file on the FTP server. Assuming that it is a multi-file archive of .txt files, you would access it like this:

library(archive)
a <- archive("AC2008.7z")

Assuming it contained a file named x.txt with columns delimited by white space, you might do something like:

library(readr)
x <- read_table(archive_read(a, "x.txt"))
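
Since the question asks for something usable inside a package function, the steps above could be wrapped along these lines. This is only a sketch under assumptions: the function name `read_7z_txt()` and its `member` argument are hypothetical, the text file is assumed to be delimited by white space as in the example above, and both archive and readr are assumed to be installed on the user's machine.

```r
# Hypothetical wrapper: read one whitespace-delimited .txt file from a .7z archive.
# Assumes the archive and readr packages are installed (for example, declared
# in the package DESCRIPTION).
read_7z_txt <- function(file, member = NULL) {
  if (!file.exists(file)) {
    stop("File not found: ", file, call. = FALSE)
  }
  for (pkg in c("archive", "readr")) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop("Package '", pkg, "' is required but not installed.", call. = FALSE)
    }
  }
  # archive::archive() lists the contents, one row per file, with a `path` column.
  contents <- archive::archive(file)
  if (is.null(member)) {
    # Default to the first .txt file found in the archive.
    txt <- grep("\\.txt$", contents$path, value = TRUE, ignore.case = TRUE)
    if (length(txt) == 0) {
      stop("No .txt file found in ", file, call. = FALSE)
    }
    member <- txt[1]
  }
  # archive_read() returns a connection to a single file inside the archive.
  readr::read_table(archive::archive_read(file, member))
}
```

Calling `read_7z_txt("AC2008.7z")` would read the first .txt file in the archive without extracting it to disk; pass `member = "x.txt"` to pick a specific file.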
neilfws