
I'm trying to automate the extraction of a number of files compressed with 7-zip. I need to automate this process, because a) there are many years of data I'd like to unlock and b) I'd like to share my code with others and prevent them from repeating the process by hand.

I have both WinRAR and 7-zip installed on my computer, and I can individually open these files easily with either program.

I've looked at the `unzip`, `untar`, and `unz` functions, but I don't believe any of them do what I need.

I don't know anything about compression, but if it makes any difference: each of these files only contains one file and it's just a text file.

I would strongly prefer a solution that does not require the user to install additional software (like WinRAR or 7-Zip) and execute a command with shell, although I acknowledge this task might be impossible with just R and CRAN packages. I actually believe running shell.exec on these files with additional parameters might work on computers with WinRAR installed, but again, I'd like to avoid that installation if possible. :)

Running the code below will load the files I am trying to extract -- the .7z files in files.data are what needs to be unlocked.

# create a temporary file and temporary directory, download the file, extract the file to the temporary directory
tf <- tempfile() ; td <- tempdir()
file.path <- "ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2008_2009/Microdados/Dados.zip"
download.file( file.path , tf , mode = "wb" )
files.data <- unzip( tf , exdir = td )

# how do i unzip ANY of these .7z files?
files.data

Thanks!!! :)

Jaap
Anthony Damico
  • The best solution would be a package that could read and write 7z files using either the standard connection API or via temporary files on disk. But I don't think that package exists. – hadley Apr 19 '13 at 12:28
  • agreed. now i'm petitioning the folks at the brazilian census to follow @dirk's advice and re-post the files with a standard format :) thanks hadley! – Anthony Damico Apr 19 '13 at 12:32
  • The example you're showing is a PKZIP-compressed file, not a 7z-compressed file, so the standard `unzip()` would work. A related question is http://stackoverflow.com/questions/31146263/sys-glob-within-unzip – Dwight Spencer Jun 30 '15 at 19:22

2 Answers


This can be done with the archive package.

library(archive)
# download the zipped bundle to a temporary file
tf <- tempfile() ; td <- tempdir()
file.path <- "ftp://ftp.ibge.gov.br/Orcamentos_Familiares/Pesquisa_de_Orcamentos_Familiares_2008_2009/Microdados/Dados.zip"
download.file( file.path , tf , mode = "wb" )
# list the contents of the downloaded archive
archive(tf)

See https://github.com/jimhester/archive
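The snippet above only lists what is inside the downloaded archive. Extracting the `.7z` files themselves could look something like the sketch below. It assumes the archive package's `archive()` and `archive_extract()` functions, and `extract_7z()` is a hypothetical helper name of my own, not something the package provides.

```r
library(archive)

# Hypothetical helper (not part of the archive package): extract every .7z
# file found in `paths` into `exdir` and return the extracted file paths.
extract_7z <- function(paths, exdir = tempdir()) {
  # keep only the .7z entries, ignoring anything else in the vector
  seven_z <- paths[grepl("\\.7z$", paths, ignore.case = TRUE)]
  unlist(lapply(seven_z, function(p) {
    contents <- archive(p)            # one row per file stored in the archive
    archive_extract(p, dir = exdir)   # write those files into exdir
    file.path(exdir, contents$path)
  }))
}

# Usage, assuming files.data and td come from the code in the question:
# extracted <- extract_7z(files.data, td)
# readLines(extracted[1], n = 5)
```

Since each archive in the question holds a single text file, the returned vector should have one path per `.7z` file.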

jsta

If you have the 7z executable in your path, you can simply use the system command

system('7z e -o<output_dir> <archive_name>')
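If you want something a bit more defensive than a bare `system()` call, here is a rough sketch. It assumes `7z` is on the PATH; `seven_zip_args()` and `extract_with_7z()` are hypothetical helper names made up for illustration. The `-y` switch (answer yes to all prompts) is my addition and not part of the original command.

```r
# Hypothetical helper: build the argument vector for `7z e`.
# The output directory is glued directly onto -o, with no space in between.
seven_zip_args <- function(archive_name, output_dir) {
  c("e", paste0("-o", output_dir), "-y", archive_name)
}

# Hypothetical helper: run 7z and return the files now in output_dir.
extract_with_7z <- function(archive_name, output_dir = tempdir()) {
  exe <- Sys.which("7z")
  if (!nzchar(exe)) stop("7z executable not found on the PATH")
  # quote each argument so paths containing spaces survive the shell
  status <- system2(exe, shQuote(seven_zip_args(archive_name, output_dir)))
  if (status != 0) stop("7z exited with status ", status)
  list.files(output_dir, full.names = TRUE)
}

# Usage, assuming files.data and td come from the code in the question:
# extract_with_7z(files.data[1], td)
```

Checking `Sys.which()` first gives users without 7-Zip a clear error instead of a cryptic shell failure.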

CHP
  • Completely misses the requirement of "I would strongly prefer a solution that does not require the user to install additional software". – Dirk Eddelbuettel Apr 19 '13 at 11:19
  • @DirkEddelbuettel but short of doing everything by hand, it's the only thing that works, right? :( – Anthony Damico Apr 19 '13 at 11:59
  • @DirkEddelbuettel ..from what you and hadley are saying, it's the _only_ answer. :( why would i delete the thread? others might also benefit from knowing this task is impossible without installing external software – Anthony Damico Apr 19 '13 at 12:35
  • note that the command is `system('7z e -o<output_dir> <archive_name>')`. With a space between -o and the directory it fails! – 576i Nov 03 '16 at 13:54