unzip a tar.gz file?

Question

I wish to download and open the following tar.gz file in R:

http://s.wordpress.org/resources/survey/wp2011-survey.tar.gz

Is there a command which can accomplish this?

@Ramnath, much closer than joran's: maybe worth closing/merging ... — Ben Bolker, Aug 22 '11 at 18:45
Sorry for the ultra duplicates. I searched a bit before posting - but apparently not enough. My apologies. — Tal Galili, Aug 22 '11 at 21:49
For me archive_extract("tmp.tar.gz", files="wp2011-survey/anon-data.csv") from library(archive) is quite a bit faster than the in-built base R untar (especially for large archives) and it works very well on all platforms... You can also use it to read a csv directly from an archive without unpacking it using read_csv(archive_read("tmp.tar.gz", file = 3), col_types = cols()). It supports 'tar', 'ZIP', '7-zip', 'RAR', 'CAB', 'gzip', 'bzip2', 'compress', 'lzma' and 'xz' formats. So for me that would be the preferred option. — Tom Wenseleers, Jul 11 '22 at 15:29

Ben Bolker · Accepted Answer · 2022-07-11T15:36:19.180


url <- "http://s.wordpress.org/resources/survey/wp2011-survey.tar.gz"
## mode = "wb" keeps the binary archive intact on Windows
download.file(url, destfile = "tmp.tar.gz", mode = "wb")
untar("tmp.tar.gz", list = TRUE)  ## list the contents without extracting
untar("tmp.tar.gz")               ## extract everything into the working directory
## or, if you just want to extract the target file:
untar("tmp.tar.gz", files = "wp2011-survey/anon-data.csv")
X <- read.csv("wp2011-survey/anon-data.csv")

Tom Wenseleers points out that the archive package can help with this:

library(archive)
library(readr)
## file = 3 selects the archive member by position; a member path also works
read_csv(archive_read("tmp.tar.gz", file = 3), col_types = cols())

and that archive::archive_extract("tmp.tar.gz", files="wp2011-survey/anon-data.csv") is quite a bit faster than base R's built-in untar (especially for large archives). The archive package supports the 'tar', 'ZIP', '7-zip', 'RAR', 'CAB', 'gzip', 'bzip2', 'compress', 'lzma' and 'xz' formats.
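
For anyone who wants to try the `files` argument of untar without downloading anything, here is a minimal offline sketch. It builds a small throwaway archive with base R's tar(); the `survey` directory, its file names, and the `extracted` output folder are made up for illustration and are not part of the WordPress survey data. It then lists the members and extracts only the CSV into a separate directory via `exdir`, so the working directory is left alone.

```r
# Build a small stand-in archive so the example runs offline.
# All directory and file names below are hypothetical.
work <- tempfile("untar-demo-")
dir.create(file.path(work, "survey"), recursive = TRUE)
write.csv(data.frame(id = 1:3, score = c(10, 20, 30)),
          file.path(work, "survey", "data.csv"), row.names = FALSE)
writeLines("notes", file.path(work, "survey", "README.txt"))

archive_path <- file.path(work, "demo.tar.gz")
old_wd <- setwd(work)  # tar() stores paths relative to the working directory
tar(archive_path, files = "survey", compression = "gzip", tar = "internal")
setwd(old_wd)

# List the members, then extract only the CSV into its own directory.
members <- untar(archive_path, list = TRUE)
csv_member <- grep("\\.csv$", members, value = TRUE)
out_dir <- file.path(work, "extracted")
untar(archive_path, files = csv_member, exdir = out_dir)

# The member keeps its path inside the archive, so read it from under out_dir.
X <- read.csv(file.path(out_dir, csv_member))
```

The value passed to `files` has to match a member name exactly as reported by `untar(..., list = TRUE)`, which is why the sketch looks the name up rather than typing it by hand.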

edited Jul 11 '22 at 15:36

answered Aug 22 '11 at 17:42

Ben Bolker


Is it also possible to untar only a specific file inside a tarball? I think the `files` argument in `untar` does this, but I am unsure how. Help appreciated. – Ashwin Dec 08 '14 at 10:27
