It's odd that no-one else has mentioned that modern versions of GNU tar
allow you to compress as you are bundling:
tar -czf output.tar.gz directory1 ...
tar -cjf output.tar.bz2 directory2 ...
You can also use the compressor of your choosing provided it supports the '-c' (to stdout, or from stdin) and '-d' (decompress) options:
tar -cf output.tar.xxx --use-compress-program=xxx directory1 ...
This would allow you to specify any alternative compressor.
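As a rough sketch of what that option does, the following compares it with the equivalent hand-built pipeline. It assumes GNU tar and gzip are installed; gzip merely stands in for 'xxx', and the directory and archive names (demo_dir, via_option.tar.gz, via_pipe.tar.gz) are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar and gzip; every name below is hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir demo_dir
echo "hello" > demo_dir/file.txt

# Let tar run the compressor itself (gzip stands in for 'xxx').
tar -cf via_option.tar.gz --use-compress-program=gzip demo_dir

# The same thing by hand: tar writes the archive to stdout,
# and the compressor's -c option sends compressed data to stdout.
tar -cf - demo_dir | gzip -c > via_pipe.tar.gz

# Both archives should list the same members.
tar -tzf via_option.tar.gz
tar -tzf via_pipe.tar.gz
```

Any compressor that honours '-c' and '-d' in the same way should be usable in place of gzip here.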
[Added: If you are extracting from gzip or bzip2 compressed files, GNU tar auto-detects these and runs the appropriate program. That is, you can use:
tar -xf output.tar.gz
tar -xf output.tgz # A synonym for the .tar.gz extension
tar -xf output.tar.bz2
and these will be handled properly. If you use a non-standard compressor, then you need to specify that when you do the extraction.]
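For the non-standard case, a minimal sketch of a round trip might look like the following. It assumes GNU tar; gzip is used only as a stand-in for a compressor that tar would not recognise by itself, and the names demo_dir and demo.tar.xxx are invented for the example.

```shell
#!/bin/sh
# Sketch only: gzip stands in for an unusual compressor; names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir demo_dir
echo "hello" > demo_dir/file.txt

# Create the archive with an explicitly named compressor.
tar -cf demo.tar.xxx --use-compress-program=gzip demo_dir
rm -r demo_dir

# Name the same program again on extraction; tar runs it with -d to decompress.
tar -xf demo.tar.xxx --use-compress-program=gzip
cat demo_dir/file.txt
```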
The reason for the separation is, as in the selected answer, the separation of duties. Amongst other things, it means that people could use the 'cpio' program for packaging the files (instead of tar) and then use the compressor of choice. Once upon a time, the preferred compressor was pack; later it was compress (which was much more effective than pack); and then gzip, which ran rings around both its predecessors and is entirely competitive with zip (which has been ported to Unix, but is not native there); and now bzip2 which, in my experience, usually has a 10-20% advantage over gzip.
[Added: someone noted in their answer that cpio has funny conventions. That's true, but until GNU tar got the relevant options ('-T -'), cpio was the better command when you did not want to archive everything that was underneath a given directory -- you could actually choose exactly which files were archived. The downside of cpio was that you not only could choose the files -- you had to choose them. There's still one place where cpio scores; it can do an in-situ copy from one directory hierarchy to another without any intermediate storage:
cd /old/location; find . -depth -print | cpio -pvdumB /new/place
Incidentally, the '-depth' option on find is important in this context: it lists the contents of each directory before the directory itself, so cpio copies the contents before setting the permissions on the directories themselves. When I checked the command before entering the addition to this answer, I copied some read-only directories (555 permission); when I went to delete the copy, I had to relax the permissions on the directories before 'rm -fr /new/place' could finish. Without the -depth option, the cpio command would have failed. I only remembered this when I went to do the cleanup; the formula quoted is that automatic to me (mainly by virtue of many repetitions over many years).
]
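To illustrate the '-T -' option mentioned above, here is a small sketch of choosing exactly which files go into a tar archive by feeding it a list of names on standard input. It assumes GNU tar; the directory layout and file names (src, keep.txt, also.txt, skip.log, selected.tar) are invented for the example.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar; the tree and file names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p src/sub
echo a > src/keep.txt
echo b > src/sub/also.txt
echo c > src/skip.log

# find prints only the chosen files; tar reads the member names from stdin via -T -
find src -type f -name '*.txt' -print | tar -cf selected.tar -T -

# Only the .txt files should be listed.
tar -tf selected.tar
```

File names containing newlines would confuse this simple form; GNU tar's --null option combined with find's -print0 is the usual way round that.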