7

How to change these gzip methods to xz?

This seems to work, but xz runs far slower than gzip (~20-30x slower):

# gzip
... | gzip -c -1 > /path

# xz
... | xz -zf > /path

I haven't tested this yet, but is this the right way to create an xz-compressed tar archive?

# gzip
tar -zcf /path /path

# xz
tar -Jcf /path /path
clarkk

2 Answers

5

I had the same problem and ran some tests on large (~2GB) text-only log files.

It turned out that xz -0 (compression level 0) is faster than gzip -9 and produces much smaller files.

time gzip -vc -9 test.log.1 > logs/test.log.gz9
real    2m34.079s
user    1m54.365s
sys     0m4.385s


time xz -vc -0 test.log.1 > logs/test.log.xz0
  100 %      53.1 MiB / 1,273.3 MiB = 0.042    11 MiB/s       1:53
real    1m53.779s
user    1m25.295s
sys     0m4.270s

time xz -vc -6 test.log.1 > logs/test.log.xz6
test.log.1 (1/1)
  100 %      53.9 MiB / 1,273.3 MiB = 0.042   798 KiB/s      27:13
real    27m13.968s
user    26m57.925s
sys     0m5.800s

-rw-r--r--  1 root   root    95M Sep  9 15:30 test.log.gz9
-rw-r--r--  1 root   root    54M Sep  9 15:38 test.log.xz0
-rw-r--r--  1 root   root    54M Sep  9 16:11 test.log.xz6

These tests were run on an ARM mini computer with Ubuntu 14.04.

Note that there's almost no difference in resulting file size between xz -0 and xz -6 (the default).

I'd even vote to make xz -0 the default...
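
Applied to the pipeline form from the question, the same idea might look like the sketch below. The `seq` producer and the temporary output path are placeholders I made up for illustration; substitute your own command and target path.

```shell
#!/bin/sh
# Sketch: stream data through xz at level 0 instead of gzip -1.
# The producer (seq) and the temp file are hypothetical stand-ins.
out=$(mktemp)

# gzip original:  ... | gzip -c -1 > /path
# xz equivalent:  ... | xz -0 -c   > /path
seq 1 200000 | xz -0 -c > "$out"

# Round-trip check: decompress to stdout and count the lines again.
lines=$(xz -dc "$out" | wc -l | tr -d ' ')
echo "lines after round trip: $lines"

rm -f "$out"
```

The `-c` flag only makes writing to stdout explicit; the level (`-0`) is what changes the speed.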


On a regular machine, xz -0 was a bit slower than gzip (tar -z) but still produced a much smaller file (the input file was 4.2 GB):

time tar -I 'xz -0' -cvf out.txz test.log
real    1m46.718s
user    1m42.000s
sys     0m23.084s

time tar -zcvf out.tgz test.log
real    1m13.778s
user    1m9.800s
sys     0m11.544s

-rw-rw-r--  1 root root 231M Sep 13 09:23 out.tgz
-rw-rw-r--  1 root root 125M Sep 13 09:37 out.txz
-rw-rw-r--  1 root root 4.2G Sep 13 09:07 test.log

From this answer:

tar -I 'xz -0' -cvf out.txz test.log

may speed up your archiving but will definitely give you smaller files.
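
If your tar does not accept `-I` with arguments, a possible alternative is to pass the level through the `XZ_OPT` environment variable, which xz itself reads. The sketch below assumes GNU tar with xz installed; the working directory and sample file are throwaway placeholders.

```shell
#!/bin/sh
# Sketch, assuming GNU tar and xz; all paths are temporary placeholders.
work=$(mktemp -d)
seq 1 100000 > "$work/test.log"

# Same idea as: tar -I 'xz -0' -cvf out.txz test.log
# XZ_OPT is read by xz, so -J picks up the level from the environment.
XZ_OPT=-0 tar -C "$work" -Jcf "$work/out.txz" test.log

# List the archive contents to confirm it is readable.
listing=$(tar -Jtf "$work/out.txz")
echo "$listing"

rm -rf "$work"
```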

Martin Hennings
  • 1
    It depends on the sample. I tried a directory with lots of small files, total size ~1.8GB, resulting gz default 340M, xz default: 259M, xz -0: 324M, xz -6: 220M – Beeno Tung May 15 '19 at 03:57
  • @BeenoTung Do you have timings for these results? That would be interesting. Timings aside, your test again shows that the lowest (least effort) xz compression level still produces smaller output than the default gz compression. – Martin Hennings May 15 '19 at 12:03
2

There's no such thing as a free lunch. If you want better compression, it will take more time (and memory). You can try lower compression levels for higher speed. The default is xz -6, so you can try levels 0 to 5 to see if there is a happy place for you in time vs. compression. (You don't need the -zf.)
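
One way to look for that happy place is to time each level on a sample of your own data. The following is only a rough sketch: the generated sample file is a hypothetical stand-in for a real log, and `date +%s` gives whole-second resolution, so use a file large enough for the differences to show.

```shell
#!/bin/sh
# Sketch: compare xz levels 0 through 6 (the default) on one sample file.
sample=$(mktemp)
# Hypothetical test data; replace with a representative file of your own.
seq 1 300000 > "$sample"
orig_size=$(wc -c < "$sample" | tr -d ' ')
echo "original: $orig_size bytes"

tested=0
for level in 0 1 2 3 4 5 6; do
    start=$(date +%s)
    # Compress to stdout and count the bytes instead of writing a file.
    size=$(xz -"$level" -c "$sample" | wc -c | tr -d ' ')
    end=$(date +%s)
    echo "xz -$level: $size bytes, $((end - start)) s"
    tested=$((tested + 1))
done

rm -f "$sample"
```

Results vary with the input, so the level that wins on one data set may not win on another.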

Mark Adler