
Here is my script:

tar cf - testdir | pv -s $(du -sb testdir | awk '{print $1}') | pigz -1 > pv.tar.gz

tar cf - testdir | pigz -1 > nopv.tar.gz

diff pv.tar.gz nopv.tar.gz

The output is "Binary files pv.tar.gz and nopv.tar.gz differ".

I ran hexdump on both files and found that only the first line differs slightly:

pv.tar.gz: 8b1f 0008 9e24 5fc8 0304 bdec 5f7b c71b

nopv.tar.gz: 8b1f 0008 9c18 5fc8 0304 bdec 5f7b c71b

But after I extracted both archives and compared them again, the testdir contents are exactly the same.

How can I make the two tar.gz files identical?

CJD
  • Your hexdump looks wrong. The first 4 bytes should be `8b 1f 08 00`, not `...00 08`. Can you confirm that you mistyped them, or is something else wrong? – seumasmac Dec 03 '20 at 10:36

1 Answer


This has nothing to do with pv. Bytes 5 to 8 of a gzip header hold a timestamp (the MTIME field), so they differ whenever the two commands run at different times. You can tell pigz not to store it with the -m switch, so your commands become:

tar cf - testdir | pv -s $(du -sb testdir | awk '{print $1}') | pigz -1 -m > pv.tar.gz

tar cf - testdir | pigz -1 -m > nopv.tar.gz

which should give you byte-identical files. If you hexdump them, you'll see that the bytes that previously differed are all 00 now.
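If you want to look at just the timestamp bytes rather than a full hexdump, here is a minimal sketch. The function name `gzip_mtime_hex` is made up for illustration and is not part of pigz or gzip; it only assumes a standard `od` and `tr`. It prints bytes 5 to 8 in file order, which avoids the confusion from plain `hexdump`, whose default output groups bytes into 16-bit words (on a little-endian machine that is why the dump in the question reads `8b1f 0008` rather than `1f 8b 08 00`).

```shell
# Hypothetical helper (not part of pigz or gzip): print the 4-byte MTIME
# field of a gzip file, i.e. bytes 5 to 8, as plain hex in file order.
gzip_mtime_hex() {
    # -An: no offset column, -tx1: one byte per hex pair,
    # -j4: skip the first 4 bytes, -N4: read only 4 bytes
    od -An -tx1 -j4 -N4 "$1" | tr -d ' \n'
}
```

Running `gzip_mtime_hex pv.tar.gz` and `gzip_mtime_hex nopv.tar.gz` on archives made with -m should print `00000000` for both. To confirm the whole files match, `cmp pv.tar.gz nopv.tar.gz` prints nothing and exits with status 0 when they are identical.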

seumasmac