I am trying to compute the sha256 sum of a gzipped file in Go, but my output does not match that of the gzip command.
I have a function Compress that gzips the contents of an io.Reader, a file in my case.
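// Compress gzips everything read from r into an in-memory buffer
// and returns that buffer as a reader.
func Compress(r io.Reader) (io.Reader, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := io.Copy(zw, r); err != nil {
		return nil, err
	}
	// Close flushes the remaining data and writes the gzip trailer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}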
Then I have a function Sum256 that computes the sha256 sum of a reader.
func Sum256(r io.Reader) ([]byte, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return nil, err
	}
	return h.Sum(nil), nil
}
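As a sanity check on the compression step itself, the following is a minimal sketch that gzips some bytes in memory, gunzips them again, and compares the sha256 sums before and after. The helper name roundTripSums is hypothetical and not part of my program; it only assumes the standard compress/gzip and crypto/sha256 packages.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"io"
	"log"
)

// roundTripSums is a hypothetical helper: it gzips data in memory, gunzips
// the result, and returns the SHA-256 of the original bytes and of the
// decompressed bytes so the two can be compared.
func roundTripSums(data []byte) (before, after [sha256.Size]byte, err error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err = zw.Write(data); err != nil {
		return before, after, err
	}
	// Close flushes pending data and writes the gzip trailer.
	if err = zw.Close(); err != nil {
		return before, after, err
	}

	zr, err := gzip.NewReader(&buf)
	if err != nil {
		return before, after, err
	}
	defer zr.Close()

	plain, err := io.ReadAll(zr)
	if err != nil {
		return before, after, err
	}
	return sha256.Sum256(data), sha256.Sum256(plain), nil
}

func main() {
	before, after, err := roundTripSums([]byte("hello"))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%x original\n", before)
	fmt.Printf("%x after gzip round trip\n", after)
}
```

If the two sums agree, the compressed stream decodes back to the original data, so any mismatch with the gzip command would come from the compressed bytes themselves rather than from lost or corrupted input.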
My main function opens a file, gzips it, then computes the sha256 sum of the zipped contents. The problem is that the output does not match that of the gzip command. The input file hello.txt contains a single line with the word hello with no newline at the end.
func main() {
	uncompressed, err := os.Open("hello.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer uncompressed.Close()

	// Hash the raw file contents.
	sum, err := Sum256(uncompressed)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%x %s\n", sum, uncompressed.Name())

	// Rewind so the file can be read again for compression.
	if _, err := uncompressed.Seek(0, io.SeekStart); err != nil {
		log.Fatal(err)
	}

	// Gzip the file, then hash the compressed bytes.
	compressed, err := Compress(uncompressed)
	if err != nil {
		log.Fatal(err)
	}
	sum, err = Sum256(compressed)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%x %s.gz\n", sum, uncompressed.Name())
}
gzip results:
$ sha256sum hello.txt
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 hello.txt
$ gzip -c hello.txt | sha256sum
809d7f11e97291d06189e82ca09a1a0a4a66a3c85a24ac7ff389ae6fbe02bcce -
$ gzip -nc hello.txt | sha256sum
f901eda57fd86d4239806fd4b76f64036c1c20711267a7bc776ab2aa45069b2a -
My program results:
$ go run main.go
# match
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 hello.txt
# mismatch
3429ae8bc6346f1e4fb67b7d788f85f4637e726a725cf4b66c521903d0ab3b07 hello.txt.gz
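To see where the two outputs diverge, it may help to look at the raw bytes instead of only the hashes. The sketch below compresses the same input with compress/gzip and prints the ten fixed header bytes that RFC 1952 defines (magic, compression method, flags, modification time, extra flags, OS), followed by the whole stream in hex. The names gzipHeader, parseGzipHeader, and compressBytes are hypothetical helpers introduced only for this illustration.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/binary"
	"fmt"
	"log"
)

// gzipHeader holds the ten fixed header bytes described in RFC 1952.
type gzipHeader struct {
	Magic  [2]byte // ID1, ID2
	Method byte    // CM, 8 means deflate
	Flags  byte    // FLG
	MTime  uint32  // MTIME, little-endian Unix time
	XFL    byte    // extra flags
	OS     byte    // operating system identifier
}

// parseGzipHeader decodes the fixed-size part of a gzip header.
func parseGzipHeader(b []byte) (gzipHeader, error) {
	if len(b) < 10 {
		return gzipHeader{}, fmt.Errorf("need at least 10 bytes, got %d", len(b))
	}
	return gzipHeader{
		Magic:  [2]byte{b[0], b[1]},
		Method: b[2],
		Flags:  b[3],
		MTime:  binary.LittleEndian.Uint32(b[4:8]),
		XFL:    b[8],
		OS:     b[9],
	}, nil
}

// compressBytes gzips data in memory with the default writer settings.
func compressBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	out, err := compressBytes([]byte("hello"))
	if err != nil {
		log.Fatal(err)
	}
	h, err := parseGzipHeader(out)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("magic=%x method=%d flags=%#02x mtime=%d xfl=%d os=%d\n",
		h.Magic, h.Method, h.Flags, h.MTime, h.XFL, h.OS)
	fmt.Printf("full stream: %x\n", out)
}
```

The printed bytes can be compared by eye with a hex dump of the command-line output, for example gzip -nc hello.txt | xxd. The gzip format does not prescribe a single canonical encoding, so a difference in any header field or in the deflate payload is enough to change the sha256 sum.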
Any idea why the outputs don't match, or how to fix this? I have tried using an io.Pipe, a temporary file from ioutil.TempFile, and other methods, all with the same result.