There is the bup backup program (https://github.com/bup/bup), which is based on some ideas and functions from the git version control system and is designed for compact storage of virtual machine images.
bup has a bup ls subcommand, which can show SHA-1-like hashes (the same number of hex digits) for objects stored inside the backup when the -s option is passed (man bup-ls says only "-s, --hash : show hash for each file/directory."). But this SHA-1-like hash is not equal to the sha1sum output for the original file.
Original git computes the SHA-1 hash of data by prefixing it with the string "blob NNN\0", where NNN is the size of the object in bytes written in decimal, according to "How does git compute file hashes?" and https://stackoverflow.com/a/28881708/. I tested the "blob NNN\0" prefix and still did not get the same SHA-1 sum.
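For reference, the two candidate hashes I compared against can be reproduced with a minimal Python sketch like the one below. The helper names plain_sha1 and git_blob_sha1 are my own (they are not part of bup or git), and the sketch only covers the plain sha1sum and the git blob scheme described above, not whatever bup actually does.

```python
import hashlib


def plain_sha1(data: bytes) -> str:
    """SHA-1 of the raw bytes, i.e. what sha1sum prints for a file."""
    return hashlib.sha1(data).hexdigest()


def git_blob_sha1(data: bytes) -> str:
    """SHA-1 of the data prefixed with git's blob header.

    The header is the ASCII string "blob ", the size in bytes written
    in decimal, and a single NUL byte.
    """
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    # Hypothetical sample content; replace with the bytes of a real file.
    sample = b"hello\n"
    print("sha1sum :", plain_sha1(sample))
    print("git blob:", git_blob_sha1(sample))
```

For the files I tried, neither of these two values matches the hash printed by bup ls -s.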
What method does bup use to compute the hash sum for files? Is it a linear SHA-1 or some tree-like variant such as Merkle trees? What is the hash of a directory?
The source of bup's ls command is https://github.com/bup/bup/blob/master/lib/bup/ls.py, and the hash is just printed in hex, but where is the hash generated?
    def node_info(n, name,
        ''' ....
        if show_hash:
            result += "%s " % n.hash.encode('hex')
Is that hash generated when the bup backup is created (when the file is placed into the backup by the bup index + bup save commands) and just printed out by bup ls, or is it recomputed on every bup ls, so that it can be used as an integrity test of the bup backup?