I am iterating through a folder of binary files and trying to compute each file's hash values, specifically SHA-1 and SHA-256. On every run I get the same SHA-256 value for all files, while the SHA-1 values differ (so they look correct).

Below is a screenshot of an output file showing that the SHA-1 hashing was done correctly but the SHA-256 hashing was not. (Sorry, the filename of each binary file is also its SHA-1.)

Is there something wrong with my process? This is the relevant Python code. I must be missing something. Sorry.

out.write("FILENAME,SHA1,SHA256\n")
for root, dirs, files in os.walk(input_path):
    for ffile in files:
        myfile = os.path.join(root, ffile)
        nice = os.path.join(os.getcwd(), myfile)

        fo = open(nice, "rb")
        a = hashlib.sha1(fo.read())
        b = hashlib.sha256(fo.read())
        paylname = os.path.basename(myfile)
        mysha1 = str(a.hexdigest())
        mysha256 = str(b.hexdigest())
        fo.close()

        out.write("{0},{1},{2}\n".format(paylname, mysha1, mysha256))
jowabels
  • 2
    What do you think `fo.read()` will read when it has already read the file's contents? – deceze Aug 24 '17 at 10:25
  • 3
    when you do `fo.read()` for the sha1 hash, you have read the entire file, but you never move the cursor back to the start of the file. so the input to the sha256 hash is always the same (nothing) – James Kent Aug 24 '17 at 10:25
  • 1
    `hashlib.sha256(b'').hexdigest() == 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'` – Martijn Pieters Aug 24 '17 at 10:29
  • Aaah. How naive of me. Thanks for the catch! I'll accept your answer as soon as it's allowed – jowabels Aug 24 '17 at 10:29

1 Answer

As I noted in my comment above, you read the whole file for the first hash, which leaves the file position at the end, so the second `fo.read()` returns nothing. You need to seek back to the start of the file before reading it a second time for the second hash. Alternatively, you could read the contents into a variable once and pass that to each hash.

out.write("FILENAME,SHA1,SHA256\n")
for root, dirs, files in os.walk(input_path):
    for ffile in files:
        myfile = os.path.join(root, ffile)
        nice = os.path.join(os.getcwd(), myfile)

        fo = open(nice, "rb")
        a = hashlib.sha1(fo.read())
        fo.seek(0,0) # seek back to start of file
        b = hashlib.sha256(fo.read())
        paylname = os.path.basename(myfile)
        mysha1 = str(a.hexdigest())
        mysha256 = str(b.hexdigest())
        fo.close()

        out.write("{0},{1},{2}\n".format(paylname, mysha1, mysha256))
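
The alternative mentioned above (read once, hash twice) could look roughly like the sketch below. The helper name `sha1_and_sha256` is made up for illustration and is not part of the original code. It also assumes each file fits comfortably in memory.

```python
import hashlib


def sha1_and_sha256(path):
    """Return (sha1_hex, sha256_hex) for the file at `path`.

    Hypothetical helper: the file is read once and the same bytes
    are passed to both hash constructors, so no seek is needed.
    """
    with open(path, "rb") as fo:
        data = fo.read()  # single read; the file is closed when the block exits
    return hashlib.sha1(data).hexdigest(), hashlib.sha256(data).hexdigest()
```

Inside the loop this would replace the open/read/seek/close lines with something like `mysha1, mysha256 = sha1_and_sha256(nice)`. `hexdigest()` already returns a `str`, so the extra `str(...)` wrapper is not needed.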
James Kent