3

I am writing a simple tool that allows me to quickly check MD5 hash values of downloaded ISO files. Here is my algorithm:

import sys
import hashlib

def main():
    filename = sys.argv[1] # Takes the ISO 'file' as an argument in the command line
    testFile = open(filename, "r") # Opens and reads the ISO 'file'

    # Use hashlib here to find MD5 hash of the ISO 'file'. This is where I'm having problems
    hashedMd5 = hashlib.md5(testFile).hexdigest()

    realMd5 = input("Enter the valid MD5 hash: ") # Prompt the user for the valid MD5 hash

    if (realMd5 == hashedMd5): # Check if valid
        print("GOOD!")
    else:
        print("BAD!!")

main()

My problem is on the 9th line, where I try to take the MD5 hash of the file. I'm getting "TypeError: object supporting the buffer API required". Could anyone shed some light on how to make this work?

BЈовић
  • Various approaches discussed here: http://stackoverflow.com/questions/1131220/get-md5-hash-of-a-files-without-open-it-in-python – Zach Kelling Jul 18 '11 at 01:31
  • Thank you! I saw that post and read through it, but still didn't fully understand –  Jul 18 '11 at 01:45

2 Answers

8

hashlib.md5 doesn't accept a file object; it expects bytes. Open the file in binary mode, feed the hash object the data a piece at a time with update(), and then request the hex digest.

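import sys
import hashlib

filename = sys.argv[1]  # path to the file, as in the question
md5 = hashlib.md5()

with open(filename, "rb") as test_file:  # binary mode: hash the raw bytes
    while True:
        piece = test_file.read(1024)
        if not piece:  # an empty read means we're at end of file
            break
        md5.update(piece)

hex_hash = md5.hexdigest()
print(hex_hash)  # will produce what you're looking for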
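
Feeding the data a piece at a time gives the same digest as hashing everything in one call, and this can be checked on a small in-memory example. The sketch below uses made-up sample bytes (the `data` value is purely illustrative, not an ISO file) and the same 1024-byte piece size as the loop above:

```python
import hashlib

# Hypothetical sample data standing in for the contents of a file.
data = b"hello world" * 1000

# Hash all of the bytes in a single call.
one_shot = hashlib.md5(data).hexdigest()

# Hash the same bytes in 1024-byte pieces via update().
incremental = hashlib.md5()
for start in range(0, len(data), 1024):
    incremental.update(data[start:start + 1024])

print(one_shot == incremental.hexdigest())  # True
```

This is why the chunk size only affects memory use and speed, not the resulting hash.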
Jeremy
3

You need to read the file's contents and pass them to the hash object:

import sys
import hashlib

def main():
    filename = sys.argv[1]  # Takes the ISO file as a command-line argument

    # Feed the file to the MD5 object in 4 MiB chunks
    m = hashlib.md5()
    with open(filename, "rb") as testFile:  # Binary mode; closed automatically
        while True:
            data = testFile.read(4 * 1024 * 1024)
            if not data:  # An empty read means end of file
                break
            m.update(data)
    hashedMd5 = m.hexdigest()

    realMd5 = input("Enter the valid MD5 hash: ")  # Prompt the user for the valid MD5 hash

    if realMd5 == hashedMd5:  # Check if valid
        print("GOOD!")
    else:
        print("BAD!!")

main()

Open the file in binary mode ("rb") so that hashlib receives raw bytes, and read the data in chunks rather than all at once: an ISO file is likely too large to fit in memory.
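
If the check is needed in more than one place, the chunked read could be wrapped in a small helper. The following is only a sketch: the names `md5_of_file` and `matches` and the 1 MiB default chunk size are assumptions for illustration, not part of hashlib. It also strips whitespace and lowercases the pasted hash before comparing, on the assumption that published checksums may be copied with stray spaces or in upper case.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks.

    `md5_of_file` and `chunk_size` are hypothetical names for this sketch.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read(chunk_size)
        # until it returns b"", which signals end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()

def matches(path, expected_hex):
    """Compare a file's MD5 with a user-supplied hex string.

    Surrounding whitespace and letter case in `expected_hex` are ignored.
    """
    return md5_of_file(path) == expected_hex.strip().lower()
```

With these in place, the body of main() could reduce to something like `print("GOOD!" if matches(sys.argv[1], input("Enter the valid MD5 hash: ")) else "BAD!!")`.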

hughdbrown