10

I want to serialize/deserialize md5 context. But I don't know how to do it in Python. Pseudocode of what I want to do.

import md5
# Start hash generation
m = md5.new()
m.update("Content")

# Serialize m
serialized_m = serialize(m)

# In another function/machine, deserialize m
# and continue hash generation
m2 = deserialize(serialized_m)
m2.update("More content")
m2.digest()    

There are C++ libraries for this. Is there one for Python? Why doesn't the md5 library support it? Are there security concerns? Thanks.

Edited: I want to do this because for example, an HTTP server wants to accept streaming data in different HTTP requests. It would be convenient to serialize md5 context somehow between requests.

Yey
  • 554
  • 3
  • 15
  • 3
    http://stackoverflow.com/questions/5865824/hash-algorithm-for-dynamic-growing-streaming-data – Kevin Sep 25 '12 at 23:34
  • Thanks. The pypy library says not use it cuz it's not tested =( why doesn't the official python md5 implement this tho? – Yey Sep 25 '12 at 23:44

2 Answers2

1

HASH objects are not serializable: How to serialize hash objects in Python

Assuming you can pass around the unhashed data:

from Crypto.Hash import MD5

# generate hash
m = MD5.new()
s = "foo"
m.update(s)

# serialize m
serialized = s

# deserialize and continue hash generation
m2 = MD5.new(serialized)
if m2.hexdigest() == m.hexdigest():
    print "success"
m2.update("bar")
Community
  • 1
  • 1
Ryan Shea
  • 4,252
  • 4
  • 32
  • 32
1

I asked Mr Guido V Rossum. He replied that "I don't think there's a way. However it might make a decent feature request. You could submit one to bugs.python.org." So I did.

http://bugs.python.org/issue16059

Yey
  • 554
  • 3
  • 15