
I have a sequence of bytes that I want to output in a human-friendly form, using only characters that are available and printable in any font or encoding. In my case, the bytes are an MD5 digest:

import hashlib
h = hashlib.md5("foo")

The hash object offers two ways to display its contents:

print h.digest() # Uses a bunch of unprintable characters
print h.hexdigest() # Readable, but 32 characters long

The second option gives me a well-behaved string of characters that I can read, cut and paste, and so on. But it's an inefficient representation: hexadecimal uses only 16 distinct characters, so each character carries just 4 bits. The string could be shorter if it drew on the whole alphabet, uppercase letters, punctuation, etc. Can I get a shorter, denser digest by expanding beyond hex?
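
To put rough numbers on "denser", here is a small Python 3 sketch (not part of the original post) that counts how many characters a 128-bit digest needs for a given alphabet size. The helper name `chars_needed` is hypothetical, and it assumes every character of the alphabet is usable.

```python
DIGEST_BITS = 128  # an MD5 digest is 16 bytes = 128 bits


def chars_needed(alphabet_size, bits=DIGEST_BITS):
    """Return the smallest n with alphabet_size ** n >= 2 ** bits."""
    n = 0
    capacity = 1  # number of distinct strings of length n
    while capacity < 2 ** bits:
        capacity *= alphabet_size
        n += 1
    return n


# 16 = hex, 64 = base64, 85 = base85, 94 = printable ASCII without space
for size in (16, 64, 85, 94):
    print(size, chars_needed(size))
```

This prints 32 for hex, 22 for a 64-character alphabet, and 20 for both 85 and 94 characters, so the gain from going past roughly 64 symbols is small.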

kuzzooroo
  • possible duplicate of [convert integer to a string in a given numeric base in python](http://stackoverflow.com/questions/2267362/convert-integer-to-a-string-in-a-given-numeric-base-in-python) – vaultah Jan 07 '15 at 18:41
  • Check out `base64.b64encode`: `base64.b64encode(h.digest())` -> 'rL0Y20zC+Fzt72VPzMSk2A==' – phobic Jan 07 '15 at 18:49
  • An Aesthetic Comparison of Human-Readable Hashing Functions: https://gist.github.com/raineorshine/8d67049c0aaaa082614e417660462fda – Raine Revere Feb 14 '17 at 05:31

1 Answer


Here's a modified version of one of the answers to the question that @vaultah linked to:

import hashlib, string, base64

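# 94 printable ASCII characters: letters, digits, and punctuation
_INT_EFFICIENT_CHARS = string.ascii_letters + string.digits + string.punctuation
_L_INT_EFFICIENT_CHARS = len(_INT_EFFICIENT_CHARS)
# http://stackoverflow.com/questions/2267362/convert-integer-to-a-string-in-a-given-numeric-base-in-python
def int_efficient(x):
    # Write the non-negative integer x in base 94, most significant digit first
    if x == 0:
        return _INT_EFFICIENT_CHARS[0]
    rets = ''
    while x > 0:
        x, idx = divmod(x, _L_INT_EFFICIENT_CHARS)
        rets = _INT_EFFICIENT_CHARS[idx] + rets
    return rets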

h = hashlib.md5("foo")
print h.hexdigest()

# Starting in Python 3.2, use int.from_bytes rather than converting to hex
# http://stackoverflow.com/a/9634417/2829764
i = int(h.hexdigest(), 16)
print int_efficient(i)

print base64.b64encode(h.digest())
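
For Python 3, the same idea might look like the sketch below. It follows the comment above in using `int.from_bytes` instead of going through hex. The names `ALPHABET` and `encode_bytes` are hypothetical and not part of the original code.

```python
import hashlib
import string

# Hypothetical name: the same 94 printable ASCII characters as above.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def encode_bytes(data, alphabet=ALPHABET):
    """Treat data as a big-endian integer and write it in base len(alphabet)."""
    x = int.from_bytes(data, "big")
    base = len(alphabet)
    if x == 0:
        return alphabet[0]
    chars = []
    while x > 0:
        x, idx = divmod(x, base)
        chars.append(alphabet[idx])  # least significant digit first
    return "".join(reversed(chars))


digest = hashlib.md5(b"foo").digest()  # Python 3 requires bytes here
encoded = encode_bytes(digest)
print(encoded)
```

Note that leading zero bytes do not survive the integer conversion, so this is fine for display but would need the original length to be decoded back into bytes reliably.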

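My 94-character alphabet only shortens the result by a few characters relative to base64: 20 characters versus 24 (22 without the `=` padding). The three outputs below are the hex digest, the 94-character encoding, and base64: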

acbd18db4cc2f85cedef654fccc4a4d8
Hpf=RjPL{_{4Q-[X$vdO
rL0Y20zC+Fzt72VPzMSk2A==
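
If a hand-rolled alphabet is not required, the standard library may already get close. The sketch below assumes Python 3.4 or later, where `base64.b85encode` is available, and also shows base64 with the padding stripped. It is one possible alternative, not what the code above does.

```python
import base64
import hashlib

digest = hashlib.md5(b"foo").digest()  # 16 raw bytes

# URL-safe base64 with the trailing "=" padding removed
b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

# base85 packs every 4 bytes into 5 characters
b85 = base64.b85encode(digest).decode("ascii")

print(len(b64), b64)
print(len(b85), b85)
```

For a 16-byte digest this gives 22 and 20 characters respectively, and both forms can be decoded back to the original bytes (after re-adding the `==` padding in the base64 case).
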
kuzzooroo