I need to hash a message into a string of 30 chars. What's the best and most secure hash function for this usage?
-
It helps if you state the requirements of the hash function... – tc. Aug 21 '10 at 08:21
-
2What tc says. If you just want it to produce a "unique ID" of data that's essentially random, or only created by trusted people, then the hash doesn't need to be collision-resistant. If you use it in a context where it "proves" that data with equal hash is equal, and the data could be maliciously generated collisions, then standard practice is to use SHA-256 or SHA-512, because MD5 is completely broken (generate collisions at will), and SHA-1 isn't very future-proof since attacks appear to be steadily approaching feasibility. – Steve Jessop Aug 21 '10 at 15:52
2 Answers
Thirty characters (bytes) is 240 bits.
If you can't move the goal-post to allow 32 characters, then you will probably end up using SHA-1, which generates 160-bits or 20 bytes. When Base-64 encoded, that will be 28 characters. If you use a hex-encoding, it will be 40 characters, which is nominally out of range. With 32 characters, you could use SHA-256, but Base-64 encoding would increase that size (to 44 characters) and hex-encoding increases the size to 64 characters.
If you must use hex-encoding and can go to 32 bytes, then MD5 - which generates 128 bits - could be used, though it is not recommended for any new systems. With Base-64 encoding, MD5 uses 24 characters. Otherwise, you are using very minimally secure algorithms - not recommended at all.

- 730,956
- 141
- 904
- 1,278
-
4Alternatively, use the first 30 characters of base64-SHA-256, which is probably less broken than SHA-1. You could also investigate http://en.wikipedia.org/wiki/Ascii85 which can fit 192 bits of entropy in 30 chars. – tc. Aug 21 '10 at 08:21
Just use SHA1 and trim to 30 characters.
import hashlib
hash = hashlib.sha1("your message").hexdigest()[:30]
It's been proven that cutting characters off a cryptographically secure hash function such as SHA1 has negligible effects on its security (can't find the reference now though)

- 27,197
- 9
- 43
- 57
-
SHA-1's collisions will probably be found soon. If you don't care about collision-resistance, MD5 is sufficient. If you do, then the truncated hash only has a 60 bits of security, which isn't very much. – tc. Aug 21 '10 at 08:14
-
Surely for good hashes it reduces the security in proportion of the exponent. For example, finding a collision on a 40-char SHA1 hexdigest is 2^80 by brute force, so finding a collision on the first 30 chars is "only" 2^60 by brute force. The best known collision attack on SHA1 is actually only something like 2^51, though, not 2^80. I have no idea whether that approach can be adapted to find a first-120-bit collision in less than 2^51. The reason for the discrepancy, though: SHA1 is not really a "cryptographically secure" hash function any more, at least where collisions matter. – Steve Jessop Aug 21 '10 at 14:38
-
-
2@Steve 2^60 is a HUGE number. That's more than the age of the universe in seconds. And yes, SHA1 is still considered a cryptographically secure hash function, because 2^51 steps is still unfeasible to brute-force. Also, finding a way to generate collisions for entire hash in less steps DOES NOT translate into the same for the first 120 bits. – quantumSoup Aug 21 '10 at 15:06
-
4I wish people who have no clue about cryptography would stop downvoting on answers they have little to no knowledge to judge. FFS – quantumSoup Aug 21 '10 at 15:07
-
@tc "SHA-1's collisions will probably be found soon." And what makes you think that? – quantumSoup Aug 21 '10 at 15:11
-
Note for example that NIST has now forbidden SHA-1 for signed hashes. It's still considered good for HMAC and PRNG. A 2^51 collision attack on the full hash doesn't *necessarily* find a collision on the first 120 bits in 2^39 (which is entirely feasible), but it might find one in less than 2^51. Or there might be relatively minor adaptations that mean it could. Not knowing the details of how SHA-1 attacks work, I don't know, but I do know not to mess around with crypto primitives that are perilously close to the edge of sound... – Steve Jessop Aug 21 '10 at 15:46
-
The collision is on 51-round SHA-1; this is different from 2^51. I use "bits of security" to mean "binary log of the number of operations required to attack it", which is only 2^60 for a 120-bit hash. 2^60 is not that huge; MD5CRK was a bruteforce attack on MD5, abandoned when cryptologists found collisions with a better attack (but the Wikipedia page says it'd take 3 weeks on a supercomputer). Apparently there's also a new preimage attack on MD5, but it still requires more than 2^120 work. – tc. Aug 23 '10 at 02:15