2

I don't know much about cryptography, but have, of course, come across the magic md5() black box on many occasions. Now, after some searching, I have been able to find many questions along the lines of "Why can't md5 be reversed?" with answers along the lines of "Because the output is non-unique." Given that the password (or whatever value) is evaluated against the hash (not the original), knowing the original isn't really even important, all you would really need to gain access to the user's account is any input that will give that hash. So, my question:

Why can't md5 be reverse engineered to generate an input (any input, not necessarily the original) from an output (the actual function is, after all, publicly available)?

Example: If I have a hash function that takes a string, assigns each character a number value, and then adds the values, I could easily reverse the function by taking the output, factoring it, and reorganizing the factors into a string (it won't be the original string, but it will give the same hash). So why can't md5 be run backwards?

djh101
  • 353
  • 3
  • 11
  • This question appears to be off-topic because it is not a programming question. – President James K. Polk Sep 10 '14 at 13:27
  • Me not being able to reverse a hash function would hardly explain why it isn't possible. And I don't see how hash functions aren't related to programming. – djh101 Sep 10 '14 at 22:32
  • 1
    But it *is* possible. It's possible, but's it's hard, very hard. This property you are looking to break is called pre-image resistance. So your next question might be "why is it hard?". And the answer is simply that Ron Rivest is a smart guy. – President James K. Polk Sep 10 '14 at 22:42
  • So something like mapping characters to primes and multiplying would be computationally unfeasible to reverse since it would require prime factorization? You're right, I think I did answer my own question. Thank you for the inspiration. – djh101 Sep 10 '14 at 23:00

2 Answers2

6

It is a one way formula. Running MD5 or SHA1 on a particular string gives a hash that is always the same. It isn't possible to reverse the function to get back to the original string.

For example:

15 Mod 4 = 3

Even if you know the formula is

x Mod 4

you can't deduce x as it could be 3, 7, 11, 15 etc...

Obviously MD5 and SHA1 are a lot more complex!

In the above example, imputing 15 will always give you the answer of 3, but nobody would be able to deduce the original number. This does lead nicely on to collisions where multiple input strings could give the same hash:

http://en.wikipedia.org/wiki/MD5#Collision_vulnerabilities

Wikipedia has information on the particular algorithm used:

http://en.wikipedia.org/wiki/MD5#Algorithm

Okkaaj
  • 115
  • 8
  • If the hash is being used for password protection, the original input is irrelevant. For your example, given that the hash maps to 3, any of those possible inputs would be perfectly suitable for generating the same hash. To access any account with a known hash, you could generate a working password using x = 4 + y. – djh101 Sep 10 '14 at 22:41
1

The main thing is that multiple inputs can achieve the same output. Also, because the operations are chained together and each stage of the operation has this same behavior, each time you try to reverse "backward", there are multiple possible answers - many of them. That is the basic characteristic that makes it "one way".

Brad Peabody
  • 10,917
  • 9
  • 44
  • 63
  • There are bijective one-way functions and there are lossy functions where finding a preimage is trivial. – CodesInChaos Sep 10 '14 at 10:24
  • Valid points, although not applicable specifically to md5. – Brad Peabody Sep 11 '14 at 00:30
  • If anyone knows of a bijective one-way function, I'd be really curious to see how it works. – Brad Peabody Jul 11 '21 at 20:12
  • 1
    Bijective one-way functions are known as *one-way permutations*. A trapdoor permutation like RSA is bijective, and if you get rid of the private key it becomes one-way. And exponentiation in a finite field or multiplication on an elliptic curve are at least injective and can probably be turned into a bijection with a bit of work. – CodesInChaos Jul 11 '21 at 20:48