Does any published research indicate that preimage attacks on MD5 are imminent?

Question

I keep on reading on SO that MD5 is broken, bust, obsolete and never to be used. That angers me.

The fact is that collision attacks on MD5 are now fairly easy. Some people have collision attacks down to an art and can even use them to predict elections.

I find most of the examples of MD5 "brokenness" less interesting. Even the famous CA certificate hack was a collision attack, meaning that it's provable that the party generated the GOOD and EVIL certificates at the same time. This means that if the EVIL CA found its way into the wild, it is provable that it leaked from the person who had the good CA and thus was trusted anyway.

What would be a lot more concerning is a preimage or second preimage attack.

How likely is a preimage attack on MD5? Is there any current research to indicate that it is imminent? Does the fact that MD5 is vulnerable to collision attacks make it more likely to suffer a preimage attack?

-1 for "predict the future" style questions, which are by definition subjective and argumentative — Orion Edwards, May 04 '09 at 23:51
@Orion I'm not looking for "gut feel" here, I'm looking for something concrete that involves proper statistical analysis. I'm looking for published research. I'm looking for a logical argument. This is just as subjective and argumentative as anything? — Sam Saffron, May 04 '09 at 23:58
You are mistaken about the implications of the CA attack. The "GOOD" certificate is a legitimately obtained SSL site certificate that the attacker is entitled to, but the attacker is then able to affix the signature from that certificate to a "BAD" certificate of their own construction, which can refer to any other site. — caf, Jan 19 '12 at 05:19
To elaborate on @caf's comment, the problem occurs when the attacker prepares a "weak" GOOD certificate and a strong (eg arbitrary-cert-signing/delegated) BAD certificate. The CA issues the "weak" single-domain "GOOD" certificate, and the attacker swaps it out for the "powerful" arbitrary-domain-signing BAD certificate. Your statement "if the EVIL CA found its way into the wild, it is provable that it leaked from the person who had the good CA and thus was trusted anyway" is very misleading! They were trusted for a SINGLE DOMAIN THEY CONTROL, but they forged that trust for ALL DOMAINS. — Tao, Jan 13 '19 at 16:43

Accipitridae · Accepted Answer · 2009-11-15T16:23:32.173

11

In cryptography, recommendations are not generally made by predicting the future, as this is impossible to do. Rather, cryptographers try to evaluate what is already known and published. To adjust for potential future attacks, cryptosystems are generally designed so that there is some safety margin. E.g. cryptographic keys are generally chosen a little bit longer than absolutely necessary. For the same reason, algorithms are avoided once weaknesses are found, even if these weaknesses are just certificational.

In particular, RSA Labs recommended abandoning MD5 for signatures as early as 1996, after Dobbertin found collisions in the compression function. Collisions in the compression function do not imply that collisions in the hash function exist, but we can't find collisions for MD5 unless we can find collisions for its compression function. Thus RSA Labs decided that they no longer had confidence in MD5's collision resistance.
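To illustrate how a hash function is built from a compression function, here is a minimal sketch of the iterated (Merkle-Damgård style) structure that MD5 follows. The names `toy_compress`, `md_pad`, `toy_hash` and the all-zero `IV` are hypothetical and chosen only for illustration; `toy_compress` is a stand-in built from truncated SHA-256, not MD5's real compression function.

```python
import hashlib
import struct

BLOCK = 64        # block size in bytes (same as MD5)
IV = bytes(16)    # hypothetical all-zero initial state, NOT MD5's real IV


def toy_compress(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function: 16-byte state + 64-byte block -> 16-byte state."""
    return hashlib.sha256(state + block).digest()[:16]


def md_pad(msg: bytes) -> bytes:
    """Pad with 0x80, zero bytes, then the 64-bit message length in bits."""
    zero_count = (BLOCK - 9 - len(msg)) % BLOCK
    return msg + b"\x80" + bytes(zero_count) + struct.pack("<Q", 8 * len(msg))


def toy_hash(msg: bytes) -> bytes:
    """Feed the padded message through the compression function block by block."""
    state = IV
    padded = md_pad(msg)
    for offset in range(0, len(padded), BLOCK):
        state = toy_compress(state, padded[offset:offset + BLOCK])
    return state


if __name__ == "__main__":
    print(toy_hash(b"hello").hex())
```

The digest is simply the last output of the compression function, so the strength of the whole construction rests on that one building block. That is why a weakness found in the compression function alone was already enough to shake confidence in the full hash.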

Today, we are in a similar situation. If we are confident that a hash function is collision resistant then we can also be confident that the hash function is preimage resistant. But MD5 has significant weaknesses. Hence many cryptographers (including people like Arjen Lenstra) think that MD5 no longer has the necessary safety margin to be used even in applications that only rely on preimage resistance, and hence recommend no longer using it. Cryptographers can't predict the future (so don't look for papers doing just that), but they can recommend reasonable precautions against potential attacks. Recommending not to use MD5 anymore is one such reasonable precaution.

edited Nov 15 '09 at 16:23

answered May 08 '09 at 06:01

Accipitridae


Good points, I guess the bottom line is that if you want pre-image resistance don't use SHA1 (down to 2^52 for collision) or MD5 (of course, an important point is that not all applications need pre-image or collision resistance). – Sam Saffron May 08 '09 at 06:42
Since SHA-1 is a NIST standard, it will be interesting to see how NIST handles the new attack. Recommendations for key sizes are in SP 800-57. The March 2007 version grudgingly allows SHA-1 for signatures until 2010, but I can't find any time restriction for using HMAC SHA-1. Since NIST balances theoretical and practical concerns well, I'm really wondering what they will do. – Accipitridae May 08 '09 at 16:14
MD5 does not have a compression stage, you are confusing it with RC5 in your second paragraph. – Simeon Pilgrim Jun 22 '09 at 23:43
@Simeon: MD5 is a hash function that is based on the Davies-Meyer construction. Any such construction uses a compression function. The security of this compression function is very important for the security of the overall hash function. – Accipitridae Nov 15 '09 at 16:27
OP asked "how likely"; you said "we can't predict the future and we need safety margins". Have preimage attacks been found yet, or not? (This question is the first Google result for `arbitrary md5 preimage`) – JamesTheAwesomeDude Apr 21 '21 at 19:44

bdonlan · Answer 2 · 2009-05-05T00:14:31.027

3

We don't know.

This kind of advance tends to come 'all of a sudden' - someone makes a theoretical breakthrough, and finds a method that's 2^10 (or whatever) times better than the previous best.

It does seem that preimage attacks might still be a bit far off; a recent paper claims a complexity of 2^96 for a preimage on a reduced, 44-round version of MD5. However, this isn't a question of likelihood but rather whether someone is clever enough to go that final step and bring the complexity for the real deal into a realistic margin.

That said, since collision attacks are very real already (one minute on a typical laptop), and preimage attacks might (or might not) be just around the corner, it's generally considered prudent to switch to something stronger now, before it's too late.
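To make the difference between the two attack types concrete, here is a toy sketch that brute-forces both against MD5 truncated to 16 bits. It uses only generic search, not any of the published MD5 attacks, and the helper names `trunc_md5`, `find_collision` and `find_preimage` are made up for this example. The collision search exploits the birthday effect and typically needs on the order of 2^(n/2) tries, while the preimage search needs on the order of 2^n.

```python
import hashlib
from itertools import count


def trunc_md5(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of the MD5 digest as an integer."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - bits)


def find_collision(bits: int):
    """Birthday search: any two distinct messages with the same truncated digest."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        digest = trunc_md5(msg, bits)
        if digest in seen:
            return seen[digest], msg, i + 1
        seen[digest] = msg


def find_preimage(target: int, bits: int):
    """Exhaustive search: a message hashing to one specific, given digest."""
    for i in count():
        msg = str(i).encode()
        if trunc_md5(msg, bits) == target:
            return msg, i + 1


if __name__ == "__main__":
    bits = 16
    _, _, collision_tries = find_collision(bits)
    _, preimage_tries = find_preimage(trunc_md5(b"some target", bits), bits)
    print("collision tries:", collision_tries)
    print("preimage tries: ", preimage_tries)
```

The exact counts depend on the messages tried, but the collision search should usually finish after a few hundred attempts and the preimage search after tens of thousands. At the full 128 bits the same gap is 2^64 versus 2^128, before any cryptanalytic shortcut is applied.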

If collisions aren't a problem for you, you might have time to wait for the NIST SHA-3 competition to come up with something new. But if you have the processing power and bits to spare, using SHA-256 or similar is probably a prudent precaution.

edited May 05 '09 at 00:14

answered May 05 '09 at 00:01

bdonlan


@bdonlan, its a trade off thing, depending on your application you may need protection from collision attacks or not. MD5 is fast and small which makes it useful for a variety of things that do not need collision attack safety. – Sam Saffron May 05 '09 at 00:05
@bdonlan, also is there any logical/theoretical proof that preimage attacks are more likely in the wake of the fact that there are collision attacks? – Sam Saffron May 05 '09 at 00:06
@sam: Yes and no. Yes, because collision attacks mean that more is known about structural weaknesses, and more knowledge of course doesn't hurt. No, because collision attacks generally exploit certain kinds of symmetries or interesting bit patterns that most real-life hashes don't have as an intermediate result. – me22 Jan 12 '11 at 03:48

score 2 · Answer 3 · answered Mar 13 '14 at 20:33

Cryptographically speaking, MD5's pre-image resistance is already broken; see this paper from Eurocrypt 2009. In this formal context, "broken" means faster than brute-force attacks, i.e. attacks having a complexity of less than (2^128)/2 on average. Sasaki and Aoki presented an attack with a complexity of 2^123.4, which is purely theoretical, but every practical attack is built on less potent theoretical attacks, so even a theoretical break casts serious doubt on MD5's medium-term security. What is also interesting is that they reuse a lot of research that has gone into collision attacks on MD5. That nicely illustrates Accipitridae's point that MD5's safety margin on pre-image resistance is gone with the collision attacks.
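To put these exponents into perspective, here is a small back-of-the-envelope calculation. The hash rate of 10^12 MD5 evaluations per second is an arbitrary assumption picked for illustration, not a measured figure, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def speedup_over_brute_force(attack_log2: float, brute_log2: float) -> float:
    """Factor by which an attack beats brute force, given log2 complexities."""
    return 2.0 ** (brute_log2 - attack_log2)


def years_to_run(work_log2: float, hashes_per_second: float) -> float:
    """Years needed to perform 2**work_log2 hash evaluations at the given rate."""
    return 2.0 ** work_log2 / hashes_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    assumed_rate = 1e12  # assumption: hash evaluations per second
    # 127.0 is the (2^128)/2 brute-force figure used above.
    print("speedup factor:", speedup_over_brute_force(123.4, 127.0))
    print("years at assumed rate:", years_to_run(123.4, assumed_rate))
```

Under these assumptions the attack is only about an order of magnitude faster than brute force and would still take vastly longer than any realistic time frame, which is why it counts as a certificational break rather than a practical one.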

Another reason why the use of MD5 was strongly discouraged in 2009, and why the use of SHA1 is strongly discouraged now, for any application, is that most people do not understand which exact property the security of their use case relies on. You unfortunately proved my point in your question by stating that the 2008 CA attack did not rely on a failure of collision resistance, as caf has pointed out.

To elaborate a bit, every time a (trusted) CA signs a certificate, it also signs possibly malicious data that comes from a customer in the form of a certificate signing request (CSR). Now in most cases all the data that is going to be signed can be pre-calculated from the CSR and some external conditions. This has the fatal side effect that the state the hash function will be in when it starts to hash the untrusted data coming out of the CSR is completely known to the attacker, which facilitates a collision attack. Thus an attacker can precompute a CSR that will force the CA to hash and sign data that has a collision with a shadow certificate known only to the attacker. The CA cannot check the preconditions of the shadow certificate that it would usually check before signing it (for example, that the new certificate does not claim to be a root certificate), as it only has access to the legitimate CSR the attackers provided. Generally speaking, once you have collision attacks and part of your data is controlled by an attacker, you no longer know what else you might be signing besides the data you see.

I wonder if we'll get a slot for CA-chosen nonces to mitigate this, at some point — JamesTheAwesomeDude, Apr 21 '21 at 22:50
CABF BRs require the serial number to contain at least 64 bits of cryptographically-random data, which effectively makes that field a CA-chosen nonce. — womble, May 29 '23 at 00:14
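A rough sketch of why a random serial number helps, under heavy simplification: the "to-be-signed" prefix below is a made-up byte layout (serial followed by an issuer string), not real X.509 encoding, and the MD5 digest of that prefix is used as a stand-in for the hash's internal state at the point where attacker-supplied data would begin. `ISSUER`, `prefix_state`, `sequential_serial` and `random_serial` are hypothetical names.

```python
import hashlib
import secrets

ISSUER = b"CN=Example Toy CA"  # hypothetical issuer string


def prefix_state(serial: int) -> str:
    """Stand-in for the hash state after the CA-controlled prefix has been hashed."""
    prefix = serial.to_bytes(8, "big") + ISSUER
    return hashlib.md5(prefix).hexdigest()


def sequential_serial(last_serial: int) -> int:
    """Predictable policy: the attacker can guess the next serial in advance."""
    return last_serial + 1


def random_serial() -> int:
    """Unpredictable policy: 64 bits from a cryptographically secure generator."""
    return secrets.randbits(64)


if __name__ == "__main__":
    guess = prefix_state(41 + 1)
    print("sequential, attacker guess matches:", guess == prefix_state(sequential_serial(41)))
    print("random, attacker guess matches:    ", guess == prefix_state(random_serial()))
```

With a sequential serial the attacker can compute the prefix state ahead of time and prepare colliding data for it; with 64 random bits the state is unknown until the certificate is issued, so the precomputation described above has nothing to work from.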
