8

In a Linux system, passwords are stored using an MD5 hash. Why can the usage of "salt" protect the system more? Particularly, I want to make clear the following two

  1. The salt is said to be stored in clear text with the hash, then how it can prevent the attacker when the attacker knows the salt value. (Attacker can be the system administrator himself who can check /etc/shadow.
  2. If the salt is generated randomly everytime, how can the system compare the hash to authenticate the user?

For example, User A has user salt s1 and generate h1; h1 = md5(password.s1);. The next time, it uses salt s2 and the system must generate a different hash, h2 = md5(password.s2). Since h1 is not equal to h2, how can the system authenticate the user?

AMadmanTriumphs
  • 4,888
  • 3
  • 28
  • 44
user496949
  • 83,087
  • 147
  • 309
  • 426
  • 3
    Did you read this yet? http://en.wikipedia.org/wiki/Salt_(cryptography) After reading this, please **update** your question to be specific. – S.Lott Feb 19 '11 at 13:26
  • even after reading the above, I cannot fully get it, so ask here. – user496949 Feb 19 '11 at 13:39
  • How can this question NOT be a duplicate? For instance, of http://stackoverflow.com/questions/1219955 (but there are probably older ones). – Peter Mortensen Feb 26 '11 at 21:16
  • Other candidates: [What is the purpose of the “salt” when hashing?](http://stackoverflow.com/questions/1884598/what-is-the-purpose-of-the-salt-when-hashing), [Hash and salt](http://stackoverflow.com/questions/625542/hash-and-salt), [Salt question - using a “random salt”](http://stackoverflow.com/questions/2999197/salt-question-using-a-random-salt) and [Salt, passwords and security](http://stackoverflow.com/questions/2583203/salt-passwords-and-security). – Peter Mortensen Feb 26 '11 at 21:24
  • http://en.wikipedia.org/wiki/Salt_%28cryptography%29 – Bjarni Sævarsson Feb 19 '11 at 13:26

4 Answers4

13

MD5 is a hash as you know, so if you give it an input, like 'PASSWORD', you get a unique (hopefully - however MD5 has collisions these days) output, like '3DE2AF...'.

Now, as you know, it's quite hard to directly reverse that, until somebody thought... wait, why don't I pre-generate all the possible combinations of hashable values until I can reverse the hash. This is called a rainbow table.

The purpose of a salt is to add arbitrary random data to the string being hashed, such that you increase the length of input to hash. This means general rainbow tables that expect to reverse just a password input to a hash won't work. Of course, rainbow tables being just reverse lookups, you could simply generate a rainbow table to compensate for all the possible password+salt outputs. This is where the increase in length comes into its own; because of the nature of reversing hashes, the disk space to generate reverses for very long hash inputs soon becomes infeasible. Alphanumeric rainbow tables for 6-8 characters are already a couple of Gigabytes; increase the length and character classes and you start to talk in multiples of 10GB.

Of course, if you're salting with 'PASSWORD' and you hash 'PASSWORD' you're hashing 'PASSWORDPASSWORD' which isn't that much more secure, so the choice of salt is important too. Ideally, you should use a random salt with each hashed string, but of course, you need to know what it is. A common technique is to derive a salt from the username or some other property unique to this case. Adding arbitrary data isn't in itself useful; having user-determined salt data now adds an additional level of complexity, meaning rainbow tables are needed with specialised searches for each user. The more you make this difficult, the more computational power is needed. That's where the battle is.

However, there are some modern techniques. I am not an expert, so I can't tell you how secure these are, but they are worth a mention. The concept is slow hashing. Basically, through compound hash functions you make it take a while to compute each hash. As such, the ability for each user to check the password now has a constant amount of time added for each password you wish to check. If you're bruteforcing, that is Bad News(tm). Similarly, if the system is well designed, if there are no shortcuts (which probably equate to weaknesses) then generating a rainbow table for a slow hash function should also take a while.

Edit more detail here. See crypt() for the first example of this. @CodeInChaos has referenced PBKDF2 which forms part of PKCS#5. A newer development is scrypt.

As I say, I'm not an expert cryptanalyst. On the latter example, I have no particular specialist knowledge as to its suitability, I'm merely showing you where things are headed.

Edit 2 Clarified my write up of salt - I think I danced around the key issue of disk space before.

Community
  • 1
  • 1
  • While personally I like the idea of scrypt, I haven't seen much analysis done on it. So I'd rather go with one of the older standardized KDFs. The advantage of scrypt only comes into play if you face an attacker who uses custom hardware or perhaps FPGAs. – CodesInChaos Feb 19 '11 at 13:48
  • @CodeInChaos mmm. That's the one I've looked at / talked about most recently so the one that came to mind. But you're right, it hasn't had a chance to be thoroughly reviewed yet. –  Feb 19 '11 at 13:51
  • I am wondering if the salt is stored in the clear text, the attacker can just built a rainbow table based on the specific salt. – user496949 Feb 19 '11 at 13:57
  • @user49 the salt isn't. It's added to what you hash. But yes, if you know or have an idea of how that string is constructed, you could nullify the benefit of the salt for *that specific case*. But that's the thing - it would be on a case-by-case basis. Cryptography/Hashes etc are not absolute security; the idea is to make it as difficult as possible for your attacker. –  Feb 19 '11 at 13:59
  • Usausally the system just append the salt to the password, then do the hash , is this right? Or this mechanism is not too hard to figure it out if I am an system admin. – user496949 Feb 19 '11 at 14:08
  • "I am wondering if the salt is stored in the clear text, the attacker can just built a rainbow table based on the specific salt." He could. But it's be useless since its slower than simple brute-force. The idea of such a table is that you build it once and use it many times. – CodesInChaos Feb 19 '11 at 14:10
  • You'd need to know what's in use on your system, which probably means looking at source code. There are a number of ways to do it; you can prepend, append, or both. I have no idea what's actually in use; I imagine some form of key derivation is. –  Feb 19 '11 at 14:12
  • When we study the security, we are told that the security based on closed algorithm is not secure. The attacker can sooner or later figure out the algorithm like append/prepend or any other and build a larger rainbow table. – user496949 Feb 19 '11 at 14:38
  • CodeInChaos, if the attaker knows the salt, He can build a larger rainbow table? is this right? It just takes him some more time and computation. – user496949 Feb 19 '11 at 14:40
  • @user49 correct. and whatever they're using it'll be in the source code and probably one of a number of publicly-known techniques. And as for the latter, yes, but rainbow tables are huge and take a long time to generate. That's why you want to make it once and reuse it. –  Feb 19 '11 at 14:58
  • Does rainbow table enumerate all the possible passwords? For example, if the system allows 15 digits for password, does rainbow table precalculate all the possible hash for 15 digits? – user496949 Feb 19 '11 at 15:07
  • Depends on the rainbow table. The more characters you have in the potential password, or the more character classes, the larger the table becomes. So the "passwords should be minimum 8 chars, not a dictionary word and contain one of [A-Za-z0-9] + punc" isn't just about dictionary attacks... –  Feb 19 '11 at 15:16
  • @Ninefinger: +1: Interesting answer! As for the MD5 hash of "PASSOWRD", I guess that you meant "8b04b6…", right? :) – Eric O. Lebigot Feb 19 '11 at 15:21
  • Ninefingers, so no one will build rainbow table for you if the system does use salt, right? Salt is basically increase the complexity of the password. – user496949 Feb 19 '11 at 15:26
  • I've given you another answer over here: http://stackoverflow.com/questions/5051608/rainbow-table/5051690#5051690. The idea is probably not. If someone has masses of computational power, time and resources then they could. But it'd be specific to you. –  Feb 19 '11 at 15:44
  • @EOL it depends on how the salt works. You could salt with another md5, but the most trivial form is just to append or prepend or whatever. The idea is that you add extra junk to the string somehow. –  Feb 19 '11 at 15:44
5

You can reverse a simple hash algorithm by brute force.

If you are using a common word for passwords, some prebuild tables (like rainbow ones) might contain them. That's why most algorithms call the hash function several times:

md5(md5(md5(password)));

Using salt gives a lot more of randomness to the generated password and thus make it less guessable. It consists of adding a random piece of string in the process

md5(md5(md5(password+string)+string)+string);
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
vaugham
  • 1,821
  • 1
  • 19
  • 38
  • Good answer but i have another question ! does `md5(md5(md5(password)));` allowed in programming languages ? and can yet the multiple hashed ciphers expose the rainbow and brute force? – Sudantha Feb 19 '11 at 13:30
  • Why shouldn't it ? md5 takes a string as parameter and returns a string. You can call it as many times you want. Reversing each hash costs more time and resources to the attacker than to the hasher. – vaugham Feb 19 '11 at 13:35
  • typcially you don't write something like `md5(md5(...))` yourself but use but use a known and reviewed key derivation function. For example PBKDF2 based on HMAC. And repeatedly calling the hash function doesn't help much against rainbow tables. Many iterations are used to slow down brute-force, and the salt is used to prevent rainbow tables. – CodesInChaos Feb 19 '11 at 13:44
  • md5 is not reversible, but you can find strings pretty easily that hash to the same values making a pretty poor algorithm to use to obfuscate your passwords without using a salt. The salt doesn't give it randomness, but rather makes it so that if you do use a password that matches the hash value when hashed without a salt, it won't match the hash with the salt added. This adds security in that you'd have to know both the hashed value and the salt before you can figure out what value to use from the rainbow table. – tvanfosson Feb 19 '11 at 13:45
  • I should say that it "doesn't give it randomness _unless you use a random salt_" Without a random salt, it's a simple transformation. – tvanfosson Feb 19 '11 at 13:53
  • You typically assume the attacker knows the salt. The salt just prevents pre-built tables. And while md5 is pretty broken and should be avoided, I know of no practical pre-image attack against it. Can you provide a reference to such an attack @tvanfosson – CodesInChaos Feb 19 '11 at 14:12
  • @CodeInChaos - I was only alluding to the fact that the attacker would need to know the salt in order to construct a suitable rainbow table. Without a salt, one can immediately find rainbow tables that would make it trivial to find a collision. – tvanfosson Feb 19 '11 at 14:39
2

One reason could be, if two people use same password unknowingly they will generate same MD5. One of them can just see /etc/shadow and guess other guys password.

Now with salt added to each password, even same passwords generate different hashes.

Zimbabao
  • 8,150
  • 3
  • 29
  • 36
  • If the password is so simple that it generates the same hash by accident it's trivially brute-forced too. – CodesInChaos Feb 19 '11 at 13:45
  • That is if shadow file uses md5, not to mention that those are salted as well. And even if they weren't, unless the OS has seriously messed up permissions, if a person can read shadow you have bigger problem then this issue:) – cyber-guard Feb 19 '11 at 13:50
  • That's true if you use a different salt per password, but not if you share the salt between all passwords. If you use the same salt they will hash to the same value. It can happen, though, that even if you use different passwords they will hash to the same value. In that case using a salt should force them to hash to different values. In any event /etc/shadow should not be readable by anyone but root and if they've got root access you're screwed anyway. – tvanfosson Feb 19 '11 at 13:51
  • @CodeInChaos: hashes for each string are the same, whether it is 3 chars or 30 long, in fact that what allows attackers to bruteforce it... – cyber-guard Feb 19 '11 at 13:52
0

When you encrypt data it can be still attacked by bruce-force attacks and rainbow attacks. In salting, at the end of the encrypted data you add some additional bits. So the attacker cannot get the original data properly.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Sudantha
  • 15,684
  • 43
  • 105
  • 161