93

UPDATE: I recently learned from this question that in the entire discussion below, I (and I am sure others did too) was a bit confusing: What I keep calling a rainbow table, is in fact called a hash table. Rainbow tables are more complex creatures, and are actually a variant of Hellman Hash Chains. Though I believe the answer is still the same (since it doesn't come down to cryptanalysis), some of the discussion might be a bit skewed.
The question: "What are rainbow tables and how are they used?"


Typically, I always recommend using a cryptographically-strong random value as salt, to be used with hash functions (e.g. for passwords), such as to protect against Rainbow Table attacks.

But is it actually cryptographically necessary for the salt to be random? Would any unique value (unique per user, e.g. userId) suffice in this regard? It would in fact prevent using a single Rainbow Table to crack all (or most) passwords in the system...
But does lack of entropy really weaken the cryptographic strength of the hash functions?


Note, I am not asking about why to use salt, how to protect it (it doesn't need to be), using a single constant hash (don't), or what kind of hash function to use.
Just whether salt needs entropy or not.


Thanks all for the answers so far, but I'd like to focus on the areas I'm (a little) less familiar with. Mainly implications for cryptanalysis - I'd appreciate most if anyone has some input from the crypto-mathematical PoV.
Also, if there are additional vectors that hadn't been considered, that's great input too (see @Dave Sherohman point on multiple systems).
Beyond that, if you have any theory, idea or best practice - please back this up either with proof, attack scenario, or empirical evidence. Or even valid considerations for acceptable trade-offs... I'm familiar with Best Practice (capital B capital P) on the subject, I'd like to prove what value this actually provides.


EDIT: Some really good answers here, but I think as @Dave says, it comes down to Rainbow Tables for common user names... and possible less common names too. However, what if my usernames are globally unique? Not necessarily unique for my system, but per each user - e.g. email address.
There would be no incentive to build a RT for a single user (as @Dave emphasized, the salt is not kept secret), and this would still prevent clustering. Only issue would be that I might have the same email and password on a different site - but salt wouldnt prevent that anyway.
So, it comes back down to cryptanalysis - IS the entropy necessary, or not? (My current thinking is it's not necessary from a cryptanalysis point of view, but it is from other practical reasons.)

Luke Girvin
  • 13,221
  • 9
  • 64
  • 84
AviD
  • 12,944
  • 7
  • 61
  • 91
  • I must say I'm confused by many of the answers below. The main point of using salt is simply to prevent rainbow table attack so anything unique to user should do as it forces the attacker to recreate a rainbow table for each salt. my 2c. – Goran Feb 11 '09 at 13:31
  • If salt is needed for eg. site, you often have ids in urls. If attacker knows (and we assume he does) that password salt is userid+username it's fairly easy to modify attack to avoid salt value. – dmajkic Feb 11 '09 at 14:01
  • @dmajkic, salt is intended to prevent RT attacks and differentiate same passwords for different users. This is a given, even with usernames. – AviD Feb 11 '09 at 14:30
  • @AviD: That's true. But if I know my hash for my pass, and I know that salt is my username then I can easly create hashed pass value for any dictionary word, and compare it to someone else pass hash. That is why time is better than username. – dmajkic Feb 11 '09 at 14:42
  • ( not trained ) - Traditionally, trying to better what has been achieved is error prone - at a minimum such may distract from other issues. From what I see, the only issue is to insure the salt is of sufficient length to get things going - then it's pretty much algorithm strength from there on, no? – Nicholas Jordan Sep 27 '09 at 21:10
  • See also: http://stackoverflow.com/questions/1645161/salt-generation-and-open-source-software/1645190#1645190 – Jacco Dec 26 '09 at 13:11
  • 1
    Just found an article on the PHP Security Consortium website that in its example uses a md5 hashed *random* number as the salt http://phpsec.org/articles/2005/password-hashing.html – helloworlder Jan 03 '10 at 13:41

9 Answers9

160

Salt is traditionally stored as a prefix to the hashed password. This already makes it known to any attacker with access to the password hash. Using the username as salt or not does not affect that knowledge and, therefore, it would have no effect on single-system security.

However, using the username or any other user-controlled value as salt would reduce cross-system security, as a user who had the same username and password on multiple systems which use the same password hashing algorithm would end up with the same password hash on each of those systems. I do not consider this a significant liability because I, as an attacker, would try passwords that a target account is known to have used on other systems first before attempting any other means of compromising the account. Identical hashes would only tell me in advance that the known password would work, they would not make the actual attack any easier. (Note, though, that a quick comparison of the account databases would provide a list of higher-priority targets, since it would tell me who is and who isn't reusing passwords.)

The greater danger from this idea is that usernames are commonly reused - just about any site you care to visit will have a user account named "Dave", for example, and "admin" or "root" are even more common - which would make construction of rainbow tables targeting users with those common names much easier and more effective.

Both of these flaws could be effectively addressed by adding a second salt value (either fixed and hidden or exposed like standard salt) to the password before hashing it, but, at that point, you may as well just be using standard entropic salt anyhow instead of working the username into it.

Edited to Add: A lot of people are talking about entropy and whether entropy in salt is important. It is, but not for the reason most of the comments on it seem to think.

The general thought seems to be that entropy is important so that the salt will be difficult for an attacker to guess. This is incorrect and, in fact, completely irrelevant. As has been pointed out a few times by various people, attacks which will be affected by salt can only be made by someone with the password database and someone with the password database can just look to see what each account's salt is. Whether it's guessable or not doesn't matter when you can trivially look it up.

The reason that entropy is important is to avoid clustering of salt values. If the salt is based on username and you know that most systems will have an account named either "root" or "admin", then you can make a rainbow table for those two salts and it will crack most systems. If, on the other hand, a random 16-bit salt is used and the random values have roughly even distribution, then you need a rainbow table for all 2^16 possible salts.

It's not about preventing the attacker from knowing what an individual account's salt is, it's about not giving them the big, fat target of a single salt that will be used on a substantial proportion of potential targets.

Dave Sherohman
  • 45,363
  • 14
  • 64
  • 102
  • Agreed. I'd put more emphasis on re-use of usernames, since I think that's the attack most likely to be successful against hypothetical systems using the username as a salt, that would have failed against a system using random salt. – Steve Jessop Feb 11 '09 at 14:00
  • I appreciate your point on cross-system security. Also, I guess that in the future its not unlikely to see user-specific rainbow tables... – AviD Feb 11 '09 at 14:13
  • @AviD: You think so? That's quite a specific attack vector. – David Grant Feb 11 '09 at 14:17
  • No, if for example you want to have compromised accounts to send spam you don't mind what the account name is. – Georg Schölly Feb 11 '09 at 14:24
  • @gs: I meant that using the username as salt is quite a specific vector. – David Grant Feb 11 '09 at 14:28
  • @MrPotatoHead, its not a vector, thats the question that I'm asking - is it good enough? Cross-system security is one point to consider. – AviD Feb 11 '09 at 14:31
  • In this case, is it worthwhile to invest in the additional time of a cryptographically secure random number generator? Any RNG will prevent user login collisions. – Nona Urbiz Sep 25 '09 at 15:03
  • 2
    Not for generating salts, no. The only attacks which are affected by salt are those made directly against the hashed passwords. An attack can't make this sort of attack unless they have a copy of your password database, which will also contain the salts. Since the attacker will already have the salts in his possession, they only require sufficient entropy to avoid having multiple passwords hashed with the same salt, which any basic (P)RNG will provide. – Dave Sherohman Sep 26 '09 at 09:24
  • Third sentence should start "An attack**er** can't make...". – Dave Sherohman Sep 26 '09 at 09:26
  • So is salt prefixed to a password secure? for example if i store the password as `passhash = md5("urhu2389udfcbdvalk" + UserPassword)` would that be secure? –  Jul 21 '11 at 16:55
  • 3
    @acidzombie24: Using a single fixed salt (usually called a "nonce" in my experience) for all passwords is less secure than using a unique random salt for each password, as it still allows an attacker to trivially determine whether two or more accounts share the same password. On the other hand, the nonce would likely be stored in your code rather than in the database, so an attacker who *only* has your database won't have access to it; because of this, the most secure option is to use both a system-wide nonce (stored in the code) and a per-password salt (stored in the database). – Dave Sherohman Aug 08 '11 at 11:48
  • +1 for the last paragraph. It made me understand needing salt-per-user rather than single-use-salt. – Yoshiyahu Mar 05 '12 at 22:12
  • @DaveSherohman very informative, based on this advice, I found a very nice Java library that does what you've suggested: http://www.jasypt.org/encrypting-passwords.html – JMac Sep 12 '13 at 00:36
  • Using usernames also creates an unnecessary coupling with the password so you cannot create a feature of "change username" without asking for password anymore. – Sedat Kapanoglu Dec 15 '14 at 10:03
31

Using a high-entropy salt is absolutely necessary to store passwords securely.

Take my username 'gs' and add it to my password 'MyPassword' gives gsMyPassword. This is easily broken using a rainbow-table because if the username hasn't got enough entropy it could be that this value is already stored in the rainbow-table, especially if the username is short.

Another problem are attacks where you know that a user participates in two or more services. There are lots of common usernames, probably the most important ones are admin and root. If somebody created a rainbow-table that have salts with the most common usernames, he could use them to compromise accounts.

They used to have a 12-bit salt. 12 bit are 4096 different combinations. That was not secure enough because that much information can be easily stored nowadays. The same applies for the 4096 most used usernames. It's likely that a few of your users will be choosing a username that belongs to the most common usernames.

I've found this password checker which works out the entropy of your password. Having smaller entropy in passwords (like by using usernames) makes it much easier for rainbowtables as they try to cover at least all passwords with low entropy, because they are more likely to occur.

Georg Schölly
  • 124,188
  • 49
  • 220
  • 267
  • +1 for a good point, however, you *are* talking about a potentially *massive* rainbow table. To cover off all values of gsMyPassword length (assuming mixed-case alphanumeric) requires 36^12 rows in the table! – David Grant Feb 11 '09 at 13:56
  • As a follow up: the distributed rainbow table project has 63,970 cracked hashes, but 36^12 is 4,738,381,338,321,616,896! – David Grant Feb 11 '09 at 13:57
  • Thanks, and this does make sense - but as Mr.PotatoHead pointed out, you're still expanding your space beyond reasonable possiblity of cracking with RT. Also, assuming my usernames are at least 6-8 characters - what additional necessary value does entropy provide? – AviD Feb 11 '09 at 14:04
  • @Potato: you assume a rainbow table containing all possible combinations of alphanumeric characters. This doesnt need to be the case, an effective rainbow table will contain hashes for common passwords/hashes. Salt is there to protect against dictionary attacks or combined rainbowtable/dictionary. – Guillaume Feb 11 '09 at 14:06
  • @ Mr Potate Head: My computer can generate 1000 hashes per second, 63,970 hashes doesn't say how many hashes the distributed rainbow table has stored. – Georg Schölly Feb 11 '09 at 14:23
  • @gs: Since I would imagine the project has more than one minute's CPU time, I guess my figure is wrong. :P – David Grant Feb 11 '09 at 14:30
  • @ Mr Potate Head: It says something about "10181 million chains completed", how many hashes might that be? – Georg Schölly Feb 11 '09 at 14:34
  • @gs: Not sure - I've seen something about chains, but more interesting are the databases - they go up to 8 mixed-alphanumeric characters! – David Grant Feb 11 '09 at 14:57
  • 3
    Using the username as the salt is also a bad idea for systems where there is a predictably-named high-privilege account: "Administrator" on Windows, "root" on *nix, "sa" in MSSQL, etc. – nobody Feb 11 '09 at 23:21
  • However, in most applications / business systems, there is no builtin "Administrator", "root", etc. – AviD Feb 13 '09 at 00:59
  • "High entropy salt" isn't bigger search space because "the salt part" is always known. Let's say you have 4096-bit salt and 8-bit password. As long as you know the salt, your search space is just 256. – Sedat Kapanoglu Dec 15 '14 at 10:12
8

It is true that the username alone may be problematic since people may share usernames among different website. But it should be rather unproblematic if the users had a different name on each website. So why not just make it unique on each website. Hash the password somewhat like this

hashfunction("www.yourpage.com/"+username+"/"+password)

This should solve the problem. I'm not a master of cryptanalysis, but I sure doubt that the fact that we don't use high entropy would make the hash any weaker.

konne
  • 81
  • 1
  • 1
  • I believe you're right that this is sufficient. However, given that hashes are a complicated mathematical field and it's very well possible that low entropy salts make the hashes more easily to guess in the future, (especially as hashes get broken), I wouldn't bet on the security, when the proven more secure solution is not much more expensive. – Georg Schölly Apr 10 '11 at 13:53
7

I like to use both: a high-entropy random per-record salt, plus the unique ID of the record itself.

Though this doesn't add much to security against dictionary attacks, etc., it does remove the fringe case where someone copies their salt and hash to another record with the intention of replacing the password with their own.

(Admittedly it's hard to think of a circumstance where this applies, but I can see no harm in belts and braces when it comes to security.)

teedyay
  • 23,293
  • 19
  • 66
  • 73
  • I like the idea of binding the password to the user. But if the attacker can change the password he can surely also change the id. – Georg Schölly Feb 11 '09 at 15:19
  • Depends on what the ID is: I use the primary key of the table, which you can't really change at will. TBH, if the hacker's writing to your database, you're in some pretty big trouble already... – teedyay Feb 11 '09 at 16:03
  • 3
    Nah, I like it. An attacker is often an "insider". I could dream up scenarios where what he's after isn't in the database, but if he can overwrite the credentials in the database, he can authenticate himself to do what he wants. – erickson Feb 11 '09 at 21:07
  • 1
    Good point. We do find it makes it increasingly difficult to add the first user to a new database: we've effectively made our applications so secure we can barely hack them ourselves. [Also, use an application-specific salt so you can't copy credentials from one database to another.] – teedyay Feb 12 '09 at 09:35
  • Finally I found someone saying this: adding something user-specific to the salt. That avoids an intruder to copy some credentials to another user, and also prevents a hacker to guess any password if he manages to reach your hash and salt field in the table. One thing more I like to do: not using a round number of iterations in the hashing. Don't go for 10k or 20k, but rather something like 19835. – Andrew Jul 12 '16 at 06:34
3

The strength of a hash function is not determined by its input!

Using a salt that is known to the attacker obviously makes constructing a rainbow table (particularly for hard-coded usernames like root) more attractive, but it doesn't weaken the hash. Using a salt which is unknown to the attacker will make the system harder to attack.

The concatenation of a username and password might still provide an entry for an intelligent rainbow table, so using a salt of a series pseudo-random characters, stored with the hashed password is probably a better idea. As an illustration, if I had username "potato" and password "beer", the concatenated input for your hash is "potatobeer", which is a reasonable entry for a rainbow table.

Changing the salt each time the user changes their password might help to defeat prolonged attacks, as would the enforcement of a reasonable password policy, e.g. mixed case, punctuation, min length, change after n weeks.

However, I would say your choice of digest algorithm is more important. Use of SHA-512 is going to prove to be more of a pain for someone generating a rainbow table than MD5, for example.

David Grant
  • 13,929
  • 3
  • 57
  • 63
  • The strength of the function does not change, but the output definitely changes. If the input can be influenced or known, then perhaps something can be deduced about the hash value. That's the risk. – Martin Carpenter Feb 11 '09 at 12:49
  • @Martin: The only thing you should be able to deduce from the hash value if whether or not you have a match! Putting "roota" or "rootb" (where "a" and "b" represent the password) into a hash function will give you radically different output. – David Grant Feb 11 '09 at 13:14
  • The salts are known to an attacker using rainbow-tables. – Georg Schölly Feb 11 '09 at 13:45
  • The server has to know the unencrypted salt (to check the password against the hash), thus one can take for granted that an attacker has access to the salt, or can get it very easily. – Georg Schölly Feb 11 '09 at 14:31
  • Right, right, thats a given - to be more specific I meant the strength of the overall cryptosystem (or -subsystem, wrt hashes). – AviD Feb 11 '09 at 14:35
  • @AviD: Since you a username + password concatenation might still give you an entry from a intelligent rainbow table, I would say it was better to use a random series of characters as your salt, and store the salt in the same place. If you like, you could change the salt each time the user logs in. – David Grant Feb 11 '09 at 14:39
3

If the salt is known or easily guessable, you have not increased the difficulty of a dictionary attack. It even may be possible to create a modified rainbow table that takes a "constant" salt into account.

Using unique salts increases the difficulty of BULK dictionary attacks.

Having unique, cryptographically strong salt value would be ideal.

HUAGHAGUAH
  • 1,063
  • 6
  • 3
  • 2
    The whole point of the rainbow table is that the values are pre-generated, and that if you're generating the table for a value (e.g. password) PLUS a given constant, that is a entirely different table, and you might as well just try the hash outcome directly. :) – David Grant Feb 11 '09 at 13:42
  • 1
    A constant hash is worth nothing. All passwords can be cracked using a single rainbow-table. The purpose of the salt is that it's impossible to create a rainbow-table. – Georg Schölly Feb 11 '09 at 13:46
  • 1
    @gs - I don't see why you can't construct a rainbow table that "includes" the effects of a constant salt. The attack space (the password) won't have changed, so the table won't need to grow, just be recalculated. – HUAGHAGUAH Feb 12 '09 at 04:46
  • @Potato - depends on the tradeoff. If your company's 20k logins were all hashed with the same constant salt, it may be cheaper to compute a fresh rainbow table than to dictionary attack them all individually. Hence my emphasis of "BULK". – HUAGHAGUAH Feb 12 '09 at 04:49
3

I would say that as long as the salt is different for each password, you will probably be ok. The point of the salt, is so that you can't use standard rainbow table to solve every password in the database. So if you apply a different salt to every password (even if it isn't random), the attacker would basically have to compute a new rainbow table for each password, since each password uses a different salt.

Using a salt with more entropy doesn't help a whole lot, because the attacker in this case is assumed to already have the database. Since you need to be able to recreate the hash, you have to already know what the salt is. So you have to store the salt, or the values that make up the salt in your file anyway. In systems like Linux, the method for getting the salt is known, so there is no use in having a secret salt. You have to assume that the attacker who has your hash values, probably knows your salt values as well.

Kibbee
  • 65,369
  • 27
  • 142
  • 182
  • 3
    "The point of the salt, is so that you can't use standard rainbow table to solve every password in the database". I disagree. The point is so that you can't use a standard rainbow table to solve *any* password. If the salt is the username, then the rainbow table for "root" could crack root's pw. – Steve Jessop Feb 11 '09 at 13:57
  • That's probably about the only rule that really matters. Basically you want the password to be something that's unique on your system, but it doesn't matter what it is. It could still be something simple though. You don't need a whole lot of entropy. – Kibbee Feb 11 '09 at 14:06
  • This was my thinking too - but I got stuck at the "probably okay". I still don't see this theory being proved - or at least verified that there are no cryptanalysis implications. – AviD Feb 11 '09 at 14:38
  • @onebyone: If the salt consists of username + constant local value (I often use ':'), then that still ruins standardized RTs, unless there are a variety of them that include common usernames and common static salt values. I rather suspect that's not the case. – nsayer Feb 11 '09 at 17:39
  • Along with nsayer. You could use a system-wide constant string, that there wouldn't be a pregenerated RT for, like #(D83d8, along with the user name as the salt, and that would probalby prevent any rainbow table attacks. – Kibbee Feb 12 '09 at 16:32
1

Salt should have as much entropy as possible to ensure that should a given input value be hashed multiple times, the resulting hash value will be, as close as can be achieved, always different.

Using ever-changing salt values with as much entropy as possible in the salt will ensure that the likelihood of hashing (say, password + salt) will produce entirely different hash values.

The less entropy in the salt, the more chance you have of generating the same salt value, as thus the more chance you have of generating the same hash value.

It is the nature of the hash value being "constant" when the input is known and "constant" that allow dictionary attacks or rainbow tables to be so effective. By varying the resulting hash value as much as possible (by using high entropy salt values) ensures that hashing the same input+random-salt will produce many different hash value results, thereby defeating (or at least greatly reducing the effectiveness of) rainbow table attacks.

CraigTP
  • 44,143
  • 8
  • 72
  • 99
  • Again, I'm not referring to using a single constant salt - rather a non-random, but user-unique value. E.g. userId. – AviD Feb 11 '09 at 14:36
  • The whole point is that your salt value should never be constant. Every single thing you hash, whether the same "input" value or not should have a different salt. Dave Sherohman explains the downsides of using even user-unique values that essentially stay the same with each hash computation. – CraigTP Feb 11 '09 at 15:03
-1

Entropy is the point of Salt value.

If there is some simple and reproducible "math" behind salt, than it's the same as the salt is not there. Just adding time value should be fine.

dmajkic
  • 3,448
  • 1
  • 18
  • 24
  • I don't understand this comment. You say "use entropy", and then: "use time"?? – Martin Carpenter Feb 11 '09 at 12:46
  • if Username is used for salt, it' not entropic enough. But if you use username+current dateime - it is, since it's hard to guess when exactly salt was created. – dmajkic Feb 11 '09 at 13:29
  • "entropic enough" is subjective; it's your standard. Basing this value on time may not be adequate. – Martin Carpenter Feb 11 '09 at 13:44
  • There is no such a thing as "absolute true randomness". It's on us to say when it's "good enough". If you think that in this case time is not good enough, use something else. http://stackoverflow.com/questions/84556/whats-your-favorite-programmer-cartoon/90391#90391 – dmajkic Feb 11 '09 at 13:55
  • Actually, the point of salt is to cause each hash result to be unique - in order to prevent (a) rainbow table attacks, and (b) identical password identification. The question is, what else is entropy needed for, if at all? – AviD Feb 11 '09 at 14:28
  • Imagine a system that creates batches of accounts (random passwords) on the first day of each month. Imagine it uses a low granularity time for salt (or is a really fast system that creates hundreds of accounts per time unit). Now we're vulnerable to both AviD's (a) and (b). – Martin Carpenter Feb 12 '09 at 14:15
  • @AviD: and if we turn your question around: When is it bad to use high-entropy salt? Extra CPU cycles? Predictable PRNG? – Martin Carpenter Feb 12 '09 at 14:21
  • @Martin, the problem isnt actually one of high-entropy - rather the "complexity" of *any* additional information, seeing as how we already have a unique username its a hard to convince them of the necessity... – AviD Feb 13 '09 at 01:04