
I have a very basic question about user management, in particular about storing hashed passwords. I have read a few pages (like https://wiki.python.org/moin/Md5Passwords ). The way I understand hashing is this:

  • password provided by user is hashed (with whatever function) one way.
  • nobody (including user/admin) is able to see the password.
  • when the user logs in, the string they provide is hashed to see if it matches the stored hashed password.

That's all clear; however, I am not sure how 'salt' fits into hashing. I read that os.urandom (Python) is a good way to create a salt: https://crackstation.net/hashing-security.htm

What I am not sure about is how to work with this added "salt". If I hash the user's password together with a salt, and the hash is one way, then the next time the user logs in he knows only the password and not the salt. From this I assume that the "salt" generated for this user needs to be stored somewhere; otherwise it would not make sense. But on the other hand, if somebody gets access to the DB, they will see both the "salt" and the hashed password. In that case the "salt" does not seem to add much value (it is pretty much the same as hashing the plain password). So maybe the "salt" is just meant as protection on the front end (against brute force).

Can somebody give me a hint on how to work with salt? Is my understanding correct? Do I need to store the "salt" somewhere?
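A minimal sketch of the flow asked about here, assuming the salt is stored next to the hash (as the comments below confirm). The function names `hash_password` and `verify_password`, the 16-byte salt length, and the `ITERATIONS` value are illustrative assumptions, and PBKDF2 from the standard library is swapped in for the MD5 recipe on the linked wiki page:

```python
import hashlib
import hmac
import os

# Assumed work factor for this sketch; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str):
    """Return (salt, digest); both values are stored in the user's DB row."""
    salt = os.urandom(16)  # fresh random salt for every user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored_digest)


# Registration: store both values. Login: read them back and verify.
salt, digest = hash_password("s3cret")
print(verify_password("s3cret", salt, digest))  # True
print(verify_password("wrong", salt, digest))   # False
```

The user never needs to know the salt: the server reads it from the user's row at login time and feeds it back into the same function.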

Before I posted this question I found this: Should the Salt for a password Hash be "hashed" also?

What is the added value of the salt? If I write a web service, I can block each login after 3 failed attempts. Nobody on the front end is able to see the hashed values, so nobody can use brute force (at most this becomes a DoS, since 3 failed logins will block the user). The hacker would need access to the DB to see the hashed passwords. But if he has that, he will also see the "salt".

rdr
  • The salt is stored as well in the database. – Willem Van Onsem Jan 04 '18 at 12:55
  • The reason why it is useful to use salt is to prevent hackers from using precalculated hashes. For instance, it is easy to get a list of the top 100'000 used passwords, together with the md5 hash. – Willem Van Onsem Jan 04 '18 at 12:57
  • MD5 is broken. You must not use it for password hashes, with or without salt. – Daniel Jan 04 '18 at 13:01
  • That's clear. But how can a hacker see the hashed version and use rainbow tables without having access to the DB? And if he has access to the DB, what is the value of the "salt" if it is stored there? That's something I do not understand. A brute force attack can be prevented by a lock after some number of failed attempts. – rdr Jan 04 '18 at 13:01
  • @rdr: usually the idea of hashing is to prevent a hacker that *has* access to the db from stealing passwords. – Willem Van Onsem Jan 04 '18 at 13:07
  • @rdr The salt prevents rainbow tables from being useful. (In short, if `password` hashes to `12345`, then an attacker can just check for the hash `12345` in the stolen DB and know the password immediately when no salt is used. But if a salt is used, then the stored hash will be of `passwordSALTSALTSALT`, which will not be guessable by the attacker (if everyone uses a different salt).) – ash Jan 04 '18 at 13:18
  • @Josh: thanks. Now that I think about it, there is added value in that case. So what is a good (reasonable) length for the salt? Does it play any role? What is the typical number of characters covered by rainbow tables? I guess it is only a matter of size. They could also hold all hashes for combinations of 1-100 characters. In that case, if I force at least 10 characters for the password, I would need to add at least 90+ characters of salt, right? – rdr Jan 04 '18 at 13:59
  • @rdr: 100 characters? If we only consider alphanum chars, then this would yield a space of 1.73e179 possibilities (with all characters ~6e179). You do not need that many characters; the password space blows up exponentially in the number of characters. – Willem Van Onsem Jan 04 '18 at 17:12

1 Answer


Salt is used to prevent a hacker from cheaply reversing the password hashes into passwords with precalculated tables. So here we assume that the hacker has somehow obtained access to the database.

Without salt

Let us first assume the scenario without salt. In that case the table looks like:

user | md5 password (first 6 chars)
-------------------------------
   1 | 1932ff
   2 | d3b073

(This is a simplification of the real situation.)

The hacker of course wants to know which passwords are behind d3b073 and 1932ff. A hash function is one-directional in the sense that we can hash a password very fast, but unhashing it will, given that it is a good hashing function, take a very long time and require guessing a huge number of passwords.

So there is not much hope of easily retrieving the possible password(s) behind d3b073. But we can easily find a list of the 100'000 most popular passwords and calculate the MD5 hash of each of them. Such a list could look like:

password | md5 (first 6 characters)
--------------------------------------------
foo      | d3b073
bar      | c157a7

So apparently user 2 has used foo as password. The password of user 1 is unknown to us (but we know it is not foo or bar).
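The lookup described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the candidate list and the leaked rows are made up, the helper names `md5_hex` and `build_lookup` are hypothetical, and full MD5 digests are used instead of the shortened six-character values in the tables.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Unsalted MD5, used only to illustrate why it is a poor choice."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_lookup(candidates):
    """Precompute hash -> password once; reusable against any unsalted DB."""
    return {md5_hex(password): password for password in candidates}


# Hypothetical stand-in for a list of popular passwords.
common_passwords = ["foo", "bar", "123456", "letmein"]
lookup = build_lookup(common_passwords)

# Hypothetical leaked table: user id -> unsalted hash.
leaked = {1: md5_hex("correct horse battery"), 2: md5_hex("foo")}

# One dictionary lookup per user is enough to test every candidate.
cracked = {user: lookup.get(digest) for user, digest in leaked.items()}
print(cracked)  # {1: None, 2: 'foo'}
```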

Now the point is that we can construct such a table once and then use it to crack the passwords of all users. Constructing the table costs some computation up front, but after that every lookup is cheap. So a hacker can construct (or download) such a table (there are more efficient ways, for instance rainbow tables) and then reuse it each time he/she hacks a website and obtains the password hashes of all its users.

With salt

If we however use salting, the table could look like this:

user | salt   | hashed password
-------------------------------
   1 | a91f40 | 1a604e
   2 | c2a67c | b36232

So here if the password of user 2 is foo, then we calculate the hash of fooc2a67c (or we use another way to combine the salt and the password) and store this into the database.

The point is that it is now much harder to look up the password, since b36232 is not the hash of foo but of fooc2a67c, and the salt is typically something (pseudo-)random. We can of course again hash the 100'000 most popular passwords with the salt c2a67c appended, but since we cannot know the salt in advance, we cannot create this table only once. Even if we are lucky and have already constructed the table for salt c2a67c, it will not help us with cracking the password of user 1, since user 1 has a different salt.

So the only way around this is to construct a separate reverse hash lookup table for every user. Since it is usually already very expensive to construct such a table once, it will not be easy to calculate one for every user.
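To make the per-user effect concrete, here is a small sketch that reuses the two salts from the table above. The helper name `salted_hash` and the candidate list are assumptions, SHA-256 stands in for whatever hash the real system uses (so the digests will not match the shortened values in the table), and the salt is simply appended to the password as in the fooc2a67c example.

```python
import hashlib


def salted_hash(password: str, salt: str) -> str:
    """Hash password + salt, mirroring the 'fooc2a67c' combination."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


# Hypothetical stand-in for a list of popular passwords.
candidates = ["foo", "bar", "123456"]

# Both users picked the same password, but each row has its own salt.
stored = {
    1: ("a91f40", salted_hash("foo", "a91f40")),
    2: ("c2a67c", salted_hash("foo", "c2a67c")),
}

# A table built for user 2's salt cracks user 2...
table_for_user2 = {salted_hash(p, "c2a67c"): p for p in candidates}
print(table_for_user2.get(stored[2][1]))  # foo

# ...but is useless for user 1, whose hash was made with another salt.
print(table_for_user2.get(stored[1][1]))  # None
```

The attacker can still attack each user separately once the salts are known; the salt only removes the ability to reuse one table across users and sites.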

We might of course decide to calculate all hashes of all possible salts, like for instance:

password  | md5 (first 6 characters)
---------------------------------------------
foo000000 | 367390
foo000001 | eca8ea
foo000002 | 6eb7bf
foo000003 | 7906b1
foo000004 | 0e9f0c
foo000005 | 0bfb11
...       | ...

But as you can see, such a table would grow to a gigantic size, and building it would take an impractically long time. Even if we add only one hexadecimal character as salt, the size of the table grows by a factor of 16. Yes, there are some techniques to reduce the amount of time and space needed for such a table, but by increasing the "password space" the problem of cracking passwords definitely becomes much harder. Furthermore, salt is usually a significant number of characters (or bytes) long, which makes it far harder than just 16 times.
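The growth is easy to work out. The following sketch assumes a base list of 100'000 candidate passwords and a salt written as hexadecimal characters; `table_entries` is a hypothetical helper, not part of any library.

```python
def table_entries(num_passwords: int, salt_hex_chars: int) -> int:
    """Rows needed to cover every candidate password under every salt."""
    # Each extra hexadecimal salt character multiplies the table by 16.
    return num_passwords * 16 ** salt_hex_chars


for salt_len in (0, 1, 6, 32):
    print(salt_len, table_entries(100_000, salt_len))
```

With a 6-character hex salt, as in the table above, this already gives 1,677,721,600,000 rows; with 32 hex characters (16 bytes) the count is far beyond anything that can be stored.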

Basically, salt acts as a way to enlarge the password space. Even if you enter the very same password on two websites, the salt each website generates for you will (almost certainly) be different, and therefore the stored hash will be different as well.

Willem Van Onsem