
I have the following code snippet:

from passlib.context import CryptContext

pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
pwd_context.hash(password)

Which is described here.

What I don't understand is how this can be secure if it returns the same hashed password every time, without taking something like an additional secret_key into account when hashing the password value.

supersick

1 Answer


Your assumption that it returns the same hashed password all the time without considering another "secret" (well, it's not really secret) is wrong; you'll see this if you run pwd_context.hash multiple times:

>>> from passlib.context import CryptContext
>>>
>>> pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
>>> pwd_context.hash("test")
'$2b$12$0qdOrAMoK7dgySjmNbyRpOggbk.IM2vffMh8rFoITorRKabyFiElC'
>>> pwd_context.hash("test")
'$2b$12$gqaNzwTmjAQbGW/08zs4guq1xWD/g7JkWtKqE2BWo6nU1TyP37Feq'

These two hashes are, as you can see, not the same - even when given the same password. So what's actually going on?

When you don't give hash an explicit salt (the secret "key" you're talking about), one will be generated for you by passlib. It's worth pointing out that hashing is NOT the same as encryption, so there is no key to talk about. Instead you'll see salt mentioned, which is a plain-text value used to make sure that the same password hashed twice gives different results (since you're effectively hashing salt + password instead).

So why do we get two different values? The salt is the first 22 characters of the actual bcrypt value. The fields are separated by $ - 2b identifies the bcrypt variant, 12 is the cost factor (the base-2 logarithm of the number of rounds, so 2^12 = 4096 iterations), and the last string is the value stored for the password (the salt followed by the resulting bcrypt hash). The first 22 characters of this string are the salt in plain text.
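To make the layout concrete, here is a minimal sketch that splits one of the hashes above into its fields using nothing but string operations. The names `BcryptParts` and `split_bcrypt_hash` are made up for this illustration and are not part of passlib.

```python
from typing import NamedTuple


class BcryptParts(NamedTuple):
    variant: str  # e.g. "2b"
    cost: int     # base-2 logarithm of the number of rounds
    salt: str     # 22 characters, stored in plain text
    digest: str   # 31 characters, the actual hash output


def split_bcrypt_hash(encoded: str) -> BcryptParts:
    """Split a '$2b$12$<salt><digest>' string into its fields."""
    # The string starts with '$', so the first element of the split is empty.
    _, variant, cost, payload = encoded.split("$")
    if len(payload) != 53:
        raise ValueError("expected 22 salt + 31 digest characters")
    return BcryptParts(variant, int(cost), payload[:22], payload[22:])


parts = split_bcrypt_hash(
    "$2b$12$gqaNzwTmjAQbGW/08zs4guq1xWD/g7JkWtKqE2BWo6nU1TyP37Feq"
)
print(parts.salt)    # gqaNzwTmjAQbGW/08zs4gu
print(parts.digest)  # q1xWD/g7JkWtKqE2BWo6nU1TyP37Feq
```

This is only for looking at the pieces; in real code you would let the library parse the string for you.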

You can see this if you give bcrypt a salt instead of letting it generate one. The last character has to be one of [.Oeu] to match the padding bits expected by some bcrypt implementations (passlib will otherwise throw an error or a warning), and the other characters have to match the regex character class [./A-Za-z0-9]:

>>> pwd_context.hash("test", salt="a"*21 + "e")
'$2b$12$aaaaaaaaaaaaaaaaaaaaaehsFuAEeaAnjmdgkAxYfzHEipCaNQ0ES'
        ^--------------------^
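If you want to check a hand-written salt before passing it in, the two rules above fit in a single regular expression. This is a sketch only: `is_valid_bcrypt_salt` is a hypothetical helper, not a passlib function, and it may be stricter than what your passlib version actually rejects.

```python
import re

# bcrypt uses its own base64 alphabet. The final character carries only
# 2 bits of salt, so just four values leave the unused padding bits at zero.
BCRYPT_SALT_RE = re.compile(r"[./A-Za-z0-9]{21}[.Oeu]")


def is_valid_bcrypt_salt(salt: str) -> bool:
    """Check length, alphabet and the final-character constraint."""
    return BCRYPT_SALT_RE.fullmatch(salt) is not None


print(is_valid_bcrypt_salt("a" * 21 + "e"))  # True
print(is_valid_bcrypt_salt("a" * 22))        # False: last character not in [.Oeu]
print(is_valid_bcrypt_salt("a" * 21))        # False: too short
```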

If we explicitly give the same salt, the result should be the same (and this is how you can verify the password later):

>>> pwd_context.hash("test", salt="a"*21 + "e")
'$2b$12$aaaaaaaaaaaaaaaaaaaaaehsFuAEeaAnjmdgkAxYfzHEipCaNQ0ES'
>>> pwd_context.hash("test", salt="a"*21 + "e")
'$2b$12$aaaaaaaaaaaaaaaaaaaaaehsFuAEeaAnjmdgkAxYfzHEipCaNQ0ES'

The same is the case for the previous hashes:

>>> pwd_context.hash("test")
'$2b$12$gqaNzwTmjAQbGW/08zs4guq1xWD/g7JkWtKqE2BWo6nU1TyP37Feq'
        ^--------------------^

This is the actual generated salt, which is then used together with test to create the actual hash:

>>> pwd_context.hash("test")
'$2b$12$gqaNzwTmjAQbGW/08zs4guq1xWD/g7JkWtKqE2BWo6nU1TyP37Feq'
                              ^-----------------------------^
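The same idea can be sketched with only the standard library. Note that this swaps in PBKDF2-HMAC-SHA256 from `hashlib` as a stand-in for bcrypt, purely so the example runs without extra packages; the function names and the iteration count are assumptions made for this illustration, not recommendations.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # illustrative work factor only


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt_a, digest_a = hash_password("test")
salt_b, digest_b = hash_password("test")
print(digest_a == digest_b)                       # False: different random salts
print(verify_password("test", salt_a, digest_a))  # True: the stored salt is reused
print(verify_password("nope", salt_a, digest_a))  # False: wrong password
```

With passlib you don't write any of this yourself: the salt travels inside the hash string, and the context's verify method reads it back out before comparing.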

So why do we use this salt when it's clearly visible to everyone? It makes it impossible to just scan through a list of hashes looking for known hashes - since test in your list will have a different value than test in the list you're comparing it to (because of different salts), you'll have to actually take each guessed password together with its salt and run them through the hashing algorithm. bcrypt is explicitly designed to make that process take time, so you'll spend far longer trying to crack a password than you would scanning through a list of 200 million passwords and searching for the known hash in a database.

It'll also make sure that two users with the same password won't receive the same password hash, so you can't quickly determine weak passwords by looking for password hashes that repeat among multiple users (or try to determine if two users are the same individual because they have the same password).

So what do you do when computers get even faster? You increase the 12 parameter - the rounds - which increases the runtime of the hashing algorithm, hopefully keeping it safe for even longer (you can experiment with the rounds parameter in passlib).
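The rounds value in a bcrypt string is a base-2 logarithm, so each increment doubles the amount of work. A tiny sketch of that arithmetic (`bcrypt_iterations` is just a helper name for this example):

```python
def bcrypt_iterations(cost: int) -> int:
    """Number of iterations implied by a bcrypt cost factor."""
    return 2 ** cost


for cost in (10, 12, 14):
    print(cost, bcrypt_iterations(cost))
# 10 1024
# 12 4096
# 14 16384
```

So going from 12 to 14 makes every hash, and every guess an attacker has to try, roughly four times as expensive.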

MatsLindh
  • Hi there, great explanation. Thank you!!! One last question that still bothers me... Assuming that passlib generates a salt for me, how is it possible that I can run the same thing again from another PC without specifying a salt (so a new one will be generated) and it will still be able to know if the plain text is the same value as the hashed one? – supersick Feb 25 '22 at 05:54
  • I touched on that in the last paragraph; since you know all the necessary parts when verifying a password (the password, the salt and the hash), you can supply all the necessary parts. When verifying you use the existing salt and do not generate a new one; you use the one stored in the string returned from `hash` (for bcrypt, the first 22 characters). You extract the salt from the string, then give that as the `salt` parameter (don't do it manually except for when playing around with this to learn - otherwise use `pwd_context.verify`, which will extract the salt and do the comparison The Right Way) – MatsLindh Feb 25 '22 at 10:52
  • @MatsLindh thanks for taking the time to write this detailed explanation, however I find parts of the answer a bit confusing. You said, "The salt is the first 22 characters of the actual bcrypt value." and then later you said "The first 22 characters of this string is the hash.", did u mean to say `salt` instead of `hash` in the second sentence? In the password hash examples you gave, for eg '$2b$12$aaaaaaaaaaaaaaaaaaaaaOm/4kNFO.mb908CDiMw1TgDxyZeDSwum', none of the hashes have a salt length of 22, in above example 'aaaaaaaaaaaaaaaaaaaaa' has a length of 21. Are these typos(same for all egs)? – lordvcs May 22 '23 at 18:43
  • @lordvcs The length difference is related to the part that mentions the passlib warning about padding bits; this occurs if the last character in the salt isn't one of `[.Oeu]`. I'll add a bit more detail about that. And yes, the second sentence about 22 characters should reference the salt, not the hash. The answer has now been updated to address all your concerns :-) – MatsLindh May 22 '23 at 18:50