2

A text s has been encrypted with:

s2 = iv + Crypto.Cipher.AES.new(Crypto.Hash.SHA256.new(pwd).digest(), 
                                    Crypto.Cipher.AES.MODE_CFB, 
                                    iv).encrypt(s.encode())

Then, later, a user inputs the password pwd2 and we decrypt it with:

iv, cipher = s2[:Crypto.Cipher.AES.block_size], s2[Crypto.Cipher.AES.block_size:]

s3 = Crypto.Cipher.AES.new(Crypto.Hash.SHA256.new(pwd2).digest(),
                           Crypto.Cipher.AES.MODE_CFB, 
                           iv).decrypt(cipher)

Problem: the last line works even if the entered password pwd2 is wrong. Of course the decrypted text will be random characters, but no error is raised.

Question: how to make Crypto.Cipher.AES.new(...).decrypt(cipher) fail if the password pwd2 is incorrect? Or at least, how to detect a wrong password?


Here is a linked question: Making AES decryption fail if invalid password. And here is a discussion about the cryptographic (less programming-oriented) part of the question: AES, is this method to say “The password you entered is wrong” secure?

Basj
  • 1
    [cross posted to crypto.se](https://crypto.stackexchange.com/q/76155/29554)... – Ella Rose Dec 04 '19 at 16:06
  • 1
    @EllaRose The 2 questions are slightly different: first a programming problem, thus this general question on SO. Then a possible solution was found somewhere else, involving testing if decrypt(encrypt(header)) == header. Then I wanted to check if this technique is considered as good or bad in cryptography (so it was more a language-agnostic question, and definitely not a Python programming question ; I still included a Python code for completeness, but the question was not about code, I could have posted in pseudo-code), the latter was the one on crypto.se. – Basj Dec 04 '19 at 17:09

4 Answers

5

AES provides confidentiality but not integrity out of the box - to get integrity too, you have a few options. The easiest and arguably least prone to "shooting yourself in the foot" is to just use AES-GCM - see this Python example or this one.

You could also use an HMAC, but this generally requires managing two distinct keys and has a few more moving parts. I would recommend the first option if it is available to you.
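A minimal sketch of the HMAC option, using only the standard library. The ciphertext is treated as opaque bytes (for instance the IV plus CFB output from the question). The names `seal` and `open_sealed`, and the use of a separate `mac_key`, are illustrative assumptions rather than part of any library.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # size of an HMAC-SHA256 tag in bytes


def seal(mac_key: bytes, iv_and_ciphertext: bytes) -> bytes:
    """Append an HMAC-SHA256 tag computed over the IV and ciphertext."""
    tag = hmac.new(mac_key, iv_and_ciphertext, hashlib.sha256).digest()
    return iv_and_ciphertext + tag


def open_sealed(mac_key: bytes, blob: bytes) -> bytes:
    """Return IV + ciphertext if the tag verifies, otherwise raise ValueError."""
    if len(blob) < TAG_LEN:
        raise ValueError("blob too short")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong password or altered data")
    return body


# Hypothetical keys: in practice both would come from the password via a KDF
mac_key = os.urandom(32)
wrong_key = os.urandom(32)
blob = seal(mac_key, os.urandom(16) + b"opaque ciphertext bytes")
```

Calling `open_sealed(wrong_key, blob)` raises `ValueError` before any decryption is attempted, which is the failure the question asks for. The extra key to manage is the "moving part" mentioned above.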

As a side note, SHA-256 isn't a very good KDF for converting a user-created password into an encryption key. Popular password hashing algorithms are better at this: have a look at Argon2, bcrypt or PBKDF2.

Edit: The reason SHA-256 is a bad KDF is the same reason it makes a bad password hash function - it's just too fast. A user created password of, say, 128 bits will usually contain far less entropy than a random sequence of 128 bits - people like to pick words, meaningful sequences etc. Hashing this once with SHA-256 doesn't really alleviate this issue. But hashing it with a construct like Argon2 that is designed to be slow makes a brute-force attack far less viable.
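To make the contrast concrete, here is a small sketch using only `hashlib`. PBKDF2 is used because it ships with the standard library; the iteration count of 200,000 and the function names `fast_key` and `slow_key` are assumptions chosen for illustration, not recommendations.

```python
import hashlib
import os


def fast_key(password: bytes) -> bytes:
    """What the question does: one unsalted SHA-256, cheap to brute-force."""
    return hashlib.sha256(password).digest()


def slow_key(password: bytes, salt: bytes, iterations: int = 200_000) -> bytes:
    """Salted, iterated derivation with PBKDF2-HMAC-SHA256 (32-byte key)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


salt = os.urandom(16)  # stored next to the ciphertext; it is not secret
key = slow_key(b"correct horse", salt)
```

Every password guess now costs the attacker the full iteration count, and the random salt means precomputed tables for one message do not help with another.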

Basj
Luke Joshua Park
  • Thank you for your answer. PS: why is SHA256 a bad key derivation function? – Basj Dec 02 '19 at 20:02
  • PS: in the example you linked, how would you test the integrity? Would you use `nonce`, `tag`, `salt`? Can you just give an example of integrity test in this example (in code or pseudo code)? – Basj Dec 02 '19 at 20:16
  • In the example I linked, the integrity is more or less "baked in" - when you perform a decryption operation, only two things can happen: 1) The decryption succeeds because the key/password, ciphertext, nonce, tag, salt etc. etc. are all correct or 2) The decryption throws an exception because one of those things was not correct. The scenario you describe in your question is not possible in the linked code. – Luke Joshua Park Dec 02 '19 at 20:36
  • Ok thank you! Just to make it complete, what kind of `Exception` is thrown in this case? Is there an `IncorrectKeyException` in this module? – Basj Dec 02 '19 at 20:38
  • I believe the error type will be either `ValueError` or `KeyError` depending on what information is passed incorrectly - see the [relevant documentation here](https://pycryptodome.readthedocs.io/en/latest/src/cipher/modern.html#gcm-mode). – Luke Joshua Park Dec 02 '19 at 20:41
  • You could also perform a SHA-256 hash over the calculated key from PBKDF, and include that value together with the ciphertext. That way you don't have to decrypt the entire ciphertext to verify that the password was correct or faulty. That makes sense if the ciphertext is very large, for instance. Of course this value may *also* be changed if the integrity has to be protected against a possible adversary, so you can still not completely distinguish between changed ciphertext or wrong password. – Maarten Bodewes Dec 02 '19 at 22:23
  • Thanks. I'll probably use https://pycryptodome.readthedocs.io/en/latest/src/cipher/modern.html#gcm-mode then (there is code sample there). – Basj Dec 03 '19 at 12:43
  • @LukeJoshuaPark That's only a small part of the reason for SHA256 being a bad KDF. https://www.google.com/search?q=rainbow+tables is the other part. – Legorooj Jan 29 '20 at 10:25
  • @Legorooj It's arguably the main reason though - it's easy enough to add a salt to avoid rainbow tables - but a lot of people use *just* a salted SHA-256 hash - which is why I focused on the less obvious aspects. – Luke Joshua Park Jan 29 '20 at 19:46
  • @LukeJoshuaPark point taken, but they really ought to be using a KDF. – Legorooj Jan 29 '20 at 22:45
3

The best way is to use authenticated encryption, and a modern memory-hard entropy-stretching key derivation function such as scrypt to turn the password into a key. The cipher's nonce can be used as salt for the key derivation. With PyCryptodome that could be:

from Crypto.Random       import get_random_bytes
from Crypto.Cipher       import AES
from Crypto.Protocol.KDF import scrypt

# initialize an AES-128-GCM cipher from password (derived using scrypt) and nonce
def cipherAES(pwd, nonce):
    # note: the p parameter should allow use of several processors, but did not for me
    # note: changing 16 to 24 or 32 should select AES-192 or AES-256 (not tested)
    return AES.new(scrypt(pwd, nonce, 16, N=2**21, r=8, p=1), AES.MODE_GCM, nonce=nonce)

# encryption
nonce = get_random_bytes(16)
print("deriving key from password and nonce, then encrypting..")
ciphertext, tag = cipherAES(b'pwdHklot2',nonce).encrypt_and_digest(b'bonjour')
print("done")

# decryption of nonce, ciphertext, tag
print("deriving key from password and nonce, then decrypting..")
try:
    plaintext = cipherAES(b'pwdHklot2', nonce).decrypt_and_verify(ciphertext, tag)
    print("The message was: " + plaintext.decode())
except ValueError:
    print("Wrong password or altered nonce, ciphertext, tag")
print("done")

Note: Code is here to illustrate the principle. In particular, the scrypt parameters should not be fixed, but rather be included in a header before nonce, ciphertext, and tag; all of that must somehow be grouped for sending, and parsed for decryption.
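One possible layout for such a header, sketched with `struct`. The field order, the field sizes, and the names `pack_message` and `unpack_message` are assumptions made for this example; this is not a standard format.

```python
import struct

# Big-endian, no padding: log2(N) as 1 byte, r and p as 2 bytes each,
# then a 16-byte nonce and a 16-byte GCM tag
HEADER = struct.Struct(">BHH16s16s")


def pack_message(log2_n: int, r: int, p: int,
                 nonce: bytes, tag: bytes, ciphertext: bytes) -> bytes:
    """Group scrypt parameters, nonce, tag and ciphertext into one blob."""
    return HEADER.pack(log2_n, r, p, nonce, tag) + ciphertext


def unpack_message(blob: bytes):
    """Split a blob produced by pack_message back into its fields."""
    if len(blob) < HEADER.size:
        raise ValueError("message too short")
    log2_n, r, p, nonce, tag = HEADER.unpack_from(blob)
    return log2_n, r, p, nonce, tag, blob[HEADER.size:]
```

On decryption the receiver would recompute the key with `N = 2**log2_n`, `r` and `p` from the header. It is prudent to reject unreasonably large parameters before deriving, since the header arrives unauthenticated.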

Caveat: nothing in this post should be construed as an endorsement of PyCryptodome's security.


Addition (per request):

We need scrypt or some other form of entropy stretching only because we use a password. We could use a random 128-bit key directly.

PBKDF2-HMAC-SHAn with 100000 iterations (as in the OP's second code fragment there) is only barely passable to resist Hashcat with a few GPUs. It would be almost negligible compared to other hurdles for an ASIC-assisted attack: a state of the art Bitcoin mining ASIC does more than 2×10^10 SHA-256 per Joule, and 1 kWh of electricity, costing less than $0.15, is 36×10^5 J. Crunching these numbers, testing the (62^(8+1)-1)/(62-1) = 221919451578091 passwords of up to 8 characters restricted to letters and digits costs less than $47 for the energy dedicated to the hashing part.
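The estimate can be reproduced with a few lines of arithmetic. This sketch assumes one SHA-256 evaluation per PBKDF2 iteration, which is a simplification made here to match the quoted figure; the constant names are illustrative.

```python
ALPHABET = 62             # letters and digits
MAX_LEN = 8               # passwords of up to 8 characters
ITERATIONS = 100_000      # PBKDF2 iteration count quoted above
HASHES_PER_JOULE = 2e10   # ASIC efficiency figure quoted above
JOULES_PER_KWH = 36e5
USD_PER_KWH = 0.15

# Geometric series 62^0 + 62^1 + ... + 62^8
passwords = (ALPHABET ** (MAX_LEN + 1) - 1) // (ALPHABET - 1)

# Assumption: each iteration costs a single SHA-256 evaluation
total_hashes = passwords * ITERATIONS
energy_kwh = total_hashes / HASHES_PER_JOULE / JOULES_PER_KWH
cost_usd = energy_kwh * USD_PER_KWH

print(f"{passwords} passwords, about {energy_kwh:.0f} kWh, about ${cost_usd:.2f}")
```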

scrypt is much more secure for equal time spent by legitimate users because it requires a lot of memory and accesses thereof, slowing down the attacker, and most importantly making the investment cost for massively parallel attack skyrocket.
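As a rough illustration of the memory requirement: one scrypt evaluation needs about 128 × N × r bytes of working memory. The helper name below is hypothetical, and the figure ignores smaller per-call overheads.

```python
def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate working memory of one scrypt evaluation: 128 * N * r bytes."""
    return 128 * n * r


# Parameters from the PyCryptodome example above: N=2**21, r=8
print(scrypt_memory_bytes(2**21, 8) / 2**30, "GiB")

# A lighter setting, N=2**16, r=8
print(scrypt_memory_bytes(2**16, 8) / 2**20, "MiB")
```

An attacker testing many passwords in parallel needs that much memory per concurrent guess, which is what drives up the hardware cost compared with a plain iterated hash.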

fgrieu
2

Doesn't use the Crypto package, but this should suit your needs:

import base64
import os

from cryptography.fernet import Fernet
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt


def derive_password(password: bytes, salt: bytes):
    """
    Adjust the N parameter depending on how long you want the derivation to take.
    The scrypt paper suggests a minimum value of n=2**14 for interactive logins (t < 100ms),
    or n=2**20 for more sensitive files (t < 5s).
    """
    kdf = Scrypt(salt=salt, length=32, n=2**16, r=8, p=1, backend=default_backend())
    key = kdf.derive(password)
    return base64.urlsafe_b64encode(key)


salt = os.urandom(16)
password = b'legorooj'
bad_password = b'legorooj2'

# Derive a key from each password
key = derive_password(password, salt)
key2 = derive_password(bad_password, salt)  # Shouldn't re-use salt but this is only for example purposes

# Create the Fernet Object
f = Fernet(key)

msg = b'This is a test message'

ciphertext = f.encrypt(msg)

print(msg, flush=True)  # Flushing pushes output straight to stdout, ahead of the traceback raised below
print(ciphertext, flush=True)

# A Fernet object can be reused; it is re-created here only to mimic a separate decryption step
f = Fernet(key)

plaintext = f.decrypt(ciphertext)

print(plaintext, flush=True)

# Bad Key
f = Fernet(key2)
f.decrypt(ciphertext)
"""
This raises cryptography.fernet.InvalidToken (triggered internally by an InvalidSignature),
which means the key was wrong or the token was altered.
"""

See my comment for links to the documentation.

Legorooj
1

For future reference, here is a working solution following the AES GCM mode (recommended by @LukeJoshuaPark in his answer):

from Crypto.Cipher import AES
from Crypto.Random import get_random_bytes

# Encryption
data = b"secret"
key = get_random_bytes(16)
cipher = AES.new(key, AES.MODE_GCM)
ciphertext, tag = cipher.encrypt_and_digest(data)
nonce = cipher.nonce

# Decryption
key2 = get_random_bytes(16)  # wrong key
#key2 = key  # correct key
try:
    cipher = AES.new(key2, AES.MODE_GCM, nonce=nonce)
    plaintext = cipher.decrypt_and_verify(ciphertext, tag)
    print("The message was: " + plaintext.decode())
except ValueError:
    print("Wrong key")

It does indeed fail with an exception when the key is wrong, as desired.


The following code uses a real password derivation function:

import Crypto.Random, Crypto.Protocol.KDF, Crypto.Cipher.AES

def cipherAES(pwd, nonce):
    return Crypto.Cipher.AES.new(Crypto.Protocol.KDF.PBKDF2(pwd, nonce, count=100000), Crypto.Cipher.AES.MODE_GCM, nonce=nonce)

# encryption
nonce = Crypto.Random.new().read(16)
cipher = cipherAES(b'pwd1', nonce)
ciphertext, tag = cipher.encrypt_and_digest(b'bonjour')

# decryption
try:
    cipher = cipherAES(b'pwd1', nonce=nonce)
    plaintext = cipher.decrypt_and_verify(ciphertext, tag)
    print("The message was: " + plaintext.decode())
except ValueError:
    print("Wrong password")

@fgrieu's answer is probably better because it uses scrypt as KDF.

Basj
  • I made an [answer](https://stackoverflow.com/a/59160861/903600) with working code derived from yours. – fgrieu Dec 03 '19 at 18:23
  • @fgrieu Thank you very much. I edited mine to remove my previous (now pointless) error, and gave credit to your answer, which is probably better because of `scrypt`. – Basj Dec 03 '19 at 18:55
  • 1
    Wow, very interesting @fgrieu! Could you include this in your answer (including cost calculations, it's interesting for future reference)? – Basj Dec 04 '19 at 12:56