
The application I am working on lets the user encrypt files. The files could be of any format (spreadsheet, document, presentation, etc.).

For the specified input file, I create two output files - an encrypted data file and a key file. You need both these files to obtain your original data. The key file must work only on the corresponding data file. It should not work on any other file, either from the same user or from any other user.

The AES algorithm requires two parameters for encryption: a key and an initialization vector (IV).

I see three choices for creating the key file:

  1. Embed hard-coded IV within the application and save the key in the key file.
  2. Embed hard-coded key within the application and save the IV in the key file.
  3. Save both the key and the IV in the key file.

Note that it is the same application that is used by different customers.

It appears all three choices would achieve the same end goal. However, I would like to get your feedback on what the right approach should be.

sashoalm
Peter

5 Answers

112

As you can see from the other answers, having a unique IV per encrypted file is crucial, but why is that?

First - let's review why a unique IV per encrypted file is important. (Wikipedia on IV). The IV adds randomness to the start of your encryption process. When using a chained block encryption mode (where each block of encrypted data incorporates the prior block of encrypted data), we're left with a problem regarding the first block, which has no prior block. That is where the IV comes in.

If you had no IV, and used chained block encryption with just your key, two files that begin with identical text would produce identical first blocks. If the input files diverged midway through, the two encrypted files would begin to look different from that point through to the end of the encrypted file. If someone noticed the similarity at the beginning, and knew what one of the files began with, he could deduce what the other file began with. Knowing what the plaintext file began with and what its corresponding ciphertext is could allow that person to determine the key and then decrypt the entire file.

Now add the IV - if each file used a random IV, their first block would be different. The above scenario has been thwarted.

Now what if the IV were the same for each file? Well, we have the problem scenario again. The first block of each file will encrypt to the same result. Practically, this is no different from not using the IV at all.
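The effect is easy to reproduce. The following is a minimal sketch, not the asker's code: it uses only the Python standard library, and because the standard library ships no AES, `toy_block_encrypt` is a hypothetical 16-byte Feistel permutation standing in for the AES block operation. It is not secure and exists only so the CBC chaining can be run. The file contents and all names are made up for illustration.

```python
import hashlib
import hmac
import os

BLOCK = 16  # bytes; the same block size AES uses


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation on one block. Stand-in for AES, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """CBC chaining: XOR each plaintext block with the previous ciphertext block."""
    if len(iv) != BLOCK or len(plaintext) % BLOCK != 0:
        raise ValueError("this sketch does no padding; use whole 16-byte blocks")
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(plaintext[i:i + BLOCK], prev))
        prev = toy_block_encrypt(key, mixed)
        out += prev
    return out


key = os.urandom(32)
# Two files that share their first 16 bytes and differ afterwards.
file_a = b"Dear customer,  " + b"invoice #0001..."
file_b = b"Dear customer,  " + b"invoice #0002..."

# Same key, same (hard-coded) IV: the first ciphertext blocks are identical.
fixed_iv = bytes(BLOCK)
a_fixed = cbc_encrypt(key, fixed_iv, file_a)
b_fixed = cbc_encrypt(key, fixed_iv, file_b)
same_first_block_fixed = a_fixed[:BLOCK] == b_fixed[:BLOCK]

# Same key, fresh random IV per file: the first blocks no longer match
# (barring a 1-in-2^128 IV collision).
a_rand = cbc_encrypt(key, os.urandom(BLOCK), file_a)
b_rand = cbc_encrypt(key, os.urandom(BLOCK), file_b)
same_first_block_random = a_rand[:BLOCK] == b_rand[:BLOCK]

print("fixed IV, first blocks equal: ", same_first_block_fixed)
print("random IV, first blocks equal:", same_first_block_random)
```

With the fixed IV, an observer who sees both ciphertexts learns that the files start the same way without knowing the key; the ciphertexts only diverge from the block where the plaintexts diverge.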

So now let's get to your proposed options:

Option 1. Embed hard-coded IV within the application and save the key in the key file.

Option 2. Embed hard-coded key within the application and save the IV in the key file.

These options are pretty much identical. If two files that begin with the same text produce encrypted files that begin with identical ciphertext, you're hosed. That would happen in both of these options. (Assuming there's one master key used to encrypt all files).

Option 3. Save both the key and the IV in the key file.

If you use a random IV for each key file, you're good. No two key files will be identical, and each encrypted file must have its key file. A different key file will not work.

PS: Once you go with option 3 and random IVs, start looking into how you'll determine whether decryption was successful. Take a key file from one file, and try using it to decrypt a different encrypted file. You may discover that decryption proceeds and produces garbage results. If this happens, begin research into authenticated encryption.
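One common way to get that check is encrypt-then-MAC: compute a keyed tag over the IV and ciphertext, and refuse to decrypt if the tag does not verify. The sketch below uses only the Python standard library and treats the ciphertext as opaque bytes (random bytes stand in for real AES output). It assumes the key file carries a separate MAC key next to the encryption key; `seal` and `open_sealed` are hypothetical names. Authenticated modes such as AES-GCM build the same kind of check into the cipher mode itself.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # HMAC-SHA256 produces a 32-byte tag


def seal(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Return iv || ciphertext || tag, where the tag covers both iv and ciphertext."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def open_sealed(mac_key: bytes, blob: bytes, iv_len: int = 16):
    """Verify the tag first; only then hand back (iv, ciphertext) for decryption."""
    if len(blob) < iv_len + TAG_LEN:
        raise ValueError("data file is too short")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("key file does not match this data file")
    return body[:iv_len], body[iv_len:]


# Stand-ins for one file's key material and AES output (assumptions for the demo).
mac_key = os.urandom(32)
iv, ciphertext = os.urandom(16), os.urandom(48)
blob = seal(mac_key, iv, ciphertext)

recovered = open_sealed(mac_key, blob)  # the matching key file is accepted

other_mac_key = os.urandom(32)  # MAC key taken from a different key file
try:
    open_sealed(other_mac_key, blob)
    rejected = False
except ValueError:
    rejected = True
print("wrong key file rejected:", rejected)
```

Because the mismatch is detected before any decryption happens, the application never has to guess whether the output is garbage.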

Community
Tails
  • Thank you for your help. One question. Is IV not needed for decryption? In this case, for each input file that requires encryption, I can generate both the key and the IV randomly but I won't have to save the IV. If IV is indeed required for decryption, I will save IV+KEY in the key file. This key file would be required to decrypt the encrypted file. – Peter Feb 09 '12 at 17:57
  • The IV is required for decryption. – Tails Feb 14 '12 at 04:23
  • However, (at least in CBC mode) a wrong IV will only corrupt the first block; you can still decrypt the remaining file content. – MV. Jul 07 '13 at 00:57
  • I see comments similar to the above in a couple of places here ("a wrong IV will only corrupt the first block, you can still decrypt the remaining file content"). This is not true. Since the encrypted first block is the IV for the second block (and so on), an unknown IV means no blocks can be decrypted. The CBC diagram on Wikipedia makes this pretty clear: [link](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Cipher_Block_Chaining_.28CBC.29) – Rich Feb 12 '16 at 18:47
  • @Rich - I know my comment is 4 years late, but... I tried using a corrupted IV to decrypt using the .NET AES libraries. Only the first block was corrupted. This is because the encrypted block is the IV of the next block in CBC... And when decrypting any block other than the first, you always have the encrypted prior block. – Les Aug 12 '16 at 23:03
  • @Les - Maybe 4 years late, but you're absolutely right. My above comment is completely wrong for CBC. No idea what I was thinking. Thanks. – Rich Aug 14 '16 at 21:09
  • "Knowing what the plaintext file began with and what it's corresponding ciphertext is could allow that person to determine the key and then decrypt the entire file." According to [this](https://crypto.stackexchange.com/questions/3952/is-it-possible-to-obtain-aes-128-key-from-a-known-ciphertext-plaintext-pair) question you would be very famous and rich if you could do that, so you are either indeed very rich and famous or one of us is missing something. (In the original question he was talking about AES, which means the key is at least 128 bits long.) – Trigary Jul 13 '18 at 07:35
  • https://www.youtube.com/watch?v=0D7OwYp6ZEc great explanation here too – BEWARB Sep 16 '21 at 07:41
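The point settled in the comments above follows directly from the CBC decryption equations. Writing D_K for block decryption under key K, C_i for the ciphertext blocks and P_i for the plaintext blocks (notation introduced here for illustration):

```latex
P_1 = D_K(C_1) \oplus IV, \qquad P_i = D_K(C_i) \oplus C_{i-1} \quad (i \ge 2)
```

Only P_1 depends on the IV. Every later block depends on the key and on the previous ciphertext block, which anyone holding the encrypted file already has, so a wrong or missing IV garbles the first block and nothing else.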
42

The important thing about an IV is you must never use the same IV for two messages. Everything else is secondary - if you can ensure uniqueness, randomness is less important (but still a very good thing to have!). The IV does not need to be (and indeed, in CBC mode cannot be) secret.

As such, you should not save the IV alongside the key - that would imply you use the same IV for every message, which defeats the point of having an IV. Typically you would simply prepend the IV to the encrypted file, in the clear.
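A minimal sketch of that file layout, using only the Python standard library: the first 16 bytes of the data file are the IV in the clear, and everything after them is ciphertext. The function names are hypothetical, the ciphertext is treated as opaque bytes produced by whatever AES library the application uses, and the 16-byte length assumes AES in a mode such as CBC.

```python
import io
import secrets

IV_LEN = 16  # AES block size in bytes


def new_iv() -> bytes:
    """Fresh random IV from the OS CSPRNG; call once per encrypted file."""
    return secrets.token_bytes(IV_LEN)


def write_encrypted(stream, iv: bytes, ciphertext: bytes) -> None:
    """Store the IV in the clear at the front of the data file."""
    if len(iv) != IV_LEN:
        raise ValueError("IV must be exactly 16 bytes")
    stream.write(iv)
    stream.write(ciphertext)


def read_encrypted(stream):
    """Split a data file back into (iv, ciphertext) before decrypting."""
    iv = stream.read(IV_LEN)
    if len(iv) != IV_LEN:
        raise ValueError("file too short to contain an IV")
    return iv, stream.read()


# Round trip through an in-memory file; random bytes stand in for AES output.
iv = new_iv()
ciphertext = secrets.token_bytes(64)
data_file = io.BytesIO()
write_encrypted(data_file, iv, ciphertext)

data_file.seek(0)
iv_back, ciphertext_back = read_encrypted(data_file)
print(iv_back == iv and ciphertext_back == ciphertext)
```

With this layout the key file only needs to hold the key, and each data file carries its own IV.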

If you are going to be rolling your own cipher modes like this, please read the relevant standards. NIST has a good document on cipher modes here: http://dx.doi.org/10.6028/NIST.SP.800-38A (IV generation is documented in Appendix C). Cryptography is a subtle art. Do not be tempted to create variations on the normal cipher modes; 99% of the time you will create something that looks more secure, but is actually less secure.

Qrilka
bdonlan
  • Hello all. I have read all the replies. For each input file, I can generate both, the key and the IV, randomly. By making the key and the IV different for each file, a hacker will have to try more combinations. From a high level perspective, it appears to me that IV is just another key. Is this right? – Peter Jan 29 '12 at 03:13
  • @PhilBolduc: You'd still have to prepend the salt to the encrypted file, and then you might as well have just prepended a random IV. – President James K. Polk Jan 29 '12 at 03:16
  • @Peter, that is not what an IV is for. In particular, if the IV is unknown, but the key is known, in CBC mode the hacker will be unable to recover _the first block of the plaintext_. They will, however, be able to recover the rest of the plaintext. The only purpose of the IV is to perturb the file so that repeated encryptions do not produce the same output (thus, the attacker can't tell that two files have the same contents by seeing that the ciphertext is the same). – bdonlan Jan 29 '12 at 03:24
  • Edit: I deleted my previous comments. I agree; reading http://cwe.mitre.org/data/definitions/329.html indicates you should use a random IV and not reuse it. Basing it off of the password, salt, etc. would violate that. – Phil Bolduc Jan 29 '12 at 03:40
  • It would make sense to use a static IV if you only use it to encrypt randomized data (session keys or other derived keys). Otherwise you should use a randomized IV, and if you've got the space for the additional bytes for each encrypted message, you might as well use one all the time. – Maarten Bodewes Jan 29 '12 at 14:21
  • @owlstead, if you use a fixed IV it is critical to ensure that the first plaintext block of the message is always unique. It's not enough that the message as a whole is unique. Additionally, if your message is the size of a single plaintext block (e.g., derived keys) and unique, you can simply use ECB mode. – bdonlan Jan 29 '12 at 18:47
  • @bdonlan: of course the first block has to be unique, but that's the case with randomized data. If the message is the size of a single plaintext block you can use ECB mode, but only if you don't reuse the key - devil is in the details :) – Maarten Bodewes Jan 29 '12 at 22:19
  • The IV has a different purpose depending on the mode of operation used. In CTR, it has to be unique in order to prevent a [many-time pad](http://crypto.stackexchange.com/q/6020/13022). In CBC, it has to be [unpredictable](http://crypto.stackexchange.com/q/6702/13022), not just unique. A message counter is unique and would be OK for CTR mode, but would be bad for CBC mode. – Artjom B. Aug 13 '16 at 09:11
15

When you use an IV, the most important thing is that the IV is unique, so in practice you should use a random IV. This means embedding it in your application is not an option. I would save the IV in the data file, as it does not harm security as long as the IV is random/unique.

gpeche
  • Ultimately, the idea is to ensure that a hacker cannot break open the encrypted file. The size of the IV seems to be less than the size of the key. If the key is fixed and the IV is varied, as you suggested, a hacker will have fewer combinations to try to break open the file. Is there something I am missing? – Peter Jan 29 '12 at 03:10
  • The IV isn't to 'ensure that a hacker cannot break open the encrypted file'. It's to ensure that, if you encrypt the same file twice, it'll produce different encrypted output. – bdonlan Jan 29 '12 at 03:23
  • @bdonlan That little message finally made the coin drop for me. I was struggling to understand how the IV is important compared to message length, but I see it is not really; it is instead important compared to message content. Thanks! – DusteD Jan 11 '14 at 11:25
5

Key/IV pairs are likely the most confused concept in the world of encryption. Simply put, password = key + iv, meaning you need the matching key and IV to decrypt an encrypted message. The internet seems to imply you only need the IV to encrypt and can then toss it away, but it's also required to decrypt. The reason for splitting the key/IV values is to make it possible to encrypt the same message with the same key but use a different IV to get unequal encrypted messages. So, Encrypt("message", key, iv) != Encrypt("message", key, differentIv).

The idea is to use a new random IV value every time a message is encrypted. But how do you manage an ever-changing IV value? There are a million possibilities, but the most logical way is to embed the 16-byte IV within the encrypted message itself. So, encrypted = Iv + encryptedMessage. This way the constantly changing IV value can be pulled out and removed from the encrypted message, which is then decrypted. So decryptedMessage = Decrypt("messageWithoutIv", key, IvFromEncryptedMessage). Alternatively, if encrypted messages are stored in a database, the IV could be stored in a field there.

Although the IV is needed for decryption, it's tiny in comparison to the 32-byte key and is never reused, so it is practically safe to expose publicly. Keep in mind, the IV has nothing to do with the encryption itself; it has to do with masking the encryption of messages having the same content.

Russ Ebbing
1

An IV is used to increase security via randomness, but that does not mean it is used by all algorithms.

The tricky thing is how long the IV should be. Usually it is the same size as the block size, or cipher size. For example, AES would have 16 bytes for the IV. Besides, the IV type can also be selected, e.g. eseqiv, seqiv, chainiv ...

LinconFive