
I want to encrypt a large file (let's say 64 GB) in the most efficient way in .NET.

How I would implement this:

  1. Create an instance of AesManaged to encrypt the stream of the file (read 64 GB)
  2. Save this stream to disk (because it is too big to hold in memory) (write 64 GB)
  3. Create an instance of HMACSHA512 to compute hash of the saved file (read 64 GB)
  4. Save encrypted data with iv to disk (read & write 64 GB)

Simplified C# Code:

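using (var aesManaged = new AesManaged())
{
    // Simplified: use the key and IV that AesManaged generates.
    var iv = aesManaged.IV;
    var encryptor = aesManaged.CreateEncryptor();

    using (var input = File.OpenRead(@"C:\Temp\bigfile.bin"))
    using (var msEncrypt = File.OpenWrite(@"C:\Temp\bigfile.bin.tmp"))
    using (var csEncrypt = new CryptoStream(msEncrypt, encryptor, CryptoStreamMode.Write))
    {
        input.CopyTo(csEncrypt);
        new MemoryStream(iv).CopyTo(csEncrypt);
    }
}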

using (var hmac = new HMACSHA512(hmacKey))
using (var encrypted = File.OpenRead(@"C:\Temp\bigfile.bin.tmp"))
{
    // Second full read of the data: hash the encrypted temp file.
    hmacHash = hmac.ComputeHash(encrypted);
}

byte[] headerBytes;
using (var memoryStream = new MemoryStream())
{
    var header = new Header
    {
        IV = iv,
        HmacHash = hmacHash
    };
    Serializer.Serialize(memoryStream, header);
    headerBytes = memoryStream.ToArray();
}

using (var newfile = File.OpenWrite(@"C:\Temp\bigfile.bin.enc"))
using (var tmpfile = File.OpenRead(@"C:\Temp\bigfile.bin.tmp"))
{
    // Layout: magic bytes, header length (4 bytes), header, ciphertext.
    newfile.Write(MagicBytes, 0, MagicBytes.Length);
    newfile.Write(BitConverter.GetBytes(headerBytes.Length), 0, sizeof(int));
    newfile.Write(headerBytes, 0, headerBytes.Length);
    tmpfile.CopyTo(newfile);
}

This implementation has the disadvantage that it creates a second file and reads the 64 GB from disk multiple times.

Is this necessary? How can I minimize disk I/O and RAM allocation?

hdev
  • No, it is not necessary when done right. Also, step 4 is strange, because the IV should be created before trying to encrypt. It's common to write the IV before the ciphertext to the file. – Artjom B. Jul 28 '16 at 05:21

1 Answer


I always get CryptoStreams wrong, so please excuse my pseudocode. The basic idea is to "chain" streams: the plaintext is copied to a CryptoStream that does the encryption, which in turn writes to a CryptoStream that does the MACing, which then writes to a plain old file stream:

// mac: a keyed hash such as HMACSHA512 (HashAlgorithm implements ICryptoTransform)
// encryptor: the ICryptoTransform returned by CreateEncryptor() on your AES instance
using(var encryptedFileStream = File.OpenWrite("..."))
using(var macCryptoStream = new CryptoStream(encryptedFileStream, mac, CryptoStreamMode.Write))
using(var encryptCryptoStream = new CryptoStream(macCryptoStream, encryptor, CryptoStreamMode.Write))
using(var inputFileStream = File.OpenRead("..."))
    inputFileStream.CopyTo(encryptCryptoStream);

This way, you only need a single pass through your 64 GB.

Now, you'll have to somehow store the IV and MAC in the beginning of your encrypted file, so first "resize" it:

using(var encryptedFileStream = File.OpenWrite("..."))   
{
    var offset = YourMagicHeaderLength + IvLength + MacLength;
    encryptedFileStream.SetLength(offset);
    encryptedFileStream.Position = offset;

    // The rest of the code goes here
}

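and then, after encrypting and computing the MAC, rewind to the very beginning and write them out. The MAC is only final once the crypto streams have flushed their final blocks, and the file stream must still be open at that point, so flush explicitly (FlushFinalBlock) before the streams are disposed.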
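Putting the two fragments above together, a minimal sketch could look like the following. The class name `LargeFileEncryptor`, the method signature, and the header layout (16-byte IV, then 64-byte HMAC-SHA512, then ciphertext, with no magic bytes) are assumptions made for illustration and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; the file layout is [IV][HMAC-SHA512][ciphertext].
public static class LargeFileEncryptor
{
    private const int IvLength = 16;   // AES block size in bytes
    private const int MacLength = 64;  // HMAC-SHA512 output size in bytes

    public static void Encrypt(string inputPath, string outputPath, byte[] aesKey, byte[] hmacKey)
    {
        using (var aes = Aes.Create())
        using (var hmac = new HMACSHA512(hmacKey))
        {
            aes.Key = aesKey;
            aes.GenerateIV(); // fresh random IV for every file

            using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
            using (var encryptor = aes.CreateEncryptor())
            using (var macStream = new CryptoStream(output, hmac, CryptoStreamMode.Write))
            using (var encryptStream = new CryptoStream(macStream, encryptor, CryptoStreamMode.Write))
            using (var input = File.OpenRead(inputPath))
            {
                // Reserve room for the header; it is filled in once the MAC is known.
                output.SetLength(IvLength + MacLength);
                output.Position = IvLength + MacLength;

                // Single pass: plaintext -> AES -> HMAC (pass-through) -> file.
                input.CopyTo(encryptStream);

                // Finalize padding and the hash while the file is still open.
                // Flushing the outer stream normally flushes the inner CryptoStream too,
                // hence the guard against flushing twice.
                encryptStream.FlushFinalBlock();
                if (!macStream.HasFlushedFinalBlock)
                    macStream.FlushFinalBlock();

                // Rewind and write the header over the reserved bytes.
                output.Position = 0;
                output.Write(aes.IV, 0, IvLength);
                output.Write(hmac.Hash, 0, MacLength);
            }
        }
    }
}
```

In this sketch the MAC covers only the ciphertext, as in the chained-stream snippet. The IV in the header is not authenticated here, so a real format would probably want to feed it into the HMAC as well.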

Anton Gogolev
  • What if I want to add some data to the hmac but not to the resulting filestream? – hdev Sep 22 '16 at 11:28
  • Thanks for helping me figure this out. There's a mistake with HMAC here I think, you should be passing the unencrypted data to the HMAC first. In your code you create an HMAC from the encrypted file stream, which doesn't make sense. – jamie yello Nov 17 '22 at 23:01
  • @jamieyello it's Encrypt-then-MAC, so the code is correct. – Léster Jan 18 '23 at 12:53
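On the question of MACing extra data without writing it to the file: one possible approach, sketched here under assumptions, is to call `TransformBlock` on the HMAC directly before wrapping it in a CryptoStream. The helper name `MacWithExtraData.WriteAndMac` and its parameters are hypothetical. The extra bytes could, for example, be the IV or other header fields.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: MACs (extraData || body) but writes only body to destination.
public static class MacWithExtraData
{
    public static byte[] WriteAndMac(Stream destination, byte[] hmacKey, byte[] extraData, Stream body)
    {
        using (var hmac = new HMACSHA512(hmacKey))
        {
            // Hash the extra bytes only; a null output buffer means nothing is copied anywhere.
            hmac.TransformBlock(extraData, 0, extraData.Length, null, 0);

            // Everything written through this stream is hashed and passed on to destination.
            // The CryptoStream is deliberately not disposed so that destination stays open
            // for the caller.
            var macStream = new CryptoStream(destination, hmac, CryptoStreamMode.Write);
            body.CopyTo(macStream);
            macStream.FlushFinalBlock();

            return hmac.Hash;
        }
    }
}
```

In the chained setup from the answer, `body` would be replaced by the encrypting CryptoStream writing into `macStream`. Whoever verifies the file has to feed the same extra bytes into the HMAC in the same order.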