
I am facing some issues while reading and processing large files in my program. Files larger than 900 MB force my program to consume an excessive amount of memory for any task performed on them, which causes a crash on my computer. When distributing the program, ideally it should run as efficiently as possible regardless of the size of the files the user selects; it should be functional in all circumstances.

I have the fields in the following class assigned in order to establish the minimum and maximum limits of a MemoryStream's Length when reading files:

/// <summary> Initializes Handling Functions for the Memory Consumed by the Process of this Program. </summary>

internal class Memory_Manager
{
    /** <summary> Sets a Value which Contains Info about the Minimum Size of a MemoryStream. </summary>
    <returns> The Minimum MemoryStream Size. </returns> */

    private static readonly int minBlockSize = Convert.ToInt32(Constant_Values.oneKilobyte * 4); // Min Length: 4 KB

    /** <summary> Sets a Value which Contains Info about the Maximum Size of a MemoryStream. </summary>
    <returns> The Maximum MemoryStream Size. </returns> */

    private static readonly int maxBlockSize = Convert.ToInt32(Constant_Values.oneMegabyte * 250); // Max Length: 250 MB
} // Code fragment

Also, I have a method that checks whether those conditions apply to a given case:

/** <summary> Checks if a Buffer meets the Minimum and Maximum Size Limits. </summary>
<param name = "targetBuffers" > The Buffers to be Analyzed. </param> */

private static void CheckBufferSize(byte[] targetBuffers)
{
    byte[] validBuffers = targetBuffers;
    int bufferSize = validBuffers.Length;

    if(bufferSize < minBlockSize)
    {
        validBuffers = new byte[minBlockSize]; // Create new MemoryBuffers with the default size
    }
    else if(bufferSize > maxBlockSize)
    {
        Array.Resize(ref validBuffers, maxBlockSize); // Resize MemoryBuffers if Size exceeds the Limit
    }

    targetBuffers = validBuffers;
}

Here is where the method is implemented:

/** <summary> Creates a new MemoryStream with the Specified Bytes. </summary>
<param name = "memoryBuffers" > The Bytes from which the MemoryStream will be Created. </param>

<returns> The MemoryStream Created. </returns> */

public static MemoryStream CreateMemoryStream(byte[] memoryBuffers)
{
    CheckBufferSize(memoryBuffers);
    return new MemoryStream(memoryBuffers);
}

When working with files, I usually perform the operations on 'MemoryStreams' instead of 'FileStreams'. After I've finished an operation (either encoding the bytes or encrypting them), I copy the bytes from the memory block and write them to a separate file (all inside a 'using' block, of course, so the 'MemoryStream' is disposed when I'm done with it).

Here's a code snippet where I use a MemoryStream to store buffer data and then write it to a new file (AES_Cryptor - DecryptFile):

using(MemoryStream outputFileStream = Memory_Manager.CreateMemoryStream() )
{
    ICryptoTransform fileDecryptor = AES_Cipher.CreateDecryptor(derivedKeyBytes, IV);

    using(CryptoStream decryptedFileStream = new CryptoStream(outputFileStream, fileDecryptor, CryptoStreamMode.Write) )
    {
        decryptedFileStream.Write(inputFileBytes, Constant_Values.startingIndex, inputFileBytes.Length);
        decryptedFileStream.FlushFinalBlock();
    }

    byte[] decryptedFileBytes = Memory_Manager.DumpMemoryStream(outputFileStream); // Dump MemoryStream
    Archive_Manager.WriteFileBytes(outputPath, decryptedFileBytes); // Write Decrypted Bytes
} // Code fragment

What tips do you recommend to improve the performance of file reading/writing tasks? Also, which minimum and maximum sizes do you consider proper for storing bytes in MemoryStreams? I would be more than grateful for your responses!

What I've tried: writing a method that controls the size of the buffers stored in a MemoryStream (already described).

What I'm expecting: a safer and faster way to read info from a file, process it in MemoryStreams, and then write it to output files.

  • Consider using [`RecyclableMemoryStream`](https://github.com/microsoft/Microsoft.IO.RecyclableMemoryStream) from MSFT (nuget [here](https://www.nuget.org/packages/Microsoft.IO.RecyclableMemoryStream/)) as a replacement for MemoryStream. It's a *library to provide pooling for .NET `MemoryStream` objects to improve application performance, especially in the area of garbage collection.* Or have you actually profiled to determine whether you have a performance problem with `FileStream`? What are you actually trying to do with your streams? – dbc Jul 20 '23 at 23:52
  • 1
    See also [alternative to MemoryStream for large data volumes](https://stackoverflow.com/q/17921880). – dbc Jul 20 '23 at 23:52
  • 1
    Why do you want to use memory stream at all? – Guru Stron Jul 20 '23 at 23:59
  • Ok, I commited the changes to my question. Now you can see the code comfortably –  Jul 21 '23 at 00:09
  • 1
    In the code shown, there's no need to use a memory stream at all. Just write the `inputFileBytes` to a `FileStream` created at the `outputPath`. You're just making life harder for yourself. (By the way, indentation can make your code a lot easier to read. See [Common C# Coding Conventions: Layout conventions](https://learn.microsoft.com/en-us/dotnet/csharp/fundamentals/coding-style/coding-conventions#layout-conventions).) – dbc Jul 21 '23 at 00:18
  • But if you absolutely must use `MemoryStream` for some reason, `Microsoft.IO.RecyclableMemoryStream` may be a decent replacement. Be sure to avoid calling `GetBuffer()` and (especially) `ToArray()`. – dbc Jul 21 '23 at 00:23

1 Answer


MemoryStreams are slow because resizing reallocates memory, which takes a lot of system resources and is a strain on your RAM. You can do encryption just fine using FileStreams, so I don't get why you want to use MemoryStream at all.

Your code looks like an attempt to solve a problem which you wouldn't have if you just used FileStreams. I tested this on 2GB files and I had no additional memory usage:

using var input = File.OpenRead("path/to/big_file"); // Open the source file
using var output = File.OpenWrite("path/to/result"); // Open the target file
var aes = Aes.Create(); 
... // Set your Key and IV
using var crypt = new CryptoStream(output, aes.CreateDecryptor(), CryptoStreamMode.Write);
input.CopyTo(crypt);
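The encrypting direction uses the same streaming pattern. Here is a minimal sketch of that direction, assuming `key` and `iv` are already derived (both variable names and file paths are illustrative):

```csharp
using System.IO;
using System.Security.Cryptography;

// Sketch: streaming encryption, mirroring the decryption above.
using var input = File.OpenRead("path/to/plain_file");   // Open the source file
using var output = File.OpenWrite("path/to/encrypted");  // Open the target file
using var aes = Aes.Create();
aes.Key = key; // assumed to be derived elsewhere
aes.IV = iv;   // assumed to be derived elsewhere
using var crypt = new CryptoStream(output, aes.CreateEncryptor(), CryptoStreamMode.Write);
input.CopyTo(crypt); // copies in fixed-size chunks, so memory use stays flat
```

Because `CopyTo` moves data in fixed-size chunks rather than buffering the whole file, memory usage is independent of file size in both directions.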

Also,

private static readonly int minBlockSize = Convert.ToInt32(Constant_Values.oneKilobyte * 4);

should be treated as a constant:

private const int minBlockSize = (int)Constant_Values.oneKilobyte * 4;
Ma_rv
  • You should probably do `using var crypt` rather than just `var crypt`. – dbc Jul 21 '23 at 00:26
  • You're right, I just sandboxed this so I missed that, will edit. – Ma_rv Jul 21 '23 at 00:27
  • Hi, again! Today I tested your methods and they work super fast and without consuming an unnecessary amount of my laptop's RAM. Now, I would like to know how I can optimize file reading when encoding bytes to Base64 and vice versa. I use the 'ReadAllBytes' method, but encoding a whole 700 MB file takes too long to complete (besides the large amount of RAM consumed). Should I use FileStream and cast it to a byte[] by using 'FileStream.ToArray()' inside a 'using' block for disposing the file after reading it, or will that lead to the same result as File.ReadAllBytes? –  Jul 21 '23 at 23:31
  • `using var crypt = new CryptoStream( input, new ToBase64Transform(), CryptoStreamMode.Read);` Import `ToBase64Transform` from `System.Security.Cryptography` – Ma_rv Jul 25 '23 at 00:06
  • And to answer your second question, yes that'll basically get you the same result. Also, FileStream doesn't have a ToArray() method. Also note that arrays aren't disposable and live on the LOH. – Ma_rv Jul 25 '23 at 00:09
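Putting the `ToBase64Transform` suggestion from the comments together into a full example, a streaming Base64 encode might look like the following sketch (file paths are illustrative; `ToBase64Transform` and `FromBase64Transform` both live in `System.Security.Cryptography`):

```csharp
using System.IO;
using System.Security.Cryptography;

// Sketch: stream a large file through Base64 without loading it all into memory.
using var input = File.OpenRead("path/to/big_file");
using var output = File.OpenWrite("path/to/big_file.b64");

// Reading through the CryptoStream applies the transform chunk by chunk.
using var b64 = new CryptoStream(input, new ToBase64Transform(), CryptoStreamMode.Read);
b64.CopyTo(output);

// Decoding is the mirror image:
// using var dec = new CryptoStream(encodedInput, new FromBase64Transform(), CryptoStreamMode.Read);
```

This avoids the `File.ReadAllBytes` / `Convert.ToBase64String` approach entirely, so neither the raw bytes nor the encoded string ever has to fit in memory at once.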