Scenario 1:
On disk, text in a .txt file is often stored in a single-byte encoding such as ASCII, so it takes 1 byte per character. In memory, a C# char is UTF-16, so it takes 2 bytes per character. That means loading a Y MB ASCII text file into a C# string will take roughly 2*Y MB of memory, plus object overhead. So make sure you have enough memory at your disposal. (But this was not my case.)
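A minimal sketch of the size difference, assuming an ASCII source file (the class and variable names here are just illustrative):

```csharp
using System;

class CharSizeDemo
{
    static void Main()
    {
        // A C# char is UTF-16: always 2 bytes in memory, regardless of
        // the on-disk encoding of the file it was read from.
        Console.WriteLine(sizeof(char)); // prints 2

        // A 10-character ASCII string therefore needs about 20 bytes of
        // character data in memory (plus object header overhead),
        // versus 10 bytes on disk.
        string s = new string('a', 10);
        Console.WriteLine(s.Length * sizeof(char)); // prints 20
    }
}
```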
Scenario 2:
Moreover, you might have enough memory but not enough contiguous memory. For example, you might have 20 GB free in total, but the largest single free block might be only 1 GB, because memory is fragmented.
In that case, if you try to create a string or character array larger than 1 GB, you'll get an "out of memory" error even though plenty of total memory is free. (This was my case.)
Solution:
- If you really want to work in memory, load the file in chunks or line by line and store the pieces in a data structure such as a linked list, to avoid allocating one huge contiguous block. A linked list allocates many small, separately linked nodes scattered across memory. Structures like string, List, Dictionary, and HashSet back their data with fully or partially contiguous arrays, so avoid them for very large data.
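A sketch of the line-by-line approach, assuming a line-oriented text file (LoadLines is a hypothetical helper name):

```csharp
using System.Collections.Generic;
using System.IO;

class ChunkedLoader
{
    // Reads a large text file line by line into a LinkedList<string>.
    // Each line becomes its own small allocation and each list node is
    // allocated separately, so no single huge contiguous block is
    // needed (each individual line must still fit in memory, though).
    static LinkedList<string> LoadLines(string path)
    {
        var lines = new LinkedList<string>();
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.AddLast(line);
            }
        }
        return lines;
    }
}
```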
- Depending on the problem: if your use case allows it, stream the file into a database for further processing, searching, updating, deleting, etc. You'll have to deal with some IO latency, though, unless you use a fully in-memory DB.
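A hedged sketch of the streaming idea: File.ReadLines enumerates the file lazily, so only the current line and the current batch are in memory at once. The insertBatch callback is a placeholder for whatever database insert you use (e.g. a batched INSERT inside a transaction); its name and the batch size are assumptions, not part of any particular DB API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class FileToDbStreamer
{
    // Streams a large file into a database in fixed-size batches,
    // without ever holding the whole file in memory.
    static void StreamToDb(string path,
                           Action<List<string>> insertBatch, // hypothetical DB insert
                           int batchSize = 10000)
    {
        var batch = new List<string>(batchSize);
        foreach (var line in File.ReadLines(path)) // lazy enumeration
        {
            batch.Add(line);
            if (batch.Count == batchSize)
            {
                insertBatch(batch); // e.g. one transaction per batch
                batch.Clear();
            }
        }
        if (batch.Count > 0)
        {
            insertBatch(batch); // flush the final partial batch
        }
    }
}
```

Batching keeps the per-insert IO overhead down while still bounding memory use to one batch at a time.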