The fastest approach really depends on your I/O capabilities and on computational speed, so usually the best way to find the fastest technique is to benchmark the candidates.
Disclaimer: results are (of course) bound to my machine and may vary significantly on different hardware. For testing I used a single text file of about 400 MB. If you are interested, the file can be downloaded here (zipped). The executable was compiled as x86.
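Something along these lines can be used to collect this kind of measurement. This is a minimal sketch, not the exact code used for the numbers below: CountAt is a placeholder for each of the options that follow, and reading the working set via Process.WorkingSet64 is just one way to approximate process memory.

using System;
using System.Diagnostics;

class Benchmark
{
    static void Main()
    {
        const int runs = 5;
        long totalMs = 0;

        for (var r = 0; r < runs; r++)
        {
            var sw = Stopwatch.StartNew();
            var count = CountAt("C:\\tmp\\test.txt"); // plug Option 1, 2 or 3 in here
            sw.Stop();

            totalMs += sw.ElapsedMilliseconds;
            Console.WriteLine($"Run {r + 1}: {count} occurrences in {sw.ElapsedMilliseconds} ms");
        }

        // Working set gives a rough idea of the memory used by the process.
        var memoryMb = Process.GetCurrentProcess().WorkingSet64 / (1024.0 * 1024.0);
        Console.WriteLine($"Average time: {totalMs / runs} ms - process memory: {memoryMb:F2} MB");
    }

    static long CountAt(string path)
    {
        // Placeholder: replace the body with one of the options below.
        return 0;
    }
}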
Option 1: Read entire file, no parallelization
long count = 0;

// Load the whole file into memory as a single string.
var text = File.ReadAllText("C:\\tmp\\test.txt");

// Scan every character with a plain indexed loop.
for (var i = 0; i < text.Length; i++)
    if (text[i] == '@')
        count++;
Results:

- Average execution time: 5828 ms
- Average process memory: 1674 MB
This is the "naive" approach, which reads the entire file in memory and then uses a for
loop (which is significantly faster than foreach
or LINQ).
As expected, the memory occupied by the process is very high (about 4 times the file size). This is likely caused by a combination of how strings are stored in memory (.NET strings are UTF-16, so each character takes 2 bytes; more info here) and string processing overhead.
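For comparison, here is what the same count looks like with LINQ and with foreach. This is not part of the original benchmark, just a sketch of the alternatives mentioned above so you can measure them yourself:

// Requires: using System.Linq;
// LINQ version: concise, but invokes a delegate for every character.
long linqCount = text.Count(c => c == '@');

// foreach version, for comparison with the indexed loop above.
long foreachCount = 0;
foreach (var c in text)
    if (c == '@')
        foreachCount++;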
Option 2: Read file in chunks, no parallelization
long count = 0;

using (var file = File.OpenRead("C:\\tmp\\test.txt"))
using (var reader = new StreamReader(file))
{
    const int size = 500000; // chunk size: 500k chars
    char[] buffer = new char[size];

    while (!reader.EndOfStream)
    {
        // Read the next chunk into the buffer (note: await requires an async context).
        var read = await reader.ReadBlockAsync(buffer, 0, size);

        // Count the matches in the chunk we just read.
        for (var i = 0; i < read; i++)
            if (buffer[i] == '@')
                count++;
    }
}
Results:

- Average execution time: 4819 ms
- Average process memory: 7.48 MB
This was unexpected: in this version we read the file in chunks of 500k characters instead of loading it entirely into memory, and the execution time is even lower than in the previous approach. Note that reducing the chunk size increases the execution time (because of the extra read overhead). Memory consumption is extremely low, as expected: we only ever hold about 1 MB (500k chars, 2 bytes each) in memory, directly in a char array.
Better (or worse) performance may be obtained by changing the chunk size.
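To make that experimentation easier, the chunked loop can be extracted into a method that takes the chunk size as a parameter. This is just a sketch (the method name CountInChunks and its signature are mine, not part of the original code):

// Requires: using System.IO; using System.Threading.Tasks;
static async Task<long> CountInChunks(string path, char target, int chunkSize)
{
    long count = 0;
    using (var file = File.OpenRead(path))
    using (var reader = new StreamReader(file))
    {
        var buffer = new char[chunkSize];
        while (!reader.EndOfStream)
        {
            var read = await reader.ReadBlockAsync(buffer, 0, chunkSize);
            for (var i = 0; i < read; i++)
                if (buffer[i] == target)
                    count++;
        }
    }
    return count;
}

// Example: time a few different chunk sizes against each other.
// foreach (var size in new[] { 100000, 500000, 1000000 })
//     Console.WriteLine($"{size}: {await CountInChunks("C:\\tmp\\test.txt", '@', size)}");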
Option 3: Read file in chunks, with parallelization
long count = 0;

using (var file = File.OpenRead("C:\\tmp\\test.txt"))
using (var reader = new StreamReader(file))
{
    const int size = 2000000;      // roughly 4 times the single-threaded chunk size
    const int parallelization = 4; // each chunk is split into this many sub-chunks processed in parallel
    char[] buffer = new char[size];

    while (!reader.EndOfStream)
    {
        var read = await reader.ReadBlockAsync(buffer, 0, size);
        var sliceSize = read / parallelization;
        var counts = new long[parallelization];

        Parallel.For(0, parallelization, i =>
        {
            var start = i * sliceSize;
            var end = start + sliceSize;

            // The last slice also takes the remainder left over by the integer division.
            if (i == parallelization - 1)
                end += read % parallelization;

            long localCount = 0;
            for (var j = start; j < end; j++)
            {
                if (buffer[j] == '@')
                    localCount++;
            }
            counts[i] = localCount;
        });

        count += counts.Sum();
    }
}
Results:

- Average execution time: 3363 ms
- Average process memory: 10.37 MB
As expected, this version performs better than the single-threaded one, but not 4 times better as we might have hoped. Memory consumption is again very low compared to the first version (same considerations as before), and we are taking advantage of multi-core environments.
Parameters like the chunk size and the number of parallel tasks can change the results significantly; you should experiment to find the best combination for your machine.
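As a starting point, the degree of parallelism can be derived from the machine instead of being hard-coded (just a sketch; the original code fixes it at 4):

// Use one slice per logical core (assumption: the scanning part is CPU-bound enough to benefit).
int parallelization = Environment.ProcessorCount;

// Keep roughly the same amount of work per slice as in Option 2 (~500k chars each).
int size = 500000 * parallelization;

char[] buffer = new char[size];
// ...the rest of the Option 3 loop stays the same.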
Conclusions
I was inclined to think that the "load everything in memory" version would be the fastest, but that really depends on string processing overhead and I/O speed. The parallel chunked approach is the fastest on my machine, which brings us to the main takeaway: when in doubt, just benchmark it.