In my first prototype of application, I have to read around 400,000 files (each 4KB file, around total 1.5 GB data) from hard disk sequentially, and do some operation over the data read from each files, and store the results over RAM. Through this mechanism, I were first accessing I/O for a file and then utilizing CPU for operation, and keep going for another file, but it was very slow process.
To work around, now we first read all the files, and stored all the files data in the RAM, and now doing operation (utilizing CPU). It gave significant improvement.
But in my second phase of development, I have to read 20 GB of data, which now I cannot store in RAM. And, single reading operation with CPU utilization is very time consuming operation.
Can someone please suggest some method to work around this problem?
I am developing this application on Windows in C, with Visual Studio compiler.