I am working on an application that needs to deal with large amounts of data (in the GB range). I don't need all the data at once at any given moment; it is fine to partition the data into sections and work on (and thus bring into memory) only one section at a time.
I have read that most applications that need to manipulate large amounts of data usually do so with memory-mapped files. Reading further about memory-mapped files, I found that reading/writing data through them is faster than normal file I/O because the reads and writes go through the operating system's highly optimized paging machinery.
Here are the questions I have:
- How different is using memory-mapped files (I am planning to use boost::file_mapping, and I am working on Windows) for file I/O from using file streams?
- How much faster can I expect reads/writes to be with memory-mapped files compared to file streams (on a traditional 7200 RPM hard disk)?
- Are memory-mapped files the only way to deal with such huge amounts of data? Are there better approaches, given my use case?