4

I have a piece of C++ software that performs a set of experiments. Without storing the outcomes, all experiments take a little over a minute. The total amount of data generated is about 2.5 GB, which is too large to hold in memory until the end of the experiments and write to file afterwards. Therefore I write it in chunks.

for (int i = 0; i < chunkSize; i++) {
    outfile << results_experiments[i] << endl;
}

where ofstream outfile("data"); and outfile is only closed at the end.

However, when I write them in chunks of 4700 kbytes (actually 4700 / chunkSize = the size of one `results_experiments` element) the experiments take about 50 times longer (over an hour...). This is unacceptable and makes my prior optimization attempts look rather silly, especially since these experiments again need to be performed using many different parameter settings etc. (at least 100 times, but preferably more).

Concretely, my questions are:

  • What would be the ideal chunk size to write at?

  • Is there a more efficient way than (or something very inefficient in) the way I write data currently?

Basically: help me make the file I/O overhead this introduces as small as possible.

I think it should be possible to do this a lot faster, since copying (writing & reading!) the resulting file (same size) takes me under a minute.

The code should be fairly platform independent and should not use any non-standard libraries (I can provide separate versions for separate platforms & more complicated install instructions, but it is a hassle). If it is not feasible to get the total experiment time under 5 minutes without platform/library dependencies (and possibly with them), I will seriously consider introducing these. (The platform is Windows, but a trivial Linux port should at least be possible.)

Thank you for your effort.

codelidoo
  • Every `endl` flushes the buffer. You don't want that. Use `<< '\n'`. – pmr Jun 06 '12 at 16:23
  • Nobody can tell you what the ideal I/O chunk size is - it depends on too many factors, including hardware ones. Experiment until you find the one that's right for your system. You can still hide some of the I/O latency by overlapping computation and I/O - write the result of the previous computation while the current one is in progress. – Hristo Iliev Jun 06 '12 at 16:41
  • However, we _can_ tell you that most 32-bit Windows systems work very well with chunk sizes of 4k. – Mooing Duck Jun 06 '12 at 19:26
  • Try `fprintf` instead of iostreams. iostreams are horribly slow. http://stackoverflow.com/questions/4340396/does-the-c-standard-mandate-poor-performance-for-iostreams-or-am-i-just-deali – Ben Voigt Jul 19 '12 at 18:33

3 Answers

5

For starters, not flushing the buffer on every write seems like a good idea. It also seems possible to do the I/O asynchronously, as it is completely independent of the computation. You can also use mmap to improve the performance of file I/O.
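
A minimal sketch of the non-flushing version, assuming the same loop and `outfile` as in the question; `'\n'` only inserts a newline, while `endl` also forces a flush on every line:

for (int i = 0; i < chunkSize; i++) {
    outfile << results_experiments[i] << '\n'; // newline only, no flush
}
// The stream still flushes on its own whenever its internal buffer fills,
// and a final flush happens when outfile is closed at the end.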

pmr
  • "platform is Windows" seems to rule out using `mmap`. – Jerry Coffin Jun 06 '12 at 16:25
  • @JerryCoffin I'm sure windows has something equivalent. There also seems to be a portable Boost implementation. I have never used it though and cannot comment on it. – pmr Jun 06 '12 at 16:31
  • Windows has CreateFileMapping and MapViewOfFile, but they rarely improve I/O performance. – Jerry Coffin Jun 06 '12 at 16:32
    Your comment about removing the `endl` and replacing it with `'\n'` got me a 4x speedup. I'll look into the binary & multithreading. – codelidoo Jun 08 '12 at 15:17
  • I decided to keep it synchronous. I don't expect it to have a huge influence on total runtime: since all my computations take only a minute and a half, even if I could overlap them fully with the file writing, the total runtime would decrease by at most a minute and a half. Correct me if I'm wrong here. – codelidoo Jun 09 '12 at 23:20
3

If the output doesn't have to be human-readable, you could investigate a binary format. Data stored in binary occupies less space than its text representation and therefore needs less disk I/O. There'll be little difference if the data is all strings, but if you write out as much as possible as raw numbers rather than formatted text, you could get a big gain.

However, I'm not sure if/how this is done with STL iostreams. The C-style way is using fopen(..., "wb") and fwrite(&object, ...).
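
For the iostreams route, here is a minimal sketch, assuming `results_experiments` is an array of a trivially copyable type such as `int`: open the stream in binary mode and emit the whole chunk with one `write()` call:

#include <fstream>

std::ofstream outfile("data", std::ios::binary);
// Dump the raw bytes of the chunk in one call instead of formatting each element.
outfile.write(reinterpret_cast<const char*>(results_experiments),
              chunkSize * sizeof(results_experiments[0]));

Reading the data back then requires a matching `read()` with the same element type and byte order.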

I think Boost.Serialization can do binary output using the << operator.

Also, can you reduce the amount you write? e.g. no formatting or redundant text, just the bare minimum.

acraig5075
  • `outfile.write(reinterpret_cast<const char*>(&results_experiments[i]), sizeof(results_experiments[i]));` – Mooing Duck Jun 06 '12 at 19:29
  • Using binary actually made it slower in my case. I think I might be writing more instead of less: `results_experiments[i]` is an integer (only 1–3 digits long), and assuming 4 bytes per integer versus 1 byte per character in the string, that would make sense. – codelidoo Jun 08 '12 at 15:38
0

Whether `endl` flushes the buffer when writing to an `ofstream` is implementation dependent.

You might also try increasing the buffer size of your `ofstream`:

std::vector<char> biggerbuffer(512000);

std::ofstream outfile;  // pubsetbuf must be called before the file is opened
outfile.rdbuf()->pubsetbuf(biggerbuffer.data(), biggerbuffer.size());
outfile.open("data");

The effect of pubsetbuf may vary depending on your iostream implementation.

antlersoft
  • AFAIK inserting `endl` into a `basic_ostream` will always call `flush`. – pmr Jun 06 '12 at 17:09
  • The bigger buffer did not influence my runtime at all; I tried a number of different sizes, from modest to rather huge. No difference, but higher memory usage. – codelidoo Jun 09 '12 at 23:25