1

I have a program that writes a temporary file to be used with gnuplot. The file varies in size and can reach several hundred kB, if not MB. Every time it's written to disk, strace shows only about 8 kB per write call. I would like to avoid unnecessary disk writes by setting a buffer larger than this. One of the answers here on SO said that 128 kB is about the maximum before it starts behaving badly. I have searched and found that I can modify the buffer, something like this:

int sz {65536};
char buf[sz];
std::ofstream outf {"file.txt"};
outf.rdbuf()->pubsetbuf(&buf[0], sz);

So far, so good: it compiles, but how do I actually use this buffer? In one of the answers I've seen reinterpret_cast being used, but I don't really understand what's going on there. The C++ reference site isn't very helpful, either. I am not an advanced programmer, so can someone please show me how to use this? I am using ofstream, and the file written has both data for plotting and various settings based on conditionals, so I don't know how to fit those into the buffer.

a concerned citizen
  • What do you mean _"how do I actually use this buffer?"_? It is used by the `outf` with any output operation applied. – πάντα ῥεῖ Oct 28 '16 at 16:30
  • Then why is the file still written ~8kB at a time? The 4 lines above are in the code now, but they don't make any difference, so I thought I have to use the `buf[sz]` somehow. Am I wrong? – a concerned citizen Oct 28 '16 at 16:33
  • `char buf[sz];` with a non-constant `sz` is not supported by the C++ standard, and if your compiler accepts it as an extension it may fail miserably by overflowing the stack. – πάντα ῥεῖ Oct 28 '16 at 16:33
  • @pantarei I use this: `outf << "bla bla" << '\n';`. If there's a loop or a conditional, it's something like this: `if (test) outf << "ok\n"; else outf << "not ok\n";`. – a concerned citizen Oct 28 '16 at 16:36
  • For an output buffer: https://stackoverflow.com/questions/1494182/setting-the-internal-buffer-used-by-a-standard-stream-pubsetbuf For an input buffer: https://stackoverflow.com/questions/56283357/internal-buffer-used-by-standard-input-stream-pubsetbuf – tim May 24 '19 at 00:20
  • @tim Both of those seem to deal with overriding internal implementations to change the buffer, while this deals simply with how to set and use the buffer. – a concerned citizen May 24 '19 at 06:42

2 Answers

4

Following the suggestions of @pantarei and @lightnessracesinorbit, I'll write the answer. I apologize if I bent the rules.


According to the cppreference site, the order matters: how pubsetbuf behaves on a file stream is implementation-defined, and with some standard libraries (GCC's libstdc++, for example) it must be called before the file is opened, otherwise it has no effect. So this is the order the code needs to be in (for my case):

int sz {131072};          // buffer size
std::vector<char> buf;   // std::vector instead of a C-style char array
buf.resize(sz);
std::ofstream outf;      // declaration, only
outf.rdbuf()->pubsetbuf(&buf[0], sz);  // set buffer before...
outf.open("file.txt");                 // ...opening the file
// rest of the code

My files are, most often, below 100k, so a 128k buffer is just fine to avoid too many writes.
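
To show how ordinary `<<` output, including settings chosen by conditionals, goes through the buffer without any extra work, here is a minimal sketch. The function name `write_plot_file`, its parameters, and the gnuplot lines it writes are made up for illustration; only the set-buffer-then-open order comes from the snippet above. The vector is declared before the stream so that it outlives it.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): writes one conditional
// setting and `rows` data lines to `path` through a user-supplied buffer.
// Returns true if the file was opened and every write succeeded.
bool write_plot_file(const std::string& path, int rows, bool log_scale,
                     std::size_t buf_size = 131072)
{
    std::vector<char> buf(buf_size);  // declared first, so it outlives outf
    std::ofstream outf;               // declaration only, no file yet

    // Install the buffer before opening the file.
    outf.rdbuf()->pubsetbuf(buf.data(),
                            static_cast<std::streamsize>(buf.size()));
    outf.open(path);
    if (!outf) return false;

    // Nothing special is needed from here on: every << goes into the buffer.
    if (log_scale) outf << "set logscale y\n";
    else           outf << "unset logscale y\n";

    outf << "plot '-' with lines\n";
    for (int i = 1; i <= rows; ++i)
        outf << i << ' ' << i * i << '\n';
    outf << "e\n";

    outf.close();          // flush while buf is still alive
    return !outf.fail();
}
```

A call such as `write_plot_file("file.txt", 1000, true)` is all the calling code needs. Whether the writes really become larger depends on the standard library, so it is worth checking again with strace.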

a concerned citizen
2

The reference documentation clearly states (emphasis mine):

2) The base class version of this function has no effect. The derived classes may override this function to allow removal or replacement of the controlled character sequence (the buffer) with a user-provided array, or for any other implementation-specific purpose.

So what you'll need to extend the buffer is std::basic_filebuf::setbuf() instead.
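
A small sketch of the mechanism described in that quote: the public `pubsetbuf()` only forwards to the protected virtual `setbuf()`, so what happens depends on the derived class. `ProbeBuf` and `pubsetbuf_dispatches` are hypothetical names used purely for illustration. Since `outf.rdbuf()` on a `std::ofstream` returns a pointer to a `std::filebuf`, the call in the question already ends up in `std::basic_filebuf::setbuf()` in the same way.

```cpp
#include <streambuf>

// Hypothetical probe (illustration only): a stream buffer that just
// records whether its setbuf() override was reached.
struct ProbeBuf : std::streambuf {
    bool setbuf_called = false;

protected:
    std::streambuf* setbuf(char*, std::streamsize) override {
        setbuf_called = true;
        return this;
    }
};

// Returns true if calling the public pubsetbuf() reached the override.
bool pubsetbuf_dispatches()
{
    ProbeBuf probe;
    char storage[16];
    probe.pubsetbuf(storage, sizeof storage);
    return probe.setbuf_called;
}
```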

πάντα ῥεῖ
  • *Now* I see that I needed to add the line *before* the "file.txt", so now I first declared `outf`, then `pubsetbuf`, then `outf.open("file.txt")`, and now it writes in one go. Thank you. – a concerned citizen Oct 28 '16 at 16:46
  • @aconcernedcitizen So you accepted an answer that bears no relationship to what you say was the actual solution. – Lightness Races in Orbit Oct 28 '16 at 17:08
  • @LightnessRacesinOrbit I already was wondering about that. Can't delete my answer though :P – πάντα ῥεῖ Oct 28 '16 at 17:10
  • @LightnessRacesinOrbit Not only is it the only answer, but it's the one that set me on the right track, which got the problem solved. I wouldn't say it has "no relationship" at all. If I needed help and someone offered guidance instead of a direct answer, and that guidance put me on the right track, should I not consider it helpful? At the very least, this is my reasoning; yours, or anyone else's, may be different. – a concerned citizen Oct 29 '16 at 07:16
  • @aconcernedcitizen Well, _guidance_ isn't really the way we want to have problems handled here. It's about facts and concise, solid solutions. It's fine that you've got your problem solved now, but maybe it's better that you write your own answer explaining exactly how, and release mine so that I can delete it. As LRIO mentioned, it's apparently not directly related. – πάντα ῥεῖ Oct 29 '16 at 07:23
  • I answered the question, but don't delete this answer, because it was the "fire starter". – a concerned citizen Oct 29 '16 at 07:53