8

I was trying to solve a problem on InterviewStreet. After some time I determine that I was actually spending the bulk of my time reading the input. This particular question had a lot of input, so that makes some amount of sense. What doesn't make sense is why the varying methods of input had such different performances:

Initially I had:

std::string command;
std::cin >> command;

Replacing it made it noticeably faster:

char command[5];
cin.ignore();
cin.read(command, 5);

Rewriting everything to use scanf made it even faster

char command;
scanf("get_%c", &command);

All told I cut the time reading the input down by about a 1/3.

I'm wondering there is such a variation in performance between these different methods. Additionally, I'm wondering why using gprof didn't highlight the time I was spending in I/O, rather seeming to point the blame to my algorithm.

Winston Ewert
  • 44,070
  • 10
  • 68
  • 83

4 Answers4

3

There is a big variation in these routines because console input speed almost never matters.

And where it does (Unix shell) the code is written in C, reads directly from the stdin device and is efficient.

Martin Beckett
  • 94,801
  • 28
  • 188
  • 263
  • 3
    What I find particularly funny is how I read in one of the C++ books (was it C++ Bible?) how iostream in theory can be faster than the stdio functions because it uses virtual functions instead of parsing text like "%s %s %d", but that in practice it's 10x-20x slower. I rank this right there with the theoretical possibility of Java actually being faster than C because the VM can profile code on the fly. – sashoalm Feb 01 '12 at 16:18
  • 4
    @satuon: personally I find iostream horrid. The various level of indirections and the unintuitive interface are just mind boggling. The problem of iostream is that its hold and dates back to the early age of C++, where templates and exceptions were still under investigations. So you have some kind of half-baked solution that is frozen for "backward compatibility" reasons. *sigh*. – Matthieu M. Feb 01 '12 at 16:22
  • 1
    So basically, iostream is poorly implemented because nobody uses it for high performance IO? – Winston Ewert Feb 01 '12 at 16:52
  • 2
    @WinstonEwert - it's presumably optomised for flexibility and elegance of design. Although in practice it's probably not optmised for anything since nobody uses it! – Martin Beckett Feb 01 '12 at 19:11
1

At the risk of being downvoted, I/O streams are, in general, slower and bulkier than their C counterparts. That's not a reason to avoid using them though in many purposes as they are safer (ever run into a scanf or printf bug? Not very pleasant) and more general (ex: overloaded insertion operator allowing you to output user-defined types). But I'd also say that's not a reason to use them dogmatically in very performance-critical code.

I do find the results a bit surprising though. Out of the three you listed, I would have suspected this to be fastest:

char command[5];
cin.ignore();
cin.read(command, 5);

Reason: no memory allocations needed and straightforward reading of a character buffer. That's also true of your C example below, but calling scanf to read a single character repeatedly isn't anywhere close to optimal either even at the conceptual level, as scanf must parse the format string you passed in each time. I'd be interested in the details of your I/O code as it seems that there is a reasonable possibility of something wrong happening when scanf calls to read a single character turn out to be the fastest. I just have to ask and without meaning to offend, but is the code truly compiled and linked with optimizations on?

Now as to your first example:

std::string command;
std::cin >> command;

We can expect this to be quite a bit slower than optimal for the reason that you're working with a variable-sized container (std::string) which will have to involve some heap allocations to read in the desired buffer. When it comes to stack vs. heap issues, the stack is always significantly faster, so if you can anticipate the maximum buffer size needed in a particular case, a simple character buffer on the stack will beat std::string for input (even if you used reserve). This is likewise true of an array on the stack as opposed to std::vector but these containers are best used for cases where you cannot anticipate the size in advance. Where std::string can be faster would be cases where people might be tempted to call strlen repeatedly where storing and maintaining a size variable would be better.

As to the details of gprof, it should be highlighting those issues. Are you looking at the full call graph as opposed to a flat profile? Naturally the flat profile could be misleading in this case. I'd have to know some further details on how you are using gprof to give a better answer.

stinky472
  • 6,737
  • 28
  • 27
  • gprof is good (but not great) for identifying bottlenecks in CPU-bound processes. It is bad (really, really bad!) for identifying bottlenecks in I/O-bound processes. – David Hammen Feb 01 '12 at 16:47
  • @David: The authors of gprof *[wrote a paper](http://docs.freebsd.org/44doc/psd/18.gprof/paper.pdf)* about it, but of course, who reads such things? Everybody thinks it's a tool for finding bottlenecks, but they make no such claim. – Mike Dunlavey Feb 01 '12 at 21:48
  • The `stdio` routines have a great deal of optimization in their history and auxiliary performance optimization calls like `setvbuf()`. Even if conceptually `scanf()` seems complex, look at any implementation: that string parsing code is very fast. – wallyk Feb 01 '12 at 21:53
  • @wallyk: true, but even with the fastest string parsing, it's being done per character in his example! It's quite a surprise that char buf[n]; cin.read(buf, n); would be so much slower (actually surprising it's not faster, but iostreams are a very bulky and old part of C++). – stinky472 Feb 02 '12 at 02:23
1

gprof only samples during CPU time, not during blocked time. So, a program may spend an hour doing I/O, and a microsecond doing computation, and gprof will only see the microsecond.

For some reason, this isn't well known.

Community
  • 1
  • 1
Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135
0

By default, the standard iostreams are configured to work together and with the C stdio library — in practice this means using cin and cout for things other than interactive input and output tend to be slow.

To get good performance using cin and cout, you need to disable the synchronization with stdio. For high performance input, you might even want to untie the streams.

See the following stackoverflow question for more details.

How to get IOStream to perform better?

Community
  • 1
  • 1