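#include <cstdio>
#include <iostream>

using std::cout;
using std::printf;

// Writes the same string `count` times through C stdio.
void perfprint(unsigned int count)
{
    char a[100] = "fosjkdfjlsjdflw0304802";
    for(unsigned int i = 0;i<count;++i)
    {
        printf("%s", a);
    }
}

// Writes the same string `count` times through C++ iostreams.
void perfcout(unsigned int count)
{
    char a[100] = "fosjkdfjlsjdflw0304802";
    for(unsigned int i = 0;i<count;++i)
    {
        cout << a;
    }
}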

Environment : C++, VS 2010, Windows 7, 32-bit, Core-i7, 4GB, 3.40 GHz

I tested both functions with count = 10000, five times each, and measured the elapsed time using QueryPerformanceCounter.

perfprint: ~850 milliseconds (average of 5 runs)

perfcout: ~9000 milliseconds (average of 5 runs)

Does this mean printf is ~10x faster than cout?

Edit:

With /Ox, /Ot, and no debug information in a Release build, and with std::ios_base::sync_with_stdio(false); in the perfcout method, the result for cout is the same, i.e. ~9000 milliseconds.
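For reference, the call suggested in the comments is normally made once, before the first output operation. The following is a minimal sketch; the helper names `disable_stdio_sync` and `perfcout_unsynced` are hypothetical, and on the setup described here the call did not change the console timings.

```cpp
#include <iostream>

// Hypothetical helper: turns off synchronization between the C++ streams
// and C stdio. It should be called before the first I/O operation; calling
// it after output has already happened has implementation-defined effects.
// Returns the previous synchronization state (true by default).
bool disable_stdio_sync() {
    return std::ios_base::sync_with_stdio(false);
}

// Same loop as perfcout, intended to run after disable_stdio_sync().
void perfcout_unsynced(unsigned int count) {
    char a[100] = "fosjkdfjlsjdflw0304802";
    for (unsigned int i = 0; i < count; ++i) {
        std::cout << a;
    }
}
```

Once synchronization is off, mixing `printf` and `cout` output in the same program may interleave unpredictably, so a benchmark should exercise only one of them per run.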

Edit 2:

To conclude, cout is faster than printf. The observations above were caused by console output. When the output was redirected to a file, the results were turned on their head!

leiyc
deepdive
  • No it means you don't understand how the iostreams are synced and how that affects their speed. Try calling `std::ios_base::sync_with_stdio(false)` at the start of `perfcout`. – user657267 Oct 22 '15 at 02:50
  • Technically a duplicate of http://stackoverflow.com/questions/9371238/why-is-reading-lines-from-stdin-much-slower-in-c-than-python – CodeMouse92 Oct 22 '15 at 02:52
  • Can you run your test while adding std::ios_base::sync_with_stdio(false)? –  Oct 22 '15 at 02:54
  • Cannot reproduce (Fedora22, gcc 5.1, `-O3`), even without that sync stuff, Both take about 0.018s on my machine for `count == 10000`. Did you forget to enable compiler optimization? – Baum mit Augen Oct 22 '15 at 02:55
  • Something tells me that if you are printing to standard output, you can't be too concerned about performance. – Thomas Matthews Oct 22 '15 at 04:30
  • If you want efficient output, print to a `char` array, then use a *block write* to write everything to the output. Multiple print statements are not as efficient as one write of a lot of data. – Thomas Matthews Oct 22 '15 at 04:31
  • @ThomasMatthews It's not about what's the solution for chunks of output or multiple statements. It's about performance of these two – deepdive Oct 22 '15 at 04:50
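The block-write idea from the comments can be sketched as follows. This is an illustration only, not something measured in this thread; `build_block` and `write_block` are hypothetical helper names.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds the whole output in memory first.
std::string build_block(const char* text, unsigned int count) {
    std::string block;
    block.reserve(std::char_traits<char>::length(text) * count);
    for (unsigned int i = 0; i < count; ++i) {
        block += text;
    }
    return block;
}

// Hypothetical helper: hands the block to stdio in a single fwrite call.
// Returns true if every byte was accepted.
bool write_block(std::FILE* out, const std::string& block) {
    return std::fwrite(block.data(), 1, block.size(), out) == block.size();
}
```

Calling `write_block(stdout, build_block(a, count))` would replace the loop in `perfprint`, at the cost of holding the full output in memory.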

1 Answer


I don't have VS 2010 installed any more, but I did a quick test with VS 2013 and 2015. I modified your code slightly to reduce duplication, and include timing code, giving this:

#include <iostream>
#include <cstdio>
#include <chrono>
#include <string>

template <class F>
int perf(F f) {
    using namespace std::chrono;

    const int count = 1000000;
    char a[100] = "fosjkdfjlsjdflw0304802";

    auto start = high_resolution_clock::now();
    for (unsigned i = 0; i < count; i++)
        f(a);
    auto end = high_resolution_clock::now();

    return duration_cast<milliseconds>(end - start).count();
}

int main() {
    std::cerr << "cout: " << perf([](char const *a) { std::cout << a; }) << "\n";
    std::cerr << "printf: " << perf([](char const *a) { printf("%s", a); }) << "\n";
}

With optimization turned off, cout showed up as slightly faster (e.g., 358 ms vs. 460 for printf) but measuring speed with optimization turned off is fairly meaningless.

With optimization turned on, cout won by an even larger margin (191 ms vs. 365 ms for printf).

To keep these meaningful, I ran them all with the output redirected to a file. Without that, essentially all you'd measure would be the speed of the console driver, which says nothing useful about cout or printf themselves.

Jerry Coffin
  • Console driver speed applies to both `printf` and `cout` – deepdive Oct 22 '15 at 04:57
  • @deepdive That's true, but different functions could be implemented such that one or the other is more significantly impacted. For example, one function could prepare a buffer and send it along to the console host in one call while the other could send each byte individually. This example isn't chosen at random; when stdout is connected to the standard Windows console, `cout << "Hello"` results in 5 separate interprocess calls, one to write each byte, whereas `printf("%s", "Hello")` results in a single IPC. – bames53 Oct 22 '15 at 05:42
  • And if you run the program from mintty on Windows instead of cmd.exe, the problem goes away. Because mintty doesn't make things go through the Windows console, you get efficient file buffering behavior instead of terrible IPC performance. – bames53 Oct 22 '15 at 05:44
  • @bames53 That makes sense.. When I redirected output to file, `cout ~3 millisecs` and `printf ~5 millisecs`... cout won! – deepdive Oct 22 '15 at 06:05
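The per-byte writes described in the comments depend on how the stream is buffered when it is attached to the console. One mitigation sometimes used is to give `stdout` a full buffer with `std::setvbuf` before any output is written. Whether this also changes what `cout` sends to the console depends on the library implementation, so treat the sketch below as an assumption that was not measured in this thread; `use_full_buffering` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: requests full buffering of `size` bytes on `stream`.
// Must be called before any other operation on the stream.
// Passing nullptr lets the C library allocate the buffer itself.
// Returns true if the request was accepted.
bool use_full_buffering(std::FILE* stream, std::size_t size) {
    return std::setvbuf(stream, nullptr, _IOFBF, size) == 0;
}
```

A call such as `use_full_buffering(stdout, 1 << 16)` at the top of `main` would be the typical place for it; buffered output then only reaches the console when the buffer fills, on `fflush`, or at exit.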