6

Is that <cstdio> header in C++ contains just the same functions as <stdio.h> but put in std namespace?

I experienced strange efficiency problems in my program compiled with mingw-w64, which is more than ten times slower then on linux. After some test I found that the problem is in sprintf.

Then I did the following test:

#include <stdio.h>
// #include <cstdio>
// using std::sprintf;

int main () {
  int i;
  for (i = 0; i < 500000; i++){
    char x[100];
    sprintf(x, "x%dx%dx", i, i<<2);
  }
}

When compiled with <stdio.h> it is 15 times faster then using <cstdio>. Here is the timing:

$ time ./stdio

real    0m0.557s
user    0m0.046s
sys     0m0.046s

$ time ./cstdio

real    0m7.465s
user    0m0.031s
sys     0m0.077s

$ g++ --version
g++.exe (rubenvb-4.8-stdthread) 4.8.1 20130324 (prerelease)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

UPDATE 1: I further timed with different mingw-w64 build (rubenvb, drangon, and mingw-build), and find that all 32bit version using <cstdio> timed 4.x seconds and 64bit versions 7.x~8.x seconds. And all versions using <stdio.h> timed around 0.4~0.6 second.

UPDATE 2: I disassembled the main function in gdb and find only one line differs: the <stdio.h> version calls callq 0x4077c0 <sprintf> but the <cstdio> version calls callq 0x407990 <_Z7sprintfPcPKcz>.

sprintf contains:

0x00000000004077c0 <+0>: jmpq   *0x7c6e(%rip) # 0x40f434 <__imp_sprintf>
0x00000000004077c6 <+6>: nop
0x00000000004077c7 <+7>: nop

Following __imp_sprintf I reached the sprinf inside msvcrt.dll.

_Z7sprintfPcPKcz contains some mingw codes:

0x0000000000407990 <+0>:     push   %rbp
0x0000000000407991 <+1>:     push   %rbx
0x0000000000407992 <+2>:     sub    $0x38,%rsp
0x0000000000407996 <+6>:     lea    0x80(%rsp),%rbp
0x000000000040799e <+14>:    mov    %rcx,-0x30(%rbp)
0x00000000004079a2 <+18>:    mov    %r8,-0x20(%rbp)
0x00000000004079a6 <+22>:    mov    %r9,-0x18(%rbp)
0x00000000004079aa <+26>:    mov    %rdx,-0x28(%rbp)
0x00000000004079ae <+30>:    lea    -0x20(%rbp),%rax
0x00000000004079b2 <+34>:    mov    %rax,-0x58(%rbp)
0x00000000004079b6 <+38>:    mov    -0x58(%rbp),%rdx
0x00000000004079ba <+42>:    mov    -0x28(%rbp),%rax
0x00000000004079be <+46>:    mov    %rdx,%r8
0x00000000004079c1 <+49>:    mov    %rax,%rdx
0x00000000004079c4 <+52>:    mov    -0x30(%rbp),%rcx
0x00000000004079c8 <+56>:    callq  0x402c40 <__mingw_vsprintf>
0x00000000004079cd <+61>:    mov    %eax,%ebx
0x00000000004079cf <+63>:    mov    %ebx,%eax
0x00000000004079d1 <+65>:    add    $0x38,%rsp
0x00000000004079d5 <+69>:    pop    %rbx
0x00000000004079d6 <+70>:    pop    %rbp

Why does cstdio use a different (and much slower) function?

Jiaming Lu
  • 875
  • 6
  • 20
  • Yes, `printf()` is slow. Sorry. (And I'm not surprised either that it performs better on Linux than on Windows.) –  Jun 21 '13 at 13:25
  • 1
    My question is: why it is slow in `cstdio` but fast in `stdio.h`. – Jiaming Lu Jun 21 '13 at 13:26
  • Could you try running the tests a few times? `sprint` and `std::sprintf` are the [exact same function](http://gcc.gnu.org/onlinedocs/gcc-4.6.2/libstdc++/api/a00832_source.html). – Jesse Good Jun 21 '13 at 13:34
  • Yes I ran the test many times, in different mingw-w64 build. Is that a bug? Also I examined the header file and it looks fine, I can't figure out the problem. – Jiaming Lu Jun 21 '13 at 13:37
  • What compile options do you use? On my box either version takes about 0.1s. – Mats Petersson Jun 21 '13 at 13:37
  • No compile options used, just `g++ -o cstdio test.cc`. – Jiaming Lu Jun 21 '13 at 13:39
  • What specific version of mingw-w64 are you using? There are reasons that MinGW provides its own wrappers for the `printf()` family of functions (mostly to do with supporting C99 features that `msvcrt.dll` doesn't always support). I'm just surprised that it only seems to us the wrapper for you based on including `stdio.h` vs. `cstdio`. See http://stackoverflow.com/a/10680635/12711 for some details on why MinGW has wrappers for the `printf()` family. – Michael Burr Jun 23 '13 at 18:33
  • 2
    BIG WARNING: the answer explained how the two versions get selected, but doesn't get at the root of the performance problem, which is actually explained [here](http://stackoverflow.com/questions/13970675/massive-fprintf-speed-difference-without-std-c99) (long story short: the GNU version has a non-cached `getenv` lookup for a compatibility option with VC++, which kills its performance). – Matteo Italia Jan 27 '16 at 12:56

1 Answers1

7

libstdc++ does define __USE_MINGW_ANSI_STDIO during build (config/os/mingw32-w64/os_defines.h), which will turn on the mingw sprintf wrapper. As @Michael Burr pointed out, these wrappers exist for C99/GNU99 compatibility.

Your test does not define __USE_MINGW_ANSI_STDIO, hence you'll not get the wrapper with stdio.h. But since it was defined when building libstdc++, you'll get it with cstdio. If you however define it yourself before including stdio.h, you will get the wrapper again.

So you do get in fact different implementations, and cstdio std::sprintf is not necessarily the same as stdio.h sprintf, at least not when it comes to mingw.

Here is a test. First the source:

#ifdef USE_STDIO
  #include <stdio.h>
#else
  #include <cstdio>
  using std::sprintf;
#endif

int main () {
  int i;
  for (i = 0; i < 500000; i++){
    char x[100];
    sprintf(x, "x%dx%dx", i, i<<2);
  }
}

Results:

$ g++ -o test_cstdio.exe test.cc
$ g++ -o test_stdio.exe -DUSE_STDIO test.cc
$ g++ -o test_stdio_wrap.exe -DUSE_STDIO -D__USE_MINGW_ANSI_STDIO test.cc

$ for x in test_*.exe; do ( echo $x; objdump -d $x | grep sprintf; echo ); done
test_cstdio.exe
  40154a:   e8 41 64 00 00          callq  407990 <_Z7sprintfPcPKcz>
0000000000402c40 <__mingw_vsprintf>:
0000000000407990 <_Z7sprintfPcPKcz>:
  4079c8:   e8 73 b2 ff ff          callq  402c40 <__mingw_vsprintf>

test_stdio.exe
  40154a:   e8 71 62 00 00          callq  4077c0 <sprintf>
00000000004077c0 <sprintf>:
  4077c0:   ff 25 6e 6c 00 00       jmpq   *0x6c6e(%rip)        # 40e434 <__imp_sprintf>

test_stdio_wrap.exe
  40154a:   e8 41 64 00 00          callq  407990 <_Z7sprintfPcPKcz>
0000000000402c40 <__mingw_vsprintf>:
0000000000407990 <_Z7sprintfPcPKcz>:
  4079c8:   e8 73 b2 ff ff          callq  402c40 <__mingw_vsprintf>
nmaier
  • 32,336
  • 5
  • 63
  • 78