Is that <cstdio>
header in C++ contains just the same functions as <stdio.h>
but put in std
namespace?
I experienced strange efficiency problems in my program compiled with mingw-w64, which is more than ten times slower then on linux. After some test I found that the problem is in sprintf
.
Then I did the following test:
#include <stdio.h>
// #include <cstdio>
// using std::sprintf;
int main () {
int i;
for (i = 0; i < 500000; i++){
char x[100];
sprintf(x, "x%dx%dx", i, i<<2);
}
}
When compiled with <stdio.h>
it is 15 times faster then using <cstdio>
. Here is the timing:
$ time ./stdio
real 0m0.557s
user 0m0.046s
sys 0m0.046s
$ time ./cstdio
real 0m7.465s
user 0m0.031s
sys 0m0.077s
$ g++ --version
g++.exe (rubenvb-4.8-stdthread) 4.8.1 20130324 (prerelease)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
UPDATE 1:
I further timed with different mingw-w64 build (rubenvb, drangon, and mingw-build), and find that all 32bit version using <cstdio>
timed 4.x seconds and 64bit versions 7.x~8.x seconds. And all versions using <stdio.h>
timed around 0.4~0.6 second.
UPDATE 2:
I disassembled the main function in gdb and find only one line differs: the <stdio.h>
version calls callq 0x4077c0 <sprintf>
but the <cstdio>
version calls callq 0x407990 <_Z7sprintfPcPKcz>
.
sprintf
contains:
0x00000000004077c0 <+0>: jmpq *0x7c6e(%rip) # 0x40f434 <__imp_sprintf>
0x00000000004077c6 <+6>: nop
0x00000000004077c7 <+7>: nop
Following __imp_sprintf
I reached the sprinf
inside msvcrt.dll
.
_Z7sprintfPcPKcz
contains some mingw codes:
0x0000000000407990 <+0>: push %rbp
0x0000000000407991 <+1>: push %rbx
0x0000000000407992 <+2>: sub $0x38,%rsp
0x0000000000407996 <+6>: lea 0x80(%rsp),%rbp
0x000000000040799e <+14>: mov %rcx,-0x30(%rbp)
0x00000000004079a2 <+18>: mov %r8,-0x20(%rbp)
0x00000000004079a6 <+22>: mov %r9,-0x18(%rbp)
0x00000000004079aa <+26>: mov %rdx,-0x28(%rbp)
0x00000000004079ae <+30>: lea -0x20(%rbp),%rax
0x00000000004079b2 <+34>: mov %rax,-0x58(%rbp)
0x00000000004079b6 <+38>: mov -0x58(%rbp),%rdx
0x00000000004079ba <+42>: mov -0x28(%rbp),%rax
0x00000000004079be <+46>: mov %rdx,%r8
0x00000000004079c1 <+49>: mov %rax,%rdx
0x00000000004079c4 <+52>: mov -0x30(%rbp),%rcx
0x00000000004079c8 <+56>: callq 0x402c40 <__mingw_vsprintf>
0x00000000004079cd <+61>: mov %eax,%ebx
0x00000000004079cf <+63>: mov %ebx,%eax
0x00000000004079d1 <+65>: add $0x38,%rsp
0x00000000004079d5 <+69>: pop %rbx
0x00000000004079d6 <+70>: pop %rbp
Why does cstdio
use a different (and much slower) function?