2

[Update] 2016.07.02

main.c

#include <stdio.h>
#include <string.h>

size_t strlen(const char *str) {
    printf("%s\n", str);
    return 99;
}
int main() {
    const char *str = "AAA";
    size_t a = strlen(str);
    strlen(str);
    size_t b = strlen("BBB");
    return 0;
}

The expected output is

AAA
AAA
BBB

Compiled with gcc -O0 -o main main.c, the actual output is:

AAA

Compiled with gcc -O3 -o main main.c, the actual output is:

AAA
AAA

The corresponding asm code with flag -O0

000000000040057c <main>:
40057c: 55                      push   %rbp
40057d: 48 89 e5                mov    %rsp,%rbp
400580: 48 83 ec 20             sub    $0x20,%rsp
400584: 48 c7 45 e8 34 06 40    movq   $0x400634,-0x18(%rbp)
40058b: 00 
40058c: 48 8b 45 e8             mov    -0x18(%rbp),%rax
400590: 48 89 c7                mov    %rax,%rdi
400593: e8 c5 ff ff ff          callq  40055d <strlen>
400598: 48 89 45 f0             mov    %rax,-0x10(%rbp)
40059c: 48 c7 45 f8 03 00 00    movq   $0x3,-0x8(%rbp)
4005a3: 00 
4005a4: b8 00 00 00 00          mov    $0x0,%eax
4005a9: c9                      leaveq 
4005aa: c3                      retq   
4005ab: 0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

and with -O3 :

0000000000400470 <main>:
400470: 48 83 ec 08             sub    $0x8,%rsp
400474: bf 24 06 40 00          mov    $0x400624,%edi
400479: e8 c2 ff ff ff          callq  400440 <puts@plt>
40047e: bf 24 06 40 00          mov    $0x400624,%edi
400483: e8 b8 ff ff ff          callq  400440 <puts@plt>
400488: 31 c0                   xor    %eax,%eax
40048a: 48 83 c4 08             add    $0x8,%rsp
40048e: c3                      retq   

With the flag -O0, why do the second and third calls to strlen not call the user-defined strlen?

And with -O3, why is the third call to strlen optimized out?

mingpepe
  • Probably because the compiler optimized out the redundant calls, but that's a bit strange considering it has side-effects in your implementation. Not sure how far it'll go with well-known functions like `strlen()`. Single-step the code in a debugger. – unwind Jul 01 '16 at 09:40
  • Assuming this is gcc, try compiling it with `-fno-optimize-strlen` – tofro Jul 01 '16 at 09:44
  • Change the name of the function to `my_strlen` and see what happens. And remove the `extern` keyword in the function implementation. – LPs Jul 01 '16 at 09:50
  • @tofro The code is compiled by gcc 4.8.4 and with the -fno-optimize-strlen flag, the result is still the same. – mingpepe Jul 01 '16 at 10:05
  • How does this program even pass the linker, with multiple definitions of strlen present? – Lundin Jul 01 '16 at 11:25
  • @Lundin The linker doesn't look any further in libraries as soon as it has found a symbol that resolves a dependency. Only specified *object* files are forced into the binary. That is, the linker doesn't even *see* multiple definitions. That is UNIX "standard" behavior. – tofro Jul 01 '16 at 11:40

2 Answers

1

GCC recognizes strlen() as a builtin and replaces calls to it with its own builtin version. Hence, your version of strlen() isn't called.
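
A minimal sketch of what the builtin amounts to, assuming GCC or Clang: `__builtin_strlen` is a compiler extension, and `literal_length` is a hypothetical name used only for illustration. For a string literal the compiler can evaluate the length at compile time, which is consistent with the `movq $0x3` store in the question's -O0 listing, where no call is emitted for strlen("BBB").

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * __builtin_strlen is a GCC/Clang extension. With a string literal as the
 * argument the compiler can fold it to a constant, so no call to any
 * strlen (library or user-defined) has to be emitted. */
size_t literal_length(void) {
    return __builtin_strlen("BBB");
}
```

Whether a given call is actually folded depends on the compiler version and flags, so treat this as an illustration rather than a guarantee.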

I compiled your code with -fno-builtin, which disables the builtins, and I get the "Log in strlen" output twice, from the 1st and 3rd strlen() calls. The second strlen() call probably gets optimized away because its return value isn't used. This probably happens before GCC recognizes that it can't use the builtin for strlen(). Otherwise, it couldn't optimize away the 2nd strlen() call, because that call has the side effect of printing the message.

Similarly, if I store the result of the 2nd strlen() call with something like:

size_t b = strlen(str); // call 2

then I see "Log in strlen()" printed 3 times.

If I compile with -O3 (either with or without -fno-builtin), I get no output at all because, as said before, GCC optimizes away the whole program.

This is not an issue with GCC, because redefining a standard function is technically undefined behaviour, and hence GCC is at liberty to handle it in any way.
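
One way to stay clear of the undefined behaviour is to give the logging function a non-reserved name, as suggested in the comments on the question. The following is only a sketch under that assumption: `my_strlen`, `my_strlen_calls` and `count_logged_calls` are hypothetical names that do not appear in the question.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counter, used only to make the number of calls observable. */
static int my_strlen_calls = 0;

/* Logging length function with a non-reserved name. The compiler has no
 * builtin for this name, and each call prints, so calls cannot be dropped. */
size_t my_strlen(const char *str) {
    ++my_strlen_calls;
    printf("%s\n", str);
    size_t n = 0;
    while (str[n] != '\0') {
        ++n;
    }
    return n;
}

/* Mirrors the three calls made in the question's main(). */
int count_logged_calls(void) {
    const char *str = "AAA";
    my_strlen_calls = 0;
    size_t a = my_strlen(str);
    my_strlen(str);               /* result unused, but the call still prints */
    size_t b = my_strlen("BBB");
    (void)a;
    (void)b;
    return my_strlen_calls;
}
```

Calling `count_logged_calls()` from `main` should print AAA, AAA and BBB and return 3 regardless of the optimization level, since no builtin stands in for `my_strlen`.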

P.P
  • I want to call the user-defined strlen, but found that not every call to strlen reaches my function. My question is not how to achieve that goal, but how this works. – mingpepe Jul 02 '16 at 01:57
  • And "redefining a standard function is technically undefined behaviour": in what document did you see this information? – mingpepe Jul 02 '16 at 01:59
  • `This probably happens before GCC recognizes that it can't use the builtin` Sounds like the logical assumption here. Funny that it's the second time in as many days that a defect is found in different compilers involving optimizations around string builtins/intrinsics. The other (unrelated) one about `strcpy` and MSVC is at [ms vc++ compiler optimizating away erroneous code](http://stackoverflow.com/questions/38111158/ms-vc-compiler-optimizating-away-erroneous-code). – dxiv Jul 02 '16 at 02:15
  • @mingpepe C standard says so. See: http://stackoverflow.com/questions/14770384/is-it-undefined-behavior-to-redefine-a-standard-name – P.P Jul 02 '16 at 08:20
0

The builtin function is probably enabled by default. If you wish to call the user-defined function, change the return type of the strlen function. So,

in strlen_adder.h

int strlen(const char*);

in strlen_adder.c

int strlen(const char *s)

Also remove #include <string.h> from the main.c file.

msc