
It is said that we can write multiple declarations but only one definition. Now suppose I implement my own strcpy function with the same prototype:

char * strcpy ( char * destination, const char * source );

Then am I not redefining the existing library function? Shouldn't this display an error? Or is it somehow related to the fact that the library functions are provided in object code form?

EDIT: Running the following code on my machine prints "Segmentation fault (core dumped)". I am working on Linux and compiled without any flags.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *strcpy(char *destination, const char *source);

int main(){
    char *s = strcpy("a", "b");
    printf("\nThe function ran successfully\n");
    return 0;
}

char *strcpy(char *destination, const char *source){
    printf("in duplicate function strcpy");
    return "a";
}

Please note that I am not trying to implement the function. I am just trying to redefine a function and asking for the consequences.

EDIT 2: After applying the suggested changes by Mats, the program no longer gives a segmentation fault although I am still redefining the function.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *strcpy(char *destination, const char *source);

int main(){
    char *s = strcpy("a", "b");
    printf("\nThe function ran successfully\n");
    return 0;
}

char *strcpy(char *destination, const char *source){
    printf("in duplicate function strcpy");
    return "a";
}
Nikunj Banka

7 Answers


C11 (ISO/IEC 9899:201x) §7.1.3 Reserved Identifiers

— Each macro name in any of the following subclauses (including the future library directions) is reserved for use as specified if any of its associated headers is included; unless explicitly stated otherwise.

— All identifiers with external linkage in any of the following subclauses (including the future library directions) are always reserved for use as identifiers with external linkage.

— Each identifier with file scope listed in any of the following subclauses (including the future library directions) is reserved for use as a macro name and as an identifier with file scope in the same name space if any of its associated headers is included.

If the program declares or defines an identifier in a context in which it is reserved, or defines a reserved identifier as a macro name, the behavior is undefined.

Note that this doesn't mean you can't do it: as this post shows, it can be done with gcc and glibc.

glibc §1.3.3 Reserved Names provides a clearer reason:

The names of all library types, macros, variables and functions that come from the ISO C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. There are several reasons for these restrictions:

Other people reading your code could get very confused if you were using a function named exit to do something completely different from what the standard exit function does, for example. Preventing this situation helps to make your programs easier to understand and contributes to modularity and maintainability.

It avoids the possibility of a user accidentally redefining a library function that is called by other library functions. If redefinition were allowed, those other functions would not work properly.

It allows the compiler to do whatever special optimizations it pleases on calls to these functions, without the possibility that they may have been redefined by the user. Some library facilities, such as those for dealing with variadic arguments (see Variadic Functions) and non-local exits (see Non-Local Exits), actually require a considerable amount of cooperation on the part of the C compiler, and with respect to the implementation, it might be easier for the compiler to treat these as built-in parts of the language.

Yu Hao
  • so what we are dealing here with is "undefined behavio[u]r", I guess – Walter Tross Jul 13 '13 at 15:58
  • @WalterTross From what I understand, yes. – Yu Hao Jul 13 '13 at 16:01
  • your answer _together_ with the one by @MatsPetersson make the right answer. If I were the OP, I wouldn't know which one to accept. – Walter Tross Jul 13 '13 at 16:05
  • "Undefined behavior" has to be the ugliest phrase in programming lingo, much the same way an epidemiologist doesn't want to think about the phrase "virulent with airborne vector"... – Shadur Jul 13 '13 at 23:18
  • Not particularly relevant to the question, but nevertheless interesting, is that even without the question of overriding the name of a standard function, *all* identifiers starting with "str" are reserved, so something like `strcpy_custom()` would still be bad. – Crowman Jul 15 '13 at 01:57
  • All identifiers starting with "str" are reserved, is there a reference for this? P.S: my answer seems not so relevant because this question has been updated dramatically from its initial state. At first the OP didn't give any code and just asked whether using the same name with library function is OK. – Yu Hao Jul 15 '13 at 02:06
  • Section 7.31.12 in C11: "Function names that begin with `str`, `mem`, or `wcs` and a lowercase letter may be added to the declarations in the `<string.h>` header", and likewise for `<stdlib.h>`. So, actually "str" plus a lowercase letter, rather than "str" followed by anything, as I originally stated. – Crowman Jul 15 '13 at 02:54
  • And Section 7.31 itself is what makes them actually reserved, at least if they're external: "All external names described below are reserved no matter what headers are included by the program", as well as your quote from 7.1.3: "(including the future library directions)". – Crowman Jul 15 '13 at 02:57

That's almost certainly because you are passing in a destination that is a "string literal".

char *s = strcpy("a", "b");

On top of that, the compiler knows it can expand strcpy inline, so your function never gets called.

You are trying to copy "b" over the string literal "a", and that won't work.

Declare char a[2]; and call strcpy(a, "b"); and it will run. It probably still won't call your strcpy function, because the compiler inlines small strcpy calls even when optimisation is not enabled.
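To make the difference between a string literal and a writable array concrete, here is a minimal sketch. The names copy_text and demo_writable_destination are hypothetical; the copy routine deliberately avoids the reserved strcpy name so the compiler cannot substitute its built-in version.

```c
#include <stddef.h>

/* Hypothetical helper; the name avoids the reserved "str" + lowercase prefix. */
char *copy_text(char *dest, const char *src)
{
    size_t i = 0;

    /* Copy each character, including the terminating '\0'. */
    while ((dest[i] = src[i]) != '\0')
        i++;
    return dest;
}

/* Copies "b" over a writable array and returns its first character. */
char demo_writable_destination(void)
{
    char a[2] = "a";    /* an array is writable storage; a literal is not */

    copy_text(a, "b");
    return a[0];
}
```

Passing the literal "a" as the destination instead would be undefined behaviour and typically crashes on Linux, because string literals usually live in read-only memory.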

Mats Petersson
  • I confirm that plain gcc calls my `strcpy` only if the second argument is a variable, _not_ if it is a string literal – Walter Tross Jul 13 '13 at 16:09

Putting the matter of trying to modify non-modifiable memory aside, keep in mind that you are formally not allowed to redefine standard library functions.

However, in some implementations you might notice that providing another definition for a standard library function does not trigger the usual "multiple definition" error. This happens because in such implementations standard library functions are defined as so-called "weak symbols". For example, the GCC standard library is known for that.

The direct consequence is that when you define your own "version" of a standard library function with external linkage, your definition overrides the "weak" standard definition for the entire program. Not only does your code now call your version of the function, but all calls from pre-compiled [third-party] libraries are dispatched to your definition as well. It is intended as a feature, but you have to be aware of it to avoid "using" this feature inadvertently.

You can read about it here, for one example

How to replace C standard library function ?

This feature of the implementation doesn't violate the language specification, since it operates within the uncharted area of undefined behavior, which is not governed by any standard requirements.

Of course, the calls that use intrinsic/inline implementation of some standard library function will not be affected by the redefinition.
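As a rough illustration of the weak-symbol mechanism, here is a sketch that marks a definition as weak with the GCC/Clang attribute. This is a compiler extension, not ISO C, and the function names greeting and greeting_is_default are hypothetical. It only shows the general idea, not how any particular C library is actually built.

```c
/* Hypothetical library-side default. The weak attribute is a GCC/Clang
   extension: a non-weak definition of greeting() in another object file
   would replace this one at link time without a "multiple definition"
   error. */
__attribute__((weak)) const char *greeting(void)
{
    return "default greeting";
}

/* A caller that does not know which definition the linker selected. */
int greeting_is_default(void)
{
    return greeting()[0] == 'd';
}
```

When this file is linked on its own, greeting_is_default() reports that the weak default is in use. Adding a second source file with an ordinary definition of greeting() should redirect every caller to that definition instead.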

AnT stands with Russia

Your question is misleading.

The problem that you see has nothing to do with the re-implementation of a library function.

You are just trying to write to non-writable memory, namely the memory where the string literal "a" is stored.

To put it simply, the following program gives a segmentation fault on my machine (compiled with gcc 4.7.3, no flags):

#include <string.h>

int main(int argc, const char *argv[])
{
    strcpy("a", "b");
    return 0;
}

But then, why the segmentation fault if you are calling a version of strcpy (yours) that doesn't write the non-writable memory? Simply because your function is not being called.

If you compile your code with the -S flag and look at the assembly code the compiler generates for it, there is no call to strcpy: the compiler has "inlined" that call, and the only relevant call visible in main is a call to puts.

.file   "test.c"
    .section    .rodata
.LC0:
    .string "a"
    .align 8
.LC1:
    .string "\nThe function ran successfully"
    .text
    .globl  main
    .type   main, @function
main:
.LFB2:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movw    $98, .LC0(%rip)
    movq    $.LC0, -8(%rbp)
    movl    $.LC1, %edi
    call    puts
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE2:
    .size   main, .-main
    .section    .rodata
.LC2:
    .string "in duplicate function strcpy"
    .text
    .globl  strcpy
    .type   strcpy, @function
strcpy:
.LFB3:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movq    %rdi, -8(%rbp)
    movq    %rsi, -16(%rbp)
    movl    $.LC2, %edi
    movl    $0, %eax
    call    printf
    movl    $.LC0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE3:
    .size   strcpy, .-strcpy
    .ident  "GCC: (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3"

I think Yu Hao's answer has a great explanation for this, in the quote from the glibc manual:

The names of all library types, macros, variables and functions that come from the ISO C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. There are several reasons for these restrictions:

[...]

It allows the compiler to do whatever special optimizations it pleases on calls to these functions, without the possibility that they may have been redefined by the user.

Vincenzo Pii

Your example can work this way, using strdup:

#include <stdio.h>
#include <string.h>

char *strcpy(char *destination, const char *source);

int main(){
    char *s = strcpy(strdup("a"), strdup("b"));
    printf("\nThe function ran successfully\n");
    return 0;
}

char *strcpy(char *destination, const char *source){
    printf("in duplicate function strcpy");
    return strdup("a");
}

Output:

  in duplicate function strcpy
  The function ran successfully

The way to interpret this rule is that you cannot have multiple definitions of a function end up in the final linked object (the executable). So, if all the objects included in the link have only one definition of a function, then you are good. Keeping this in mind, consider the following scenarios.

  1. Let's say you redefine a function somefunction() that is defined in some library. Your function is in main.c (main.o), and in the library the function is in an object named someobject.o. Remember that in the final link, the linker only searches the libraries for unresolved symbols. Because somefunction() is already resolved from main.o, the linker does not even look for it in the libraries and does not pull in someobject.o. The final link has only one definition of the function, and things are fine.
  2. Now imagine that there is another symbol anotherfunction() defined in someobject.o that you also happen to call. The linker will try to resolve anotherfunction() from someobject.o, and pull it in from the library, and it will become a part of the final link. Now you have two definitions of somefunction() in the final link - one from main.o and another from someobject.o, and the linker will throw an error.
Ziffusion

I use this one frequently:

void my_strcpy(char *dest, const char *src)
{
    int i;

    i = 0;
    while (src[i])
    {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';
}

You can also get a strncpy-like function just by modifying one line:

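/* Copies at most n characters and then writes a terminator,
   so dest needs room for n + 1 bytes. */
void my_strncpy(char *dest, const char *src, int n)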
{
    int i;

    i = 0;
    while (src[i] && i < n)
    {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';
}
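If the destination size is known, a variant that takes the buffer size rather than a character count avoids the n + 1 pitfall. The following is only a sketch under that assumption, and copy_bounded is a hypothetical name (chosen to stay clear of the reserved str prefix), not a standard function.

```c
#include <stddef.h>

/* Hypothetical helper: copies at most dest_size - 1 characters from src
   and always terminates dest. Returns the number of characters copied,
   not counting the terminator. */
size_t copy_bounded(char *dest, size_t dest_size, const char *src)
{
    size_t i = 0;

    if (dest_size == 0)
        return 0;               /* no room even for the terminator */

    while (i + 1 < dest_size && src[i] != '\0')
    {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';
    return i;
}
```

With a four-byte buffer, copy_bounded(buf, sizeof buf, "hello") stores "hel" and returns 3, so truncation can be detected by comparing the return value with the source length.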
Yu Hao
Saxtheowl