Below is a program that copies one string to another. I would expect it to give me a warning or an error, but it works just fine.

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void strcp(const char *source,char *dest)
{
    while(*source!='\0')
    {
        *dest++=*source++;
    }
//  *dest='\0';
}
int main(void)
{
    char source[]="This is a test string";
    int len=strlen(source);
    char *dest = (char *)calloc(len,sizeof(char));
    strcp(source,dest);
    printf("\n The destination now contains ");
    printf("%s",dest);
    return 0;
}

Here I've commented out *dest='\0', so dest does not contain a null character at the end of the string. How, then, is the printf statement in main working fine? I thought all functions that operate on strings rely on the '\0' character to mark the end of the string.

P.S. I got an answer to my first question, but none of the answers addressed the question below. I also found it strange that I could use the pointer dest directly with the %s specifier in printf. I only wrote the last printf in main to check whether it would work, because earlier I was using

char *temp=dest;
while(*temp!='\0')
{
   printf("%c",*temp++);
}

In place of printf("%s",dest)

  • Do not use `calloc()` in this situation: you don't need to initialize all items to `0` because you are going to overwrite them immediately, and using `calloc()` may hide bugs from memory debuggers. Also, [do not cast its return value](http://stackoverflow.com/a/605858/1983495) – Iharob Al Asimi Jun 08 '15 at 12:52
  • Your code invokes Undefined Behavior. It means that anything can happen. It need not necessarily be a crash or a segfault or anything else. And [In C, don't cast the result of `malloc`(and family)](http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc) – Spikatrix Jun 08 '15 at 12:54
  • @CoolGuy Is the undefined behaviour caused by commenting out *dest='\0'? –  Jun 08 '15 at 12:55
  • Yes. But even if you uncomment it, your code still exhibits UB since you write the `\0` into an invalid memory location. See the answers below. – Spikatrix Jun 08 '15 at 12:57
  • Just one more thing: I'm not sure about the precedence in `*dest++=*source++;`. The suffix `++` is evaluated *before* the `*` operator according to http://en.cppreference.com/w/cpp/language/operator_precedence. I'm not sure about it, but it seems you first increment the pointer and then dereference it, which should be the other way around. – David Haim Jun 08 '15 at 13:00
  • @DavidHaim but the _effect_ of `++` is sequenced after the evaluation of the expression, isn't it? – Natasha Dutta Jun 08 '15 at 13:02
  • Well, the expression is parsed as *(dest++)=*(source++), which means that the value at the current location of source is first stored at the current location of dest, then dest is incremented, after that source is incremented, and the process repeats –  Jun 08 '15 at 13:03
  • Please note that it is convention/industry de facto standard in C to write custom copy functions as `void skunkcpy (type* dest, const type* source);` with the destination first and the source second. – Lundin Jun 08 '15 at 13:03
  • @DavidHaim Although ugly and bad practice, copy lines such as `*dest++=*source++;` are common in C, and also well-defined. It is completely equivalent to `*dest = *source; dest++; source++;`. And because the latter form is completely 100% equivalent, one should use the latter so one doesn't confuse the reader. It is good programming practice never to mix the ++ operators together with other operators, because in many other cases, doing so leads to bugs and undefined behavior. – Lundin Jun 08 '15 at 13:06
  • So how does that sit with the page I was referring to? Is it a C++ vs C thing? Because if you have a mixture of two operators on the same variable, the compiler will evaluate the one with higher precedence first, meaning that `*x++` will be evaluated as `*(x++)`, because `++` comes before `*`? I'm asking here because I'm not so sure myself. – David Haim Jun 08 '15 at 13:42
  • here: in C++, I try this :`struct Example{ Example& operator ++ (int){ std::cout << "++\n"; return *this; } void operator * (){ std::cout << "*\n"; } };` for `Example e; *e++ ` - output : ++ (new line) *. – David Haim Jun 08 '15 at 13:49
  • It doesn't matter if some c++ subtlety makes this code wrong, there are others too, and indeed this is common in c you can check some of the linux manual pages which contain examples with similar code (_I just don't remember which one might..._), that's why many people like myself, hate c++. – Iharob Al Asimi Jun 08 '15 at 13:51
  • @DavidHaim: The `*dst++ = *src++` notation is well-defined and idiomatic C. The post-increment is evaluated first; it 'returns' the original value of the pointer and then increments the variable. The unincremented value is then used in the assignment operation. The classic string copy loop is `while (*dst++ = *src++) ;` which ensures that the null terminator is copied too, unlike the code in the question (which also does not allocate enough space). – Jonathan Leffler Jun 08 '15 at 13:57
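
To make the equivalence discussed in the comments above concrete, here is a small sketch. The function names `copy_compact` and `copy_expanded` are made up for illustration and do not appear in the question. Both copy the characters and the terminating `'\0'`, and both assume the caller has provided a destination buffer of at least `strlen(src) + 1` bytes.

```c
/* Compact idiom: each iteration copies one character, advances both
   pointers, and stops once the character just copied was '\0'. */
void copy_compact(char *dst, const char *src)
{
    while ((*dst++ = *src++) != '\0')
        ;
}

/* The same behaviour spelled out one step at a time. */
void copy_expanded(char *dst, const char *src)
{
    for (;;) {
        char c = *src;   /* read through the old value of src */
        *dst = c;        /* write through the old value of dst */
        src++;
        dst++;
        if (c == '\0')   /* the terminator has been copied, so stop */
            break;
    }
}
```

In both versions the dereference uses the pointer value from before the increment, which is why the first character is not skipped.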

2 Answers

Because undefined behavior is undefined?

By omitting the terminating '\0' and then passing the "string" to printf() with an "%s" specifier, your program invokes undefined behavior.

I put "string" in quotes above because without the terminating '\0' it's not a string, it's just a sequence of bytes. Only when you terminate a sequence of bytes with a '\0' does it become a string.

Since you can't predict what is going to happen, one of the things that could happen is that it works.

Your program has more issues, though:

  1. You are allocating less space than required. You need strlen() + 1 bytes because of the '\0' terminator. This is the real cause of undefined behavior in your program.

  2. You are calling calloc() wrong, it should be

    dest = calloc(1, 1 + len);
    

    Note that I don't write sizeof(char) because it's superfluous: it's always 1, as mandated by the C standard.

    In your version, calloc(len, sizeof(char)) allocates len items of size 1, which leaves no room for the terminator; you need len + 1 bytes in total.


Notes and Recommendations:

  1. Do not use calloc() in this situation. You don't need to initialize all items to 0 because you are going to overwrite them immediately, and using calloc() may hide bugs from memory debuggers.

  2. Do not cast void * in C.
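
Putting the points above together, a minimal sketch of the allocation might look like the following. The helper name `copy_string` is hypothetical and not part of the question's code. It uses malloc() instead of calloc(), does not cast the result, and reserves one extra byte for the terminator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of source,
   or NULL if the allocation fails. The caller must free() the result. */
char *copy_string(const char *source)
{
    size_t len = strlen(source);     /* does not count the '\0' */
    char *dest = malloc(len + 1);    /* +1 for the terminator, no cast */

    if (dest == NULL)
        return NULL;

    /* Copy len characters plus the terminating '\0' in one call. */
    memcpy(dest, source, len + 1);
    return dest;
}
```

A caller would check the result for NULL, pass it to printf("%s", ...), and then free() it.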

Iharob Al Asimi
  • Just asking, as the memory is "zeroed" by `calloc()`, is not null-terminating an issue here? – Natasha Dutta Jun 08 '15 at 12:57
  • It's still undefined behavior because there is no space for the `'\0'`, and anyway I think it's good to explain why no one can expect anything from undefined behavior to be defined. – Iharob Al Asimi Jun 08 '15 at 12:59
  • I agree with you in case of less memory allocation, but if used `calloc()` with enough memory to hold the string and the null-terminator properly. I don't see the reason why an explicit null-terminator is required. Can you reply please? – Natasha Dutta Jun 08 '15 at 13:00
  • @NatashaDutta If the `calloc()`ed memory were large enough to hold the string (a valid string in C, which is null terminated), then there would be no UB. Since that is not the case, we see UB here – Gopi Jun 08 '15 at 13:01
  • @Gopi Right. that is what I was trying to confirm. Thank you for clarification. – Natasha Dutta Jun 08 '15 at 13:03
  • While using calloc() is needless here, how does it "hide" bugs from memory debuggers? – P.P Jun 08 '15 at 13:05
  • @Gopi Initializing the buffer twice is inefficient, don't you think? – Iharob Al Asimi Jun 08 '15 at 13:06
  • @BlueMoon If your algorithm has issues and fails to initialize some bytes, and you use `calloc()`, you will never know. For example, [valgrind](http://www.valgrind.org) reports uninitialized bytes at runtime, which could cause undefined behavior if they are uninitialized, or even if they are initialized with `0`, since you probably expect a non-zero value. – Iharob Al Asimi Jun 08 '15 at 13:07
  • Valgrind usually complains about *using* uninitialized variables/bytes. Leaving bytes uninitialized to find possible algorithmic bugs is not the right way. Can you give an example where using calloc() makes valgrind complain while leaving it uninitialized (say, using malloc) doesn't? – P.P Jun 08 '15 at 13:33
  • @BlueMoon I am sorry, but I didn't say that. I said that forcing the initialization can hide bugs. Sometimes you type `<` instead of `<=` just because you were typing fast and didn't notice, then suddenly valgrind complains about uninitialized bytes, so you wonder what's wrong, and when you check you say _oh! stupid me, it's `<=` not `<`_. That's one example; there might be other situations. And you are right that this would not be an algorithmic issue, but perhaps there can be a situation where it's your algorithm that is wrong. – Iharob Al Asimi Jun 08 '15 at 13:48

All strings in C have a special terminating character '\0', that is not counted when using strlen. However, all strings must still have this special termination character, or all string functions will go past the end of the string.

When allocating memory for a string, you need to reserve space for this terminating character and also write it after the last character.

So instead of allocating only len characters, you need to allocate len + 1 characters. Also uncomment the line in your strcp function; it is needed to add the terminator.
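
As a rough sketch of both points, here is the question's strcp with the terminator line restored, next to a small helper that computes how many bytes the destination needs. The helper name `bytes_needed` is an assumption made for this example and is not in the original code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes required to store s as a C string,
   i.e. the characters counted by strlen plus one for the '\0'. */
size_t bytes_needed(const char *s)
{
    return strlen(s) + 1;
}

/* The question's copy function with the terminator written.
   dest must point to at least bytes_needed(source) bytes. */
void strcp(const char *source, char *dest)
{
    while (*source != '\0')
    {
        *dest++ = *source++;
    }
    *dest = '\0';   /* without this, dest is not a valid string */
}
```

For the question's "This is a test string", strlen reports 21, so the destination needs 22 bytes.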

Some programmer dude