8

I wanted to copy the elements of one string into another, and hence wrote the following program. Initially, I thought that the for loop should execute until the NULL character has been copied (i.e., including it). But in this code, the for loop terminates as soon as a NULL character is found (i.e., before it is copied), yet it's still able to display the string into which the elements have been copied. How is this possible if there is no NULL character in the first place?

#include<stdio.h>
#include<stdlib.h>

int main()
{
    char temp[100], str[100];
    fgets(str, 100, stdin);
    int i;
    for(i = 0; str[i]!='\0'; i++)
    {
        temp[i] = str[i];
    }
    puts(temp);
    return 0;
}
nalzok
Ranjan Srinivas
  • 3
    `NULL` is a macro with a _null pointer constant_. This is irrelevant here. You mean the ASCII `NUL` or `nul` character with the integer value `0`. – too honest for this site Feb 20 '16 at 13:36
  • Read about `strcpy`. – Pete Becker Feb 20 '16 at 15:13
  • 1
    @Pete Becker I think this is VERY likely to be something like a homework/tutorial problem (*i.e.* intended/designed to make one **aware** of the issues with copying strings). Simply telling someone to read about `strcpy` doesn't help bring about that understanding. – Tersosauros Feb 20 '16 at 15:27
  • 1
    @Tersosauros - beginners often roll their own functions because they have no idea what's in the standard library. – Pete Becker Feb 20 '16 at 15:31
  • @Ranjan Srinivas, when you ran this, was it on a Linux machine (either real or virtual)? [See this answer for more details](http://stackoverflow.com/questions/8029584/why-does-malloc-initialize-the-values-to-0-in-gcc#8029624), but essentially the Linux kernel was zeroing the memory as it was "new" (newly allocated). Basically, your program **happened** to get a block of memory containing '\0', so `puts` found a `NUL`. This behaviour is NOT guaranteed, and (as the answers are saying) is *undefined behaviour*. – Tersosauros Feb 20 '16 at 15:32
  • @Pete Becker True, but we don't know if this is an issue of a lack of knowledge (and needing `strcpy`), or if this is a student who is expected to run into (a) certain problem(s) - *i.e.* to prepare for some assignment or lecture, etc. – Tersosauros Feb 20 '16 at 15:33
  • 1
    @Tersosauros - yes, and that's why my comment was a **comment** and not an **answer**. – Pete Becker Feb 20 '16 at 15:35
  • Could be better to use strdup than strcpy, I think. Also, using a != in a for loop would confuse me slightly. Clearer to use while... – jamesqf Feb 20 '16 at 19:04

3 Answers

7

The int puts(const char *) function, like size_t strlen(const char *), has to scan for the null terminator, and its behavior is undefined when the passed argument has none (see this answer). In your case, the scan inside puts probably found a 0 value right after the copied characters, or 'next to' your array in memory, which made puts appear to behave properly. That need not always be the case, because the behavior is undefined.

maciekjanusz
7

Here is the input and output on my computer:

0
0
絯忐`

Process returned 0 (0x0)   execution time : 1.863 s
Press any key to continue.

See the garbage "絯忐`"? This is undefined behavior. Your program works well because you're (un)lucky.

Again, undefined behaviors don't deserve much discussion.

nalzok
  • 3
    I was going to +1, until I read that last line: **"**_undefined behaviors don't deserve a discussion._**"** While it is true the OP was "(un)lucky" as you put it, I think failing to discuss that **undefined behavior** is *failing to answer the question properly*. – Tersosauros Feb 20 '16 at 15:24
  • 1
    Everything and anything can happen when undefined behaviors take place. As K&R wisely point out, "if you don't know how they are done on various machines, that innocence may help to protect you." So I think it is better not to discuss undefined behaviors. – nalzok Feb 20 '16 at 15:30
3

When you declare char temp[100] without initialising it, the array just holds whatever uninitialised memory it was given, and that memory can contain anything. For example, the following program writes out the initial contents of the array as integers:

#include<stdio.h>
#include<stdlib.h>

int main()
{
    char temp[100];
    int i;
    for(i = 0; i < 100 ; i++)
    {
        fprintf(stdout, "%d ", temp[i]);
    }
    return 0;
}

This prints different output on every run for me, though by some fluke it keeps printing runs of zeros, e.g.:

88 -70 43 81 -1 127 0 0 88 -70 43 81 -1 127 0 0 1 0 0 0 0 0 0 0 112 -70 43 81 -1 127 0 0 0 64 -108 14 1 0 0 0 72 50 -13 110 -1 127 0 0 -128 -70 43 81 -1 127 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 96 -70 43 81

88 90 72 88 -1 127 0 0 88 90 72 88 -1 127 0 0 1 0 0 0 0 0 0 0 112 90 72 88 -1 127 0 0 0 -96 119 7 1 0 0 0 72 18 72 105 -1 127 0 0 -128 90 72 88 -1 127 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 96 90 72 88

88 -6 -79 87 -1 127 0 0 88 -6 -79 87 -1 127 0 0 1 0 0 0 0 0 0 0 112 -6 -79 87 -1 127 0 0 0 0 14 8 1 0 0 0 72 34 57 104 -1 127 0 0 -128 -6 -79 87 -1 127 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 96 -6 -79 87

What's most likely happening is that your non-null-terminated string is being accidentally null-terminated by virtue of the fact that temp[strlen(str)] is, by a fluke, \0.

yaakov
  • 1
    printing those variables is undefined behaviour – Giorgi Moniava Feb 20 '16 at 14:54
  • @Giorgi How so? AFAIK, it's defined that I get 100 bytes, but the values of those bytes are undefined. They should be safe to print, but it's never guaranteed what they are. – yaakov Feb 20 '16 at 14:55
  • 2
    No please read about the notion of undefined behaviour. Reading uninitialized variables is UB. – Giorgi Moniava Feb 20 '16 at 14:56
  • 1
    `char temp[100];` is an array of `char`. Use the format specifier for a `char` to print them: `fprintf(stdout, "%c ", temp[i]);`. `d` format specifier wants to read 4 bytes. `c` format specifier reads only 1 byte. – ryyker Feb 20 '16 at 16:03