3

Problem: Although I declared two char arrays whose contents are the same, the outputs are different.

#include <stdio.h>

int main(void)
{
    /* Initialization of two different array that We deal with */

    char arr1[10]={'0','1','2','3','4','5','6','7','8','9'};
    char arr2[10]="0123456789";

    /* Initialization End */

    for(int i = 0 ; i < 11 ; ++i)
    {
        printf("arr1[%d] is %c \t\t",i,arr1[i]);
        printf("arr2[%d] is %c\n",i,arr2[i]);

        if(arr1[i]=='\0')
            printf("%d . character is \\0 of arr1 \n",i);

        if(arr2[i]=='\0')
            printf("%d . character is \\0 of arr2 \n",i);
    }

    return 0;
}

Expectation: I expected both if statements to give the same result for any value of 'i'.

Output: This is the output I got.

arr1[0] is 0        arr2[0] is 0
arr1[1] is 1        arr2[1] is 1
arr1[2] is 2        arr2[2] is 2
arr1[3] is 3        arr2[3] is 3
arr1[4] is 4        arr2[4] is 4
arr1[5] is 5        arr2[5] is 5
arr1[6] is 6        arr2[6] is 6
arr1[7] is 7        arr2[7] is 7
arr1[8] is 8        arr2[8] is 8
arr1[9] is 9        arr2[9] is 9
arr1[10] is 0       arr2[10] is 
10 . character is \0 of arr2 
nevzatseferoglu
  • 9
    They aren't the same. `"0123456789"` automatically includes the null-terminator, but your manually constructed string doesn't. – Blaze Feb 04 '19 at 10:46
  • 1
    You're off by one. – Sourav Ghosh Feb 04 '19 at 10:47
  • 3
    `array[10]` is the 11th element in the array. – KamilCuk Feb 04 '19 at 10:47
  • But the contents are the same; why is the string that I declared manually different? – nevzatseferoglu Feb 04 '19 at 10:48
  • 6
    @SeptemberSKY Contents are NOT the same :) Listen to @Blaze - the string literal has an automatically-appended 0-terminator. E.g. the 11th element of `arr2` at idx 10 is 0x00, 0, '\0'. You are wrongly declaring `arr2` to hold 10 characters, but your initializer-string contains 11 (including the implicit zero-terminator) – Morten Jensen Feb 04 '19 at 10:51
  • 3
    In `arr1[10]` you are invoking Undefined Behaviour (reading beyond the array size). Probably the `'0'` you are reading is the first element in the second array `arr2`, which is likely being stored just after `arr1` – alx - recommends codidact Feb 04 '19 at 10:54
  • 2
    In `arr2[10]` it is the same Undefined Behaviour, and you are reading junk. – alx - recommends codidact Feb 04 '19 at 10:59
  • @MortenJensen Do you say that the terminating `\0` will be copied during initialization of that array? I would expect it to be chopped off. This makes the content the same for both arrays. – Gerhardh Feb 04 '19 at 11:20
  • 3
    @Blaze, MortenJensen : the contents of the two arrays are exactly the same. It's well defined to initialize an array from a string literal that is longer than fits in it - the extraneous characters from the string literal are just ignored (in this case the null terminator). – Sander De Dycker Feb 04 '19 at 11:23
  • 4
    @MortenJensen Yes the _string literal_ contains a null terminator but the _array_ `char[10]` has no room to store it. Making the examples equivalent. Sander is correct. – Lundin Feb 04 '19 at 11:42
  • But why does `arr1[10]` print a `0` when `arr2[10]` doesn't? As @CacahueteFrito said, the `0` here is the first element of `arr2`. Try printing `arr1[11]` using this line: `printf("%c\n", *(arr1+11));` it will probably print `1`, which is the second element in `arr2`. This suggests that `arr2` is stored right after `arr1` in memory. – Gamal Othman Feb 04 '19 at 12:00
  • An important detail: in C, the valid index range of an array is 0 to (number of elements in the array - 1), so index 10 is beyond the end of the array. Accessing the array at index 10 results in undefined behavior. – user3629249 Feb 05 '19 at 03:30

3 Answers

6

Both cases invoke undefined behavior by accessing the array out of bounds. You cannot access index 10 of an array with items allocated from index 0 to 9. Therefore you need to change the loop to i<10 or anything might happen. It just happened to be different values printed - because you have no guarantees of what will be printed for the byte at index 10.

In both examples there is no null terminator, so they are equivalent. This is due to a subtle, weird rule in the C language (C17 6.7.9/14, emphasis mine):

An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

Normally, when we try to store too many initializers in an array, we get a compiler error. But not in this very specific case with a string literal initializer, which is a "language bug" of sorts. Change to char arr2[9]="0123456789"; and the compiler must issue a diagnostic, because the explicit characters no longer fit (some compilers only warn by default in C). Change to char arr2[11]="0123456789"; and it will work just fine, even when iterating over 11 elements.

Lundin
  • 1
    @Victor This is actually almost a FAQ. I too learned this from SO, just 4 years ago: https://stackoverflow.com/questions/31296727/inconsistent-gcc-diagnostic-for-string-initialization – Lundin Feb 04 '19 at 11:50
  • 1
    Re “But not in this very specific case with a string literal initializer,…”: The rule prohibiting too many initializers does apply to string literals. The 6.7.9 14 text quoted in the answer does not tell us that a string literal is permitted to break this rule or that the rule does not apply to them. Rather, it tells us that a string literal provides its explicit characters as initializers and provides its terminating null character if it fits. If it does not fit, it is not considered to be an initializer, and the rule in 6.7.9 2 is satisfied (if the explicit characters fit). – Eric Postpischil Feb 04 '19 at 14:13
3

There are a few small things wrong with your code and the assumptions you seem to make about it.

1. These two declarations are not the same

char arr1[10]={'0','1','2','3','4','5','6','7','8','9'};
char arr2[10]="0123456789";

The second line is equal to this:

char arr2[10]={'0','1','2','3','4','5','6','7','8','9', 0x00};

... which defines an array containing 11 elements. Check out implicit zero-termination for string literals.

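EDIT: I'm getting quite a lot of down-votes for this point specifically. Please see Lundin's comment below, which clarifies the issue: because arr2 has room for only 10 characters, the null terminator is dropped and the two arrays end up with the same contents.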

2. Your for-loop iterates over 11 elements

for(i=0 ; i<11 ;++i)

The loop above goes through i = 0..10, which is 11 elements, but you only wanted to compare the first 10, right?

You could change your loop to only compare the first ten elements [for(i = 0; i < 10; ++i)] and that would make your program work as you expect.

Because of what it seems you are assuming, I would recommend reading up on strings in C, array-indices and undefined behavior.

Morten Jensen
  • 3
    You are right that the string literal takes 11 characters, but it is used to initialize an array holding only 10 bytes. The terminator is lost here. – Gerhardh Feb 04 '19 at 11:19
  • 2
    The first point is not an issue. The contents of `arr1` and `arr2` will be the same (`arr2` only contains the first 10 characters from the string literal, not the null terminator). The real issue is the undefined behavior from point 2. – Sander De Dycker Feb 04 '19 at 11:20
  • My point with 1) is that the assumptions made by the programmer do not seem to match the code. – Morten Jensen Feb 04 '19 at 11:22
  • 2
    but it does : the OP assumes that the contents of the two arrays are the same, and that assumption is correct. – Sander De Dycker Feb 04 '19 at 11:24
  • @SanderDeDycker I disagree. I believe the initialization of arr2, writing 1-byte beyond the array, is undefined behavior. You cannot meaningfully compare a well-defined statement with an undefined statement. IMO even if the programmer "meant to invoke undefined behavior", your point is moot. – Morten Jensen Feb 04 '19 at 11:28
  • 4
    Neither are well-defined and your point 1) is wrong. `char arr2[10]="0123456789";` is equal to `char arr2[10]={'0','1','2','3','4','5','6','7','8','9'}` without null terminator. This because of a "language bug" in C that allows a string literal with a size exactly matching the one specified to drop the null termination. – Lundin Feb 04 '19 at 11:31
  • @Gerhardh I believe trying to initialize 10 bytes of allocated storage with 11 bytes invokes an out-of-bounds array access, which is undefined behavior. It makes no sense to me to discuss what will actually happen if the compiler accepts the code. – Morten Jensen Feb 04 '19 at 11:31
  • 5
    your belief that such initialization is undefined behavior is wrong : "An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array." – Sander De Dycker Feb 04 '19 at 11:43
  • I always assumed that excess elements were just dropped. But now I can't find anything in the standard about these excess elements. Do you have some source for your assumption that out of bounds access is really happening here? – Gerhardh Feb 04 '19 at 11:43
  • OK. Sander found it. At least with regard to the terminating `\0` byte. – Gerhardh Feb 04 '19 at 11:44
  • 1
    @Gerhardh The normal case of array initialization is 6.7.9/2 "No initializer shall attempt to provide a value for an object not contained within the entity being initialized". So if you have too many initializers it is a constraint violation and the code won't compile. But string literals dodge that constraint, unlike char-by-char initializer lists. – Lundin Feb 04 '19 at 11:48
  • 1. false; 2. true – alk Feb 04 '19 at 12:25
  • @Lundin thanks for clarifying and for bringing a reference to the standard :) – Morten Jensen Feb 04 '19 at 12:40
0

When a character array is initialized with a double-quoted string and the array size is not specified, the compiler automatically allocates one extra element for the string terminator '\0'.

Ref

4b0
hrishi007