2

I was learning the basics of C programming, and I wanted to try out a few lines of code with strings.

This is my code:

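#include <stdio.h>
#include <string.h>

int main(void){
   char a[] = "abc";
   strcpy(a,"pqrst");   /* the call in question: copies 6 bytes into a */
   printf("%s; %zu",a, sizeof(a));   /* %zu is the conversion for size_t */
}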

I expected the code to output size=6 (p, q, r, s, t and '\0'), but instead, it still prints size=4. How does this work?

Joel
  • 2
    Your `strcpy` call has a buffer overrun (the source is two bytes longer than the destination) => UB => everything's over. – Deduplicator Aug 07 '14 at 21:10
  • Use strlen to determine the length of a string. sizeof is a compile-time value: it is simply replaced by the amount of memory "allocated" for the variable a. And strcpy causes a buffer overrun, of course. – Vadim Kalinsky Aug 07 '14 at 21:13
  • It is size 4 because that is its real size: the 3 characters you put into it plus the null character '\0'. You can't resize an array; it has a fixed length. –  Aug 07 '14 at 21:16
  • Hey, thank you for your reply. But where is it printing the other bytes from if a is only 4 bytes long? In the output, I got pqrst; 4. How is this possible? How are the other two characters accessed? – user3920047 Aug 07 '14 at 21:17
  • 1
    @user3920047: That is part of the wonderful world of __undefined behavior__. One possible outcome of undefined behavior is the behavior that you expect. But it can also change wildly due to relatively unrelated code elsewhere in your program. In this case, you're overwriting what would be other variables on the stack. – Bill Lynch Aug 07 '14 at 21:18
  • "pqrs" was written into the array; "t" and the terminating '\0' landed beyond the array's legal space, invading other memory. –  Aug 07 '14 at 21:18
  • It prints the array until it finds the null character '\0', which normally goes at the end of the string. –  Aug 07 '14 at 21:20
  • @blade: There's no reasoning with UB, I'm sad to report. You who go there, let go of all hope. – Deduplicator Aug 07 '14 at 21:26
  • @sharth OK, so basically this time I got lucky and didn't get an error for accessing an out-of-bounds array index. Thanks :) – user3920047 Aug 07 '14 at 21:34

4 Answers

4

sizeof is computed at compile time, based on the declaration of a, which has 4 characters (3 + 1 null terminator). Note that the size of an array and the length of the string stored in it aren't the same thing.

Moreover, the copy has overflowed the buffer. You have to create a large enough array to hold the string you want to copy over.

Ben
3

sizeof(a) is evaluated at compile-time. The type of a, partially determined from the char a[] part of the declaration and partially from the "abc" initializer, is “array of 4 chars”, therefore sizeof(a) evaluates to 4. The values of the elements of a have no influence on the result.

Incidentally, the strcpy call in your program causes a buffer overflow. Extra characters are written somewhere in memory and may cause unpredictable behavior.

If you copied the string "z" to a with strcpy(a, "z");, there would be no undefined behavior: strlen(a) would then evaluate to 1, but sizeof(a) would still be 4.

Pascal Cuoq
  • Yea because strlen measures till the \0 character. But, why didn't C return error of attempting to access array index that was out of bounds? Thank You. – user3920047 Aug 07 '14 at 21:26
  • 2
    in C there's no array bound checking. – macfij Aug 07 '14 at 21:27
  • 1
    @macfij: At least implementations holding your hands that way are really few and far between (It's not forbidden, and there are some, mostly to test for such errors). – Deduplicator Aug 07 '14 at 21:28
  • 1
    @user3920047 First, the C compiler does not store extra information about `a` that would allow its size to be known at run time. `sizeof(a)` has been transformed into `4` at compile-time, but when the program is running there is no link between `a` and `4`. Second, the compiler does not generate extra instructions to check that reads and writes from `a` are in-bounds. Most C compilers generate code that is space- and time-efficient by avoiding the storage of such metadata and the checks that would detect the misuse of `a`. – Pascal Cuoq Aug 07 '14 at 21:32
1

Your strcpy call has a buffer overrun (the source is two bytes longer than the destination), leading to undefined behavior (UB).

Invoking UB means there's nothing left to reason about, on any execution path invoking it (that includes all paths here), even before you get to it.

Once the UB is fixed: sizeof is evaluated at compile-time for everything except VLAs, and gives the size of its operand's type. Here that is array of 4 char (the 3 elements of "abc" + 1 implicit terminator "\0").

Community
Deduplicator
1

This line

char a[] = "abc";

creates space on the stack for a string of 4 characters. It's the same as doing:

char a[4] = "abc";

When you do:

strcpy(a, "pqrst");

It basically does:

int len = strlen("pqrst") + 1;

for (int i=0; i<len; ++i)
    a[i] = "pqrst"[i];

Clearly, that code will write past the bounds of the a array.


Basically though, it sounds like you're expecting C to do extra work for you. That's the opposite of what C will do.

Bill Lynch