1

I have a question about the correct way to initialize C strings. For example, the following code isn't always correct:

char *something;
something = "zzzzzzzzzzzzzzzzzz";

I tested this a little by increasing the number of z's, and the program does crash once the string is about two lines long. So what is the real size limit of this char array? How can I be sure that it is not going to crash? Is this limit implementation-dependent? Is the following code the correct approach that I must always use?

char something[FIXEDSIZE];
strcpy(something, "zzzzzzzzzzzzzzzzzzz");
miku
mjsr

4 Answers

8

As you say, manipulating this string leads to undefined behaviour:

char *something;
something = "zzzzzzzzzzzzzzzzzz";

If you are curious as to why, see "C String literals: Where do they go?".

If you plan to manipulate your string at all (i.e. if you want it to be mutable), you should use this:

char something[] = "skjdghskfjhgfsj";

Otherwise, simply declare your char * as a const char * to indicate that it points to a constant.

With the array form (char something[] = ...), the compiler sizes the array to hold the string exactly, including its terminating '\0'. If the declaration is inside a function, the array lives on the stack, so its size is limited by your stack.

Of course, you will likely want to specify the size anyway, since it is usually useful to know when manipulating the string.
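The difference between the two declarations can be sketched as follows. This is a minimal illustration, not code from the question: the function names mutable_copy_demo and literal_demo are hypothetical, and the strings are the ones used above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical demo: builds a writable array from a literal, modifies it,
   and copies it into out if out is large enough.
   Returns the size of the array in bytes, including the '\0'. */
size_t mutable_copy_demo(char *out, size_t out_size)
{
    /* Array initialised from a literal: the compiler sizes it to fit,
       including the terminating '\0', and the contents are writable. */
    char something[] = "skjdghskfjhgfsj";

    something[0] = 'S';   /* fine: this storage belongs to the array */

    if (out != NULL && out_size >= sizeof something)
        memcpy(out, something, sizeof something);

    return sizeof something;   /* 15 characters plus '\0' */
}

/* Hypothetical demo: a pointer to a string literal. The literal itself is
   read-only, so the pointer is declared const. */
const char *literal_demo(void)
{
    const char *something = "zzzzzzzzzzzzzzzzzz";
    /* something[0] = 'Z';   rejected at compile time because of const */
    return something;   /* still valid after return: literals have static storage duration */
}
```

Note that sizeof only reports the array size where the array itself is in scope; once it is passed to a function it decays to a pointer, which is one reason to carry the size along explicitly.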

Community
Chris Cooper
1

The first example is only incorrect in that char *something should really be const char *something. Otherwise, this:

const char *something = "fooooooooooooooooooooooobar";

...should work, and should not crash.

char something[FIXEDSIZE];

...this one, however, can typically crash with a stack overflow if you, well, overflow the stack, which depends on how big that stack is, how big that array is, where this gets called, etc.

Thanatos
  • Can you explain a little more? What is the reason for the const? – mjsr Jul 20 '10 at 23:59
  • @BC, it might be a stack if the declaration is an automatic variable. Since the OP didn't provide much context, we don't know whether the declarations are at file scope or at some smaller scope within a function. – RBerteig Jul 21 '10 at 00:11
  • 'const' informs the compiler that you promise not to modify the memory pointed to by 'something'. If you make this change, you should get a warning message on the code that is actually *wrong*, which is where you are now crashing. – zwol Jul 21 '10 at 00:13
  • @voodoomsr, `const char *` is the type to use for a constant string such as `"abc"`. It represents a promise that the characters pointed at by `something` are immutable. It allows a compile-time check, so that code like `something[3] = 'd'` (which probably should have been `something[3] == 'd'`) fails on the grounds that you can't do that because you said so. – RBerteig Jul 21 '10 at 00:14
  • In summary, both ways are always correct. If I know that I'm not going to modify the string, I should use the const keyword for compile-time checking and better semantics in my program, but not because it is mandatory. Is that right? I still have a contradiction in my mind... I wrote a program that asks for a string using scanf("%s", something). With really big strings it crashes; with small strings it works fine. Why? – mjsr Jul 21 '10 at 00:35
  • @voodoomsr: To clarify: while a compiler may not warn you about assigning a string literal to a `char *` (i.e., `char *s = "foo"`), that pointer is still *read only*. It is not valid to write to `s` in my example. On the `scanf` question: `something` must have a fixed length when it is passed to `scanf`, and `scanf` cannot know beforehand how many characters I will enter, so I can always enter more than the length of the buffer, thus overflowing it and writing data to spots in memory where I shouldn't. (`scanf("%s", something);` is typically vulnerable to buffer overflows.) – Thanatos Jul 21 '10 at 04:38
  • Thanks Thanatos. So in conclusion, I could use scanf("%s", something) without telling the compiler the size of something, but doing that is a risk, because if I enter more characters than the buffer holds it will crash. In the other situation, if I have a char *something; and then on another line I set its value directly in the code with something = "zzzzzzzzz", there will be no problem with the number of z's. – mjsr Jul 21 '10 at 15:09
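The overflow discussed in the comments above can be avoided by bounding how much input is read. A minimal sketch, assuming input arrives on a FILE * stream; the helper names read_line_bounded and read_word_bounded and the 32-byte buffer are hypothetical, chosen only for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most size - 1 characters of one line into buf.
   Returns 1 on success, 0 on end of input or error.
   If the line is longer than the buffer, the rest stays in the stream. */
int read_line_bounded(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline, if any */
    return 1;
}

/* Hypothetical helper: the scanf-family equivalent. The width 31 leaves
   room for the terminating '\0' in a 32-byte buffer. */
int read_word_bounded(FILE *in, char buf[32])
{
    return fscanf(in, "%31s", buf) == 1;
}
```

Calling read_line_bounded(stdin, something, sizeof something) with a local array something never writes past the end of the array, however long the input line is.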
1

The second is always correct, provided FIXEDSIZE is large enough to hold the string and its terminating '\0'.

The first is correct only if you never change the string, since you've assigned a pointer to fixed data.
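For the fixed-size buffer, one way to stay safe is a copy that cannot write past the end of the array and reports truncation. This is a sketch under stated assumptions: the value of FIXEDSIZE and the helper name copy_checked are hypothetical, and snprintf is swapped in for the strcpy used in the question.

```c
#include <stdio.h>

#define FIXEDSIZE 16   /* hypothetical value; the question does not give one */

/* Hypothetical helper: copies src into dst, which has room for dst_size bytes.
   Never writes more than dst_size bytes. Returns 1 if the whole string fit,
   0 if it was truncated or an error occurred. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    int n = snprintf(dst, dst_size, "%s", src);
    return n >= 0 && (size_t)n < dst_size;
}
```

With char something[FIXEDSIZE];, the call copy_checked(something, sizeof something, "zzzz") succeeds, while a string of FIXEDSIZE or more characters is truncated to fit instead of overflowing the array.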

Joel
  • My mistake. (And apparently someone else's too =P) I down-voted because I think this answer doesn't address the OP's actual questions (i.e. "what is the real size limit in this char array?" and "Is the following code the correct approach that i always must use?"). Also, the OP already stated that the first is only sometimes correct. But I guess he marked it right, so it must have satisfied him! – Chris Cooper Jul 22 '10 at 01:57
-1

The first should never crash. The second will crash as soon as the number of 'z's + 1 goes over the available space on the stack page, or when you try to return from the function.

vlabrecque
  • The second one might crash if you invoke undefined behavior somehow by overflowing the stack, but "returning from the function" is no more related to the crash than adding two variables. – Thanatos Jul 20 '10 at 23:52
  • It'd be pretty related if you were to assign 100 chars to a 30-char array. The return address would more than likely have been clobbered by the overflowing string -- that's how buffer overflows work, and they used to be like the #1 way of breaking into servers and such. Random contents cause a crash; specially malformed contents can get you root/system access. – cHao Jul 20 '10 at 23:57
  • Well, in my test the first one crashes with a really big string. – mjsr Jul 20 '10 at 23:58
  • @cHao: Ah, you are right... I always visualize stack stuff incorrectly. Nonetheless, this is an implementation detail, albeit of a very popular implementation. – Thanatos Jul 20 '10 at 23:59
  • That said, I don't really feel like "assign 100 chars to a 30-char array" is the problem at hand here... – Thanatos Jul 21 '10 at 00:01
  • @thanatos: did you just (1) say it could crash with a stack overflow in your answer, (2) not understand what that means, and (3) criticize my explanation of a stack overflow as "unrelated"? (As well as probably voting down my answer, since you are the only critic.) Without a correct explanation of what the poster actually _did_ rather than what he thinks he did, there's not much that can be said. – vlabrecque Jul 21 '10 at 14:09