2

My first question is about the null character \0 at the end of a string: there are so many variations in when \0 is needed or added automatically. When declaring a char array, do I need to specify \0 or not? In which cases should I specify \0, and in which not? Can someone give a comprehensive summary (as in this post)? If you feel my question is ambiguous, a more specific one is: when declaring a string in C, what is the best way? Is it char string[] = "first string", because, for example, that way I can do strcat(string, another_string) without worrying about size problems?

Second question: I have

1   char a[] = "kenny";
2   char b[3];
3   strncpy(b, a, (int)(sizeof(b) - 1));
4   printf("%i\n", (int)sizeof(b)); // 3
5   printf("string length: %i\n", (int)strlen(b)); // string length: 8
6   printf("%s\n", b); // give me random stuff like kekenny or keEkenny 
  • 3: I only want to pass 2 bytes to b
  • 4: sizeof behaves normally
  • 5: but why does it become 8???
  • 6: why does it give me random stuff like kekenny or keEkenny

I am lost about what is happening with C strings. I used to use C++ a lot but still can't understand how C strings behave.

Kenny Wang
  • No, you can't "do `strcat(string, another_string)` without concern about size problem". There is no room for the concatenation. – Weather Vane Feb 06 '19 at 15:27
  • For `printf` to print the string, it must be a proper null-terminated string. But your code deliberately cuts off the null terminator. – Eugene Sh. Feb 06 '19 at 15:28
  • Your code will exhibit _[undefined behavior](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html)_. It will perhaps sometimes work, then suddenly it will not. – ryyker Feb 06 '19 at 15:29
  • @WeatherVane thats why i did `strncpy(b, a, (int)(sizeof(b) - 1));` because i concern size problem but it still gives me random stuff in later code? – Kenny Wang Feb 06 '19 at 15:29
  • The example in your narrative is `char string[] = "first string"; strcat(string, another_string);` There is no room because `string` is exactly 13 bytes in size, ending with the `'\0'` terminator. – Weather Vane Feb 06 '19 at 15:32
  • @ryyker 3, thats how much memory i gave it to – Kenny Wang Feb 06 '19 at 15:33
  • @WeatherVane if i do `char a[] = "kenny"; char b[] = " confused"; strcat(a, b); printf("%s\n", a);` it still print out `kenny confused`, so seems like original 6 bytes of 'a' is ignored. – Kenny Wang Feb 06 '19 at 15:37
  • When the string is defined with double quotes (e.g. "abc"), the null byte `\0` is automatically added. `"abc"` will be compiled as 4 bytes (hex): `61 62 63 00`. – i486 Feb 06 '19 at 15:38
  • Well that was plain unlucky that `b` just happened to follow `a` in memory. It saved you being tipped off to the error by, say, a segfault. Note about `strcat`: *"The behavior of strcat is undefined if the source and destination strings overlap."*. – Weather Vane Feb 06 '19 at 15:38

5 Answers

4

By definition, in the statement:

char string[] = "first string"

string is sized to hold exactly its initializer, including the terminating nul, and nothing more.

In memory it looks like this:

|f|i|r|s|t| |s|t|r|i|n|g|\0|?|?|?|
/// end of legal memory    ^

...illustrating why the following statement:

strcat(string, anythingElse);

is undefined behavior. (Otherwise known by some as nasal demons.)

Also, regarding strncpy usage: because the target is not guaranteed to contain a nul character after the call, it is recommended to always explicitly write the nul at the proper location in the new string:

strncpy (target, source, n);
target[n] = 0;

where, in the case of your example, n == sizeof(b) - 1.

Note that your cast to (int) is not needed in the above expression: sizeof yields a size_t, which is already the type of the third parameter of strncpy:

char *strncpy (char Target_String[], const char Source_String[], size_t Max_Chars);

strncat, on the other hand, always appends the nul character to the end of the resulting target string, so there is no need to append one explicitly.

ryyker
  • thx a lot! but when i do `char a[] = "kenny"; char b[] = " confused"; strcat(a, b); printf("%s\n", a);` it still print out `kenny confused`, so seems like original 6 bytes of `a` is ignored somehow – Kenny Wang Feb 06 '19 at 15:50
  • @KennyWang - Carefully read the links I left describing _undefined behavior_. – ryyker Feb 06 '19 at 15:52
  • You should be more explicit about `strncpy`: https://randomascii.wordpress.com/2013/04/03/stop-using-strncpy-already/ – chqrlie Apr 06 '19 at 07:26
  • @chqrlie - If you are suggesting that I discourage use of the _r_ string functions, (inferred by the link you provided.) I am not in agreement with what the author is suggesting about _not_ using the `r string` functions. I noticed in other comments that you agree with him, but I have found them very useful on many occasions when size of destination string is fixed, but size of string to copy is unknown. However, I do appreciate your position on the topic. Thanks. – ryyker Apr 10 '19 at 13:18
  • @ryyker: I'm not sure which `r` string functions you refer to, I am specifically annoyed by `strncpy`, not so much with `strncat` and the article in reference has too much of a C++ bias, he is not proposing a simple alternative to `strncpy` with intuitive semantics such as `snprintf(dest, size, "%s", src);`: Copy the string if it fits and otherwise truncate it, but always terminate it with a null byte. – chqrlie Apr 10 '19 at 15:56
  • @chqrlie - I know this is a little delayed, but I took another look at the link you provided above. (3rd comment from top.) This time I perused the content and comments (with coffee this time :)), and truly enjoyed the volume and quality of response traffic the post received. I also found the alternative he proposes for `strncpy()` interesting, but then realized its application is limited to `C++` usage. I did not know if you were aware of that. Have you run across alternatives to the `strncpy()` function implemented in `C` that you are partial to? Again, I appreciate your perspective. – ryyker Aug 14 '19 at 15:06
  • ... @chqrlie - btw, I have read _[this](https://stackoverflow.com/questions/41869803/what-is-the-best-alternative-to-strncpy)_. If you do know of a custom function solution as a replacement, that would be my preference over non-standard library alternatives. – ryyker Aug 14 '19 at 15:10
  • @ryyker: I updated my answer https://stackoverflow.com/a/41885173/4593267 to the question with more custom functions which I found useful in my projects. – chqrlie Aug 15 '19 at 09:37
4

The thing about C strings is that they're pretty low-level, and there are a number of extra things you have to keep in mind, and sometimes do "by hand".

(C++ std::strings, by contrast, are just about completely normal, high-level types.)

In answer to your specific questions:

You almost never need to supply a \0 explicitly. Just about the only time you do is when you're building a string completely by hand. For example, this code works:

char str[10];
str[0] = 'c';
str[1] = 'a';
str[2] = 't';
str[3] = '\0';
printf("%s\n", str);

But if you leave out the explicit assignment to str[3], it will behave erratically. (But if you don't create strings by hand like this, you don't need to worry so much.)

You must be extremely careful when copying strings with strcpy. You must ensure that the destination string ("buffer") is big enough. Nothing in C will ever take care of this for you -- nothing makes sure the destination is big enough; nothing warns you if it's not big enough. But if it's not big enough, the strangest things can happen -- including that it seems to work, even though it shouldn't. (The formal name for this is "undefined behavior".)

In particular, if you write

char string[] = "first string";
strcat(string, another_string);

what you have got is a bug, pure and simple. It is not true that "in this way you have no concern about size problem". When you say char string[] = "...", the compiler sizes the string just big enough to hold the initializer (and its \0), in this case 13 bytes for "first string". The [] does not mean "make this string big enough for any text I'll ever try to shove into it".

You must be even more careful when using strncpy. In fact, my recommendation is to not use strncpy at all. What it actually does is unusual, special, difficult to explain, and usually not what you want anyway. (For one thing, if you have it copy less than a full string, it doesn't add a \0 to the destination, which helps explain why you got things like "kekenny".)

Steve Summit
  • thx!! so when i do `char str[10] = {'a', 'b', 'c'};` do i need to explicitly add `\0` to the end by hand? – Kenny Wang Feb 06 '19 at 15:47
  • If you do it that way, yes. See also [this old question](https://stackoverflow.com/questions/31296727/). – Steve Summit Feb 06 '19 at 15:49
  • `char a[] = "kenny"; char b[3]; strcpy(b, a); printf("%s\n", b);` it still gives me `kenny` even i didn't assign enough memory, do u know what happened? plus, when use `strcpy`, will `\0` be automatically added to the end of `dest`? – Kenny Wang Feb 06 '19 at 17:03
  • Yes, `strcpy` appends `\0`. But as I said, you must be sure the destination buffer is big enough. If it's not (and of course 3 is not big enough for "Kenny"), weird and unexplainable things can happen. (The formal definition is "undefined behavior".) And "weird and unexplainable" definitely includes "it seems to work even though you didn't expect it to". – Steve Summit Feb 06 '19 at 18:46
  • +1: **my recommendation is to not use `strncpy` at all. What it actually does is unusual, special, difficult to explain, and usually not what you want anyway.** – chqrlie Apr 06 '19 at 07:29
2

First question

When you do

char string[] = "first string";
            ^
            No size specified

the compiler will reserve memory that can hold exactly the text "first string" and a NUL terminator. If you print the size of the string, you'll get 13. In other words, the variable cannot hold any further data, so concatenating another string onto it writes past the end of the array.

You could do:

char string[100] = "first string";

and then you can concatenate another string.

Second question

The first thing to know is that strings in C are char arrays that contain a NUL terminator.

When you do:

char b[3];

you get an uninitialized array, i.e. b can contain anything - like b = { ? , ? , ? }

Then you do:

strncpy(b, a, (int)(sizeof(b) - 1));

meaning that you copy the first 2 characters from a to b.

So now b is b = { 'k' , 'e' , ? }. Notice that the third character of b is still uninitialized.

So when you do:

printf("string length: %i\n", (int)strlen(b));
printf("%s\n", b);

you use b as if it were a string, but it isn't: there is no NUL terminator. Consequently the functions (printf, strlen) give incorrect results. Calling these functions with a char array that has no NUL terminator is undefined behavior, i.e. anything can happen.

What seems to happen is two things:

a) The uninitialized character in b just happens to be an 'E' (in one of your examples)

b) The array a, which holds "kenny", just happens to be located in memory right after variable b.

So the two string functions really see the string "keEkenny", which has length 8.

To fix this you can do:

strncpy(b, a, (int)(sizeof(b) - 1));
b[sizeof(b) - 1] = '\0';

or simply do:

char b[3] = { 0 };

as this will initialize all of b, i.e. b = { '\0' , '\0' , '\0' }

Support Ukraine
1

If you read the documentation for strncpy it quite clearly states that it won't add a NUL terminator if the size you specify doesn't include it:

The strncpy() function is similar, except that at most n bytes of src are copied. Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.

So in the following case, you're only copying 2 characters and neither of them is the NUL terminator, so you need to add it yourself.

strncpy(b, a, (int)(sizeof(b) - 1));

Chris Turner
0

You have to add the string terminator \0 to b, one way or another. printf("%s\n", b) stops when it finds the \0.

Without one, what happens depends on what is in memory; sometimes a segfault is to be expected.

vasile_t