2

My code is crashing because some strings are missing the '\0' character at the end.

It's pretty clear to me why we have to use this termination char. My question is, is there a problem adding a potential 2nd null character to a character array - to solve string problems?

I think it's cheaper to just add a '\0' to every string than to check whether it needs one and then add it, but I don't know if that's a good thing to do.

alk
Rômulo M. Farias
  • 7
    Having an extra `'\0'` is not a problem, as long as you have not gone out of bounds of that char array. You do have to understand that with `'\0'` twice, no string operation would even know that there is a second `'\0'`. They will just read till the first `'\0'` and be done with it. – Haris Jun 09 '17 at 14:44
  • 3
    The `'\0'` (or NUL) char terminates a string. So by definition, anything that comes after that NUL char isn't part of the string no matter what. – Jabberwocky Jun 09 '17 at 14:46
  • @RômuloM.Farias, realized it after hitting enter :) – Haris Jun 09 '17 at 14:47
  • 5
    It's better to solve the problem properly and work out why some of your strings are not properly terminated. Double nuls won't do much harm, but they will make maintenance a headache. – Malcolm McLean Jun 09 '17 at 14:51
  • 5
    What you suggest is poor practice. Just construct the strings correctly in the first place with a NUL terminator, period. – Jabberwocky Jun 09 '17 at 14:51
  • 4
    How do you verify whether a string has a `\0` at the end or not? There's no way to do that. – Spikatrix Jun 09 '17 at 14:52
  • 2
    I'm with @MalcolmMcLean -- if you're in a situation where you *think* the solution is to append `'\0'` everywhere (btw, how do **you** know where the string ends without a `'\0'`?), you might in fact have an [xy-problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) –  Jun 09 '17 at 14:59
  • @CoolGuy Detail: In C, a _string_ always has a _null character_ ending it. A character array might not have a `'\0'`. – chux - Reinstate Monica Jun 09 '17 at 14:59
  • 2
    I won't add another comment trying to convince you that you do not have a problem with one. Instead, I want to be convinced that you have a problem with one but not with two. Please make a [mcve] which demonstrates a crash for a string without two `\0`. I understand you have a case which crashes for a single `\0` and does not crash for `\0\0`. I.e. please demonstrate that it crashes because of a single `\0` instead of two and not because of none instead of two. – Yunnosch Jun 09 '17 at 15:08
  • 1
    @Yunnosch sorry, but the problem isn't a crash with just one '\0'. The crash happens when there is no terminating char. So my solution would always add a '\0', and when a char array already had a '\0', it would be duplicated. But after reading some answers here, I've decided to look for the root problem rather than just add the char and write poor code. – Rômulo M. Farias Jun 09 '17 at 16:02
  • That is what I assumed. Good decision to look for the root cause. Making an MCVE (even if you are not going to post it here) will be the perfect way of doing so. – Yunnosch Jun 09 '17 at 16:15

3 Answers

5

is there a problem to have this char ('\0') twice at the end of a string?

This question lacks clarity, as "string" means different things to different people.
Let us use the C specification definition as this is a C post.

A string is a contiguous sequence of characters terminated by and including the first null character. C11 §7.1.1 1

So a string cannot have 2 null characters, as the string ends upon reaching its first one. @Michael Walz

Instead, re-parse to "is there a problem adding a potential 2nd null character to a character array - to solve string problems?"


A problem with attempting to add a null character to a string is confusion. The str...() functions work with C strings as defined above.

// If str1 was not a string, strcat(str1, anything) would be undefined behavior.
strcat(str1, "\0");  // appends an empty string: no change to str1

char str2[] = "abc";
str2[strlen(str2)] = '\0'; // OK, but only overwrites the existing \0 with a \0
// attempt to add another \0
str2[strlen(str2)+1] = '\0'; // Bad: assigning outside `str2[]` as the array is too small

char str3[10] = "abc";
str3[strlen(str3)+1] = '\0'; // OK, in this case
puts(str3);                  // Adding that \0 served no purpose

As many have commented, adding a spare '\0' does not directly address the code's fundamental problem. @Haris @Malcolm McLean

That unposted code is the real issue that needs solving @Yunnosch, and not by attempting to append a second '\0'.

chux - Reinstate Monica
  • I **so** agree, actually that is precisely why I asked for a MCVE, with the detail to elaborate. Seems my comment was not clear. – Yunnosch Jun 09 '17 at 16:16
  • @Yunnosch I won't post a MCVE, sorry about that. The code belongs to the company that I work for, so I can't. If you really want to understand how the problem happens, we found it using [Asan](https://en.wikipedia.org/wiki/AddressSanitizer). It changes the memory and stresses my software to find potential bugs. – Rômulo M. Farias Jun 09 '17 at 16:31
3

I think it's cheaper to just add a '\0' to every string than to check whether it needs one and then add it, but I don't know if that's a good thing to do.

Where would you add it? Let's assume we've done something like this:

char *p = malloc(32);

Now, if we know the allocated length, we could put a '\0' as the last character of the allocated area, as in p[31] = '\0'. But we don't know how long the contents of the string are supposed to be. If there's supposed to be just foobar, then there'd still be 25 bytes of garbage before that terminator, which might cause other issues if processed or printed.

Not to mention that if all you have is the pointer to the string, there is no portable way to find out the length of the allocated area.

Probably better to fix the places where you build the strings to do it correctly.

ilkkachu
2

Having an extra '\0' is not a problem, as long as you have not gone out of bounds of that char array.

You do have to understand that with '\0' twice, no string operation would even know that there is a second '\0'. They will just read till the first '\0' and be done with it. For them, the first '\0' is the null terminating character, and nothing after it is part of the string.

Haris