How are they different from non-null-terminated strings? What is this null that terminates the string? Is it different from NULL? Should I null-terminate my strings myself, or will the compiler do it for me? Why are null-terminated strings needed? How do I set up my code/data to handle null-terminated strings?

n. m. could be an AI
  • C does not have _non-null-terminated strings_. – chux - Reinstate Monica May 30 '22 at 23:56
  • Rather than closing this as a dupe, I would propose that you migrate your answer here to [How should character arrays be used as strings?](https://stackoverflow.com/questions/58526131/how-should-character-arrays-be-used-as-strings), which should be possible without changing it too much. It fits in quite well with the other answers there. – Lundin Jan 19 '23 at 10:49

1 Answer

What are null-terminated strings?
In C, a "null-terminated string" is a tautology. A string is, by definition, a contiguous null-terminated sequence of characters (an array, or a part of an array). Other languages may address strings differently. I am only discussing C strings.

How are they different from non-null-terminated strings?
There are no non-null-terminated strings in C. A non-null-terminated array of characters is just an array of characters.

What is this null that terminates the string? Is it different from NULL?
The "null character" is a character with the integer value of zero. (Characters are, in essence, small integers.) It is sometimes, especially in the context of ASCII, referred to as NUL (single L). This is distinct from NULL (double L), which is a null pointer. The null character can be written as '\0' or just 0 in the source code. The two forms are interchangeable in C (but not in C++). The former is usually preferred because it shows the intent better.
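
To make the NUL/NULL distinction concrete, here is a minimal sketch. The three helper names are hypothetical and exist only for illustration. It separates three cases that are easy to confuse: a null character, a null pointer, and a valid pointer to an empty string.

```c
#include <stdbool.h>
#include <stddef.h>  /* NULL */

/* Hypothetical helper: true when c is the null character (NUL),
   i.e. a char whose integer value is zero. */
bool is_null_char(char c)
{
    return c == '\0';  /* in C this is the same test as c == 0 */
}

/* Hypothetical helper: true when p is a null pointer (NULL),
   i.e. it does not point at any object. */
bool is_null_pointer(const char *p)
{
    return p == NULL;
}

/* Hypothetical helper: true when p points at a valid string whose
   first character is already the null terminator. */
bool is_empty_string(const char *p)
{
    return p != NULL && p[0] == '\0';
}
```

Note that the digit character '0' is not the null character, and that an empty string "" is a real one-element array containing only the terminator, so a pointer to it is not NULL.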

Should I null-terminate my strings myself, or will the compiler do it for me?
If you are writing a string literal, you don't need to explicitly insert a null character at the end. The compiler will do it.

const char* str1 = "a string";   // ok, \0 is appended automatically
const char* str2 = "a string\0"; // extra \0 is not needed

The compiler will not insert a null character when you declare an array with an explicit size and initialise it with a string literal whose characters fill the array exactly, leaving no room for the terminator. This is valid C, but the result is not a string. (A literal with even more characters than the array can hold is an error.)

char str3[5] = "hello"; // valid C, but no space left in the array for the null terminator
char str4[]  = "hello"; // ok, there is \0 at the end, the total size is 6

The compiler will not insert a null character when declaring an array and not initialising it with a string literal.

char str5[] = { 'h', 'e', 'l', 'l', 'o' };       // no null terminator
char str6[] = { 'h', 'e', 'l', 'l', 'o', '\0' }; // null terminator
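
One way to check the cases above at run time is to search the array for a null character within its bounds. The following is a minimal sketch; `is_terminated` is a hypothetical helper name, and `memchr` is the standard function from `<string.h>`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if the first `size` bytes of buf
   contain a null character, i.e. buf holds a string that ends inside
   the array. It never reads past buf[size - 1]. */
bool is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Example use:
     char str5[] = { 'h', 'e', 'l', 'l', 'o' };
     char str6[] = "hello";
     is_terminated(str5, sizeof str5)   is false (sizeof str5 is 5)
     is_terminated(str6, sizeof str6)   is true  (sizeof str6 is 6)
*/
```

The check needs the array size to be passed in, because without a terminator nothing else marks where the array ends.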

If you are building a string at run-time out of some data that comes from IO or from a different part of the program, you need to make sure a null terminator is inserted. For example:

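#include <stdlib.h> // malloc
#include <string.h> // strlen, strcpy

char* duplicate_string(const char* src)
{
    char* result = malloc(strlen(src) + 1); // <- reserve place for null terminator
    if (result == NULL)
        return NULL;                        // allocation failed
    strcpy(result, src);                    // copies the characters and the null terminator
    return result;
}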

Standard library functions such as fread and POSIX functions such as read never null-terminate the data they read. strncpy adds a null terminator only if the source string is shorter than the given count; otherwise the destination is left unterminated, so use it with care. Confusingly, strncat always adds a null terminator.
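
As an illustration of terminating by hand, here is a minimal sketch of two wrappers. `copy_terminated` and `read_as_string` are hypothetical names; both assume the caller passes the full size of the destination buffer.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, truncating if necessary,
   and always leaves dst null-terminated (when dst_size > 0). */
void copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;                      /* no room even for the terminator */
    strncpy(dst, src, dst_size - 1); /* may leave dst unterminated...   */
    dst[dst_size - 1] = '\0';        /* ...so terminate it explicitly   */
}

/* Hypothetical helper: reads at most buf_size - 1 bytes from stream
   and appends the terminator that fread itself never writes.
   Returns the number of bytes read. */
size_t read_as_string(FILE *stream, char *buf, size_t buf_size)
{
    if (buf_size == 0)
        return 0;
    size_t n = fread(buf, 1, buf_size - 1, stream);
    buf[n] = '\0';                   /* n <= buf_size - 1, so in bounds */
    return n;
}
```

If the data read contains zero bytes of its own, string functions will treat the first of them as the end of the string, so this approach suits text rather than arbitrary binary data.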

Why are null-terminated strings needed?
Many functions from the standard C library, and many functions from third-party libraries, operate on strings (and all strings need to be null-terminated). If you pass a non-null-terminated character array to a function that expects a string, the behaviour is undefined: the function will keep reading past the end of the array looking for a terminator. So if you want to interoperate with the world around you, you need null-terminated strings. If you never use any standard-library or third-party functions that expect string arguments, you may do what you want.
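
To see why the terminator matters to such functions, consider a simplified stand-in for strlen. This is a sketch for illustration only (`my_strlen` is a hypothetical name, and real library implementations are typically more optimised). The function receives only a pointer, so the terminator is the only thing that tells it where to stop.

```c
#include <stddef.h>

/* Hypothetical, simplified equivalent of strlen: counts characters
   until the first null character. If s is not null-terminated, the
   loop runs past the end of the array, which is undefined behaviour. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

For the same reason, a null character embedded in the middle of an array ends the string early as far as these functions are concerned.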

How do I set up my code/data to handle null-terminated strings?
If you plan to store strings of length up to N, allocate N+1 characters for your data. The character needed for the null terminator is not included in the length of the string, but it is included in the size of the array required to store it.
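
The N+1 rule can be sketched as follows. The limit `MAX_NAME_LEN`, the `struct record` type and the `record_set_name` function are all hypothetical names chosen for this example.

```c
#include <string.h>

#define MAX_NAME_LEN 8  /* hypothetical limit N on the string length */

struct record {
    char name[MAX_NAME_LEN + 1];  /* N characters plus the terminator */
};

/* Hypothetical helper: stores name in the record.
   Returns 0 on success, or -1 if name is longer than MAX_NAME_LEN. */
int record_set_name(struct record *r, const char *name)
{
    if (strlen(name) > MAX_NAME_LEN)
        return -1;            /* would not fit together with its terminator */
    strcpy(r->name, name);    /* safe: the length was checked above */
    return 0;
}
```

With this layout a name of exactly 8 characters is accepted and occupies all 9 bytes of the array, while a 9-character name is rejected rather than silently truncated.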

n. m. could be an AI
  • Your stance on "null-terminated strings" seems overly pedantic. Sure, if you only program in C, maybe the term is redundant, but for the rest of us (especially for those of us who must perform interop), the term helps distinguish C strings from other strings. And yes, there are almost certainly strings in existence that are *not* null-terminated. – Robert Harvey May 30 '22 at 15:51
  • @RobertHarvey the question is tagged "C" and "C-string". I am only talking about C. No attempt to address strings from everyone's point of view. – n. m. could be an AI May 30 '22 at 16:10
  • @RobertHarvey added a clarification – n. m. could be an AI May 30 '22 at 16:16
  • Just as it is better to write `'\0'` in C, I think it is also better to write "_NUL-terminated_" when describing C strings (instead of giving "null" multiple meanings). – Greg A. Woods Feb 23 '23 at 00:03