Why do we need a null terminator in C++ strings?

Question

I'm new to programming and very new to C++, and I recently came across strings.

Why do we need a null terminator at the end of a character list?

I've read answers saying that, since we might not use all the elements of an array, we need the null terminator so the program knows where the string ends, e.g. char john[100] = "John". But why can't the program just loop through the array, check how many spaces are filled, and decide the length from that?

And if only four characters are filled in the array for the word "John", what are the other spaces filled with?

@S.M. Close but no duplicate IMO. That other question asked about null terminated as an alternative for length prefixed. — Thomas, Aug 28 '20 at 07:52
There is an answer given by the author of C, Dennis M. Ritchie, below that question. — 273K, Aug 28 '20 at 07:55
Answering the "why can't the program check how many spaces are filled" question: what exactly is "check how many spaces are filled" supposed to mean? Memory is **always** filled with something. It's like asking "check if this box has something inside". Well, a box always has something inside, i.e. the air. And 0 in C++ plays the role of the air. — freakish, Aug 28 '20 at 07:56

Jan Schultke · Accepted Answer · 2021-03-25T16:38:21.167

The other characters in the array char john[100] = "John" would be filled with zeros, each of which is a null terminator. In general, when you initialize an array and don't provide enough elements to fill it up, the remaining elements are value-initialized, which for types like int and char means they are set to zero:

int foo[3] {5};           // this is {5, 0, 0}
int bar[3] {};            // this is {0, 0, 0}

char john[5] = "John";    // this is {'J', 'o', 'h', 'n', 0}
char peter[5] = "Peter";  // ERROR, initializer string too long
                          // (one null-terminator is mandatory)

Also see cppreference on Array initialization. To find the length of such a string, we just loop through the characters until we find 0 and exit.
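The loop described above can be sketched roughly as follows. The helper name c_string_length is hypothetical and only for illustration; in real code std::strlen does essentially the same job.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a standard function): counts the characters
// that come before the first '\0', much like std::strlen.
std::size_t c_string_length(const char* s) {
    assert(s != nullptr);  // a null pointer is not a valid C-string
    std::size_t length = 0;
    while (s[length] != '\0') {
        ++length;  // keep going until the terminator is found
    }
    return length;
}
```

For char john[100] = "John", c_string_length(john) returns 4 even though sizeof john is still 100.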

The motivation behind null-terminating strings in C++ is to ensure compatibility with C libraries, which use null-terminated strings. Also see What's the rationale for null terminated strings?

Containers like std::string don't need a null terminator to know where the string ends and can even store a string containing null characters. This is because they store the size of the string separately. However, since C++11 the array returned by std::string::c_str() must be null-terminated, so implementations keep a terminator after the stored characters and c_str() doesn't require a modification of the underlying array.

C++-only libraries will rarely, if ever, pass C-strings between functions.
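As a small illustration of the difference, here is a sketch with two hypothetical helper functions (the names are made up for this example). It builds a std::string with a null character in the middle and compares its stored size with what a C function sees through c_str().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds a 7-character std::string whose
// fourth character is a null character.
std::string make_embedded_null_string() {
    // The (pointer, count) constructor copies exactly 7 chars,
    // including the '\0' at index 3.
    std::string s("abc\0def", 7);
    assert(s.size() == 7);  // the size is stored, not searched for
    return s;
}

// Hypothetical helper: the length a C function would see through c_str().
// std::strlen stops at the first '\0' it meets.
std::size_t length_seen_by_c(const std::string& s) {
    return std::strlen(s.c_str());
}
```

Here make_embedded_null_string().size() is 7, while length_seen_by_c reports 3 for the same string, because the C-style view ends at the first null character.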

Thank you very much for the answer. However, if 0 is the null terminator, then what happens when there is a 0 element in an int array? — NewbieToCoding, Aug 29 '20 at 08:48
@NewbieToCoding `int` is not a `char` and is not something that you can construct a string from. You would use `int` for numeric applications instead. — Jan Schultke, Aug 29 '20 at 10:59
@NewbieToCoding yes, in ASCII and the UTF encodings, `0` represents the `NUL` character, which signals the end of the string. — Jan Schultke, Aug 29 '20 at 14:42

Aleksander Bobiński · Answer 2 · 2020-08-28T09:58:18.643

The existence of a null terminator is a design decision. Its purpose is to mark the end of the string. There are other ways to do this; for example, in Pascal the first element of a string is its size, so no null terminator is needed.
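To make the length-prefixed alternative concrete, here is a rough C++ sketch of a Pascal-style short string. The struct and function names are hypothetical, and the one-byte length (so at most 255 characters) is an assumption of this sketch, not a claim about any particular Pascal implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical length-prefixed string: bytes[0] holds the length,
// the characters follow, and no terminator is stored.
struct LengthPrefixedString {
    unsigned char bytes[256];
};

// Builds a length-prefixed string from a C-string.
LengthPrefixedString make_prefixed(const char* text) {
    const std::size_t length = std::strlen(text);
    assert(length <= 255);  // the length must fit in one byte
    LengthPrefixedString s{};  // start with all bytes set to zero
    s.bytes[0] = static_cast<unsigned char>(length);
    std::memcpy(s.bytes + 1, text, length);  // copy without the '\0'
    return s;
}

// Reading the length is a single lookup, with no loop.
std::size_t prefixed_length(const LengthPrefixedString& s) {
    return s.bytes[0];
}
```

The trade-off is visible in the types: the length lookup is constant time, but the maximum length is capped by the size of the prefix.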

In the example you give, the first 5 elements of the array are initialized from the string literal and the rest are zero-initialized. Notice how I said 5 elements and not just four. The fifth element is the null terminator.

Sure, the program can loop through the string to find out its length, but how will it know when to stop looping?
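This can be checked with a short sketch. The function name count_nonzero_in_john is hypothetical; it simply counts the non-zero elements of the 100-element array from the question.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts how many elements of
// char john[100] = "John" are non-zero.
std::size_t count_nonzero_in_john() {
    char john[100] = "John";
    assert(john[4] == '\0');  // the fifth element is the terminator
    std::size_t nonzero = 0;
    for (char c : john) {  // visits all 100 elements
        if (c != '\0') {
            ++nonzero;
        }
    }
    return nonzero;
}
```

Only the four letters are non-zero, so the function returns 4; every other element, including the terminator, holds 0.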

edited Aug 28 '20 at 09:58

answered Aug 28 '20 at 07:58

*"the rest have garbage values in them"*. No, the rest are initialized to zero. See [Array initialization](https://en.cppreference.com/w/c/language/array_initialization). – Jan Schultke Aug 28 '20 at 09:37

score 0 · Answer 3 · answered Aug 28 '20 at 13:58

The nul terminator is what tells you what spaces are filled. Everything up to and including the nul terminator has been filled. Everything after it has not.

There is no general notion of which elements of an array have been filled. An array holds some number of elements; its size is determined when it is created. All of its elements have some value initially; there's no way, in general, to determine which ones have been assigned a value and which ones have not from looking at the values of the elements.

A string is an array of char plus a coding convention: the "end" of the string is marked by a nul character. Most of the string manipulation functions rely on this convention.

A string literal, such as "John", is an array of char. "John" has 5 elements in the array: 'J', 'o', 'h', 'n', '\0'. The function strcpy, for example, copies characters until it sees that nul terminator:

char result[100]; // no meaningful values here
strcpy(result, "John");

After the call to strcpy, the first five elements of result are 'J', 'o', 'h', 'n', and '\0'. The rest of the array elements have no meaningful values.
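A sketch of that behavior follows. Reading truly uninitialized elements would not make a valid demonstration, so this version first fills the buffer with '#' as a stand-in for "no meaningful value". The function name copy_john_into and the '#' filler are assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: marks every element with '#', then lets
// strcpy copy "John" plus its terminator into the buffer.
void copy_john_into(char* buffer, std::size_t size) {
    assert(buffer != nullptr && size >= 5);  // "John" needs 5 chars with '\0'
    std::memset(buffer, '#', size);  // recognizable filler
    std::strcpy(buffer, "John");     // copies 'J','o','h','n','\0' and stops
}
```

After calling copy_john_into(result, sizeof result) on a char result[100], the first five elements hold the string and its terminator, and result[5] onward still hold '#', since strcpy never touched them.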

I would be remiss if I didn't mention that this style of string comes from C, and is often referred to as C-style strings. C++ supports all of the C string stuff, but it also has a more sophisticated notion of a string, std::string, which is completely different. In general, you should be using C++-style strings and not C-style strings.
