0

I was playing around with char arrays in c++ and wrote this program:

#include <cstddef>
#include <iostream>

using namespace std;

int main()
{
    char text[] = { 'h', 'e', 'l', 'l', 'o' };  // arrays initialised like this
                                                // will have a size of the number
                                                // of elements that you see

    char text2[] = "hello"; // arrays initialised like this will have a size of
                            // the number of elements that you see + 1 (a 0 on
                            // the end to show where the end is)

    cout << endl;

    cout << "The size of the first array is: " << sizeof(text) << endl;

    cout << endl;

    // sizeof yields a size_t, so the index uses the same unsigned type
    for (size_t i = 0; i < sizeof(text); i++)
    {
        cout << i << ":" << text[i] << endl;
    }
    cout << endl;

    cout << "The size of the second array is: " << sizeof(text2) << endl;

    cout << endl;

    for (size_t i = 0; i < sizeof(text2); i++)
    {
        cout << i << ":" << text2[i] << endl;
    }
    cout << endl;

    cin.get();

    return 0;
}

This program gives me the output:

The size of the first array is: 5

0:h
1:e
2:l
3:l
4:o

The size of the second array is: 6

0:h
1:e
2:l
3:l
4:o
5:

My question is: Is there a particular reason that initializing a char array with separate chars will not have a null terminator (0) on the end unlike initializing a char array with a string literal?

Drise
Chris Gray
  • 2
    it would be rather annoying if each `char` array had a null implicitly added, while for string literals that's just what you want – 463035818_is_not_an_ai Apr 06 '18 at 15:12
  • 3
    It is just the way the language works. When you take control and specify what you want (`{ 'h', 'e', 'l', 'l', 'o' }`), that is what you get. – NathanOliver Apr 06 '18 at 15:12
  • 2
    Nice observation! I guess the answer is, "What if I actually want an array of `char`s that isn't a string? How could I get that otherwise?" – BoBTFish Apr 06 '18 at 15:13
  • 1
    see also https://stackoverflow.com/a/17943529/1132334 – Cee McSharpface Apr 06 '18 at 15:13
  • 2
    Because sometimes you want an array of bytes instead of "characters"? It really depends on the use case, so the compiler can't make any assumptions. – Some programmer dude Apr 06 '18 at 15:13
  • maybe what causes your confusion is that not every `char` array is used to store character sequences. `char` is basically just a type like `int` or `float` that can hold some values. Being used as a string is just one use case, though a very common one – 463035818_is_not_an_ai Apr 06 '18 at 15:14
  • Odd duplicate that Community spotted, no? That did not mention the explicit char array. – Bathsheba Apr 06 '18 at 15:22
  • @Bathsheba The answer did, albeit maybe not as directly as you like: https://stackoverflow.com/a/40821770/2757035 – underscore_d Apr 06 '18 at 15:33
  • @underscore_d: Odd policy that. I could create a question "what is the C++ standard", answer it with a verbatim copy of the C++ standard, and close *every* C++ question to that answer. For me a duplicate has to be "the question is an exact duplicate of this question". Disk is cheap. – Bathsheba Apr 06 '18 at 15:35

6 Answers

4

A brace initializer just provides the specified values for an array (or, if the array is larger than the list, the remaining elements are zero-initialized). It's not a string even if the items are char values. char is just the smallest integer type.

A string literal denotes a zero-terminated sequence of values.

That's all.
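
To make the difference concrete, here is a minimal sketch. The helper `count_zero_bytes` and the array names are hypothetical and only there for illustration; the sizes are checked at compile time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts how many elements of a char array are zero.
template <std::size_t N>
std::size_t count_zero_bytes(const char (&arr)[N]) {
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (arr[i] == '\0') ++zeros;
    }
    return zeros;
}

const char exact[]   = { 'h', 'i' };  // size deduced from the list: 2 elements
const char padded[8] = { 'h', 'i' };  // 8 elements, the last 6 are zero
const char literal[] = "hi";          // 3 elements: 'h', 'i', '\0'

static_assert(sizeof(exact) == 2,   "brace list adds nothing");
static_assert(sizeof(padded) == 8,  "explicit size wins");
static_assert(sizeof(literal) == 3, "literal carries its terminator");

// Runtime checks on the contents.
void run_checks() {
    assert(count_zero_bytes(exact) == 0);
    assert(count_zero_bytes(padded) == 6);
    assert(count_zero_bytes(literal) == 1);
}
```

Calling `run_checks()` from `main` should pass silently: only `exact` has no zero byte anywhere, so it is the only one of the three that is not usable as a C string.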

Bathsheba
Cheers and hth. - Alf
1

Informally, it's the second quotation character in a string literal of the form "foo" that adds the NUL-terminator.

In C++, "foo" has the type const char[4], which decays to a const char* in certain situations.

It's just how the language works, that's all. And it's very useful since it dovetails nicely with all the standard library functions that model a string as a pointer to the first element in a NUL-terminated array of chars.

Splicing in an extra element with something like char text[] = { 'h', 'e', 'l', 'l', 'o' }; would be really annoying and it could introduce inconsistency into the language. Would you do the same thing for signed char, and unsigned char, for example? And what about int8_t?
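
The type and the decay can both be checked by the compiler. This is a small sketch, assuming C++11 or later; `length_via_pointer` is a hypothetical helper whose only job is to force the array-to-pointer conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// A string literal is an lvalue, so decltype yields a reference to the array.
static_assert(std::is_same<decltype("foo"), const char (&)[4]>::value,
              "\"foo\" is an array of 4 const char");
static_assert(sizeof("foo") == 4, "3 letters plus the NUL terminator");

// Hypothetical helper: the pointer parameter makes any array argument decay.
std::size_t length_via_pointer(const char* s) {
    return std::strlen(s);  // relies on the NUL terminator being present
}

void run_checks() {
    assert(length_via_pointer("foo") == 3);  // strlen does not count the NUL
}
```

`sizeof` sees the whole array (4), while `std::strlen` sees only a pointer and has to walk until it finds the terminator (3).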

Bathsheba
1

You can terminate it yourself in multiple ways:

char text1[6] = { 'h', 'e', 'l', 'l', 'o' };              // 6th element is zero-initialized
char text2[sizeof "hello"] = { 'h', 'e', 'l', 'l', 'o' }; // sizeof "hello" is 6
char text3[] = "hello"; // <--- my personal favourite
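
A quick way to convince yourself that all three are properly terminated is to hand them to `std::strcmp`, which requires NUL-terminated input. The wrapper function `all_spell_hello` below is a hypothetical name used only for this sketch.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical check: returns true if every variant is a valid C string
// equal to "hello".
bool all_spell_hello() {
    char text1[6] = { 'h', 'e', 'l', 'l', 'o' };
    char text2[sizeof "hello"] = { 'h', 'e', 'l', 'l', 'o' };
    char text3[] = "hello";

    static_assert(sizeof text1 == 6, "explicit size");
    static_assert(sizeof text2 == 6, "size taken from the literal");
    static_assert(sizeof text3 == 6, "size deduced from the literal");

    return std::strcmp(text1, "hello") == 0 &&
           std::strcmp(text2, "hello") == 0 &&
           std::strcmp(text3, "hello") == 0;
}

void run_checks() {
    assert(all_spell_hello());
}
```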
Maxim Egorushkin
1

A string literal such as "hello" has the type of a constant character array and is initialized in the following way

const char string_literal_hello[] = { 'h', 'e', 'l', 'l', 'o', '\0' };

As you can see, the type of the string literal is const char[6]. It contains six characters.

Thus this declaration

char text2[] = "hello"; 

that can be also written like

char text2[] = { "hello" }; 

in fact is substituted for the following declaration

char text2[] = { 'h', 'e', 'l', 'l', 'o', '\0' };

That is, when a string literal is used as an initializer of a character array, all of its characters, including the terminating zero, are used to initialize the array.
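
The equivalence of the three spellings can be checked byte for byte. This is only a sketch; the function name `same_bytes` and the variable names are made up for the example.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical check: compares the three forms of the declaration.
bool same_bytes() {
    char from_literal[]        = "hello";
    char from_braced_literal[] = { "hello" };
    char from_chars[]          = { 'h', 'e', 'l', 'l', 'o', '\0' };

    static_assert(sizeof from_literal == sizeof from_chars,
                  "both arrays have 6 elements");
    static_assert(sizeof from_braced_literal == sizeof from_chars,
                  "the braces around the literal change nothing");

    // memcmp compares all 6 bytes, terminator included.
    return std::memcmp(from_literal, from_chars, sizeof from_chars) == 0 &&
           std::memcmp(from_braced_literal, from_chars, sizeof from_chars) == 0;
}

void run_checks() {
    assert(same_bytes());
}
```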

Vlad from Moscow
1

Is there a particular reason that initializing a char array with separate chars will not have a null terminator (0)

The reason is that this syntax...

Type name[] = { comma separated list };

...is used for initializing arrays of any type. Not just char.

The "quoted string" syntax is shorthand for a very specific type of array that assumes a null terminator is desired.
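
As an illustration, the same brace syntax works for several element types, and in each case the array has exactly as many elements as the list. `element_count` is a hypothetical helper written for this sketch, not a standard function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of elements in a built-in array.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}

const int           primes[]  = { 2, 3, 5, 7 };
const double        halves[]  = { 0.5, 1.5 };
const unsigned char bytes[]   = { 0xDE, 0xAD, 0xBE, 0xEF };
const char          letters[] = { 'h', 'e', 'l', 'l', 'o' };
const char          word[]    = "hello";

static_assert(element_count(primes) == 4,  "one element per list item");
static_assert(element_count(halves) == 2,  "one element per list item");
static_assert(element_count(bytes) == 4,   "one element per list item");
static_assert(element_count(letters) == 5, "char is treated like any other type");
static_assert(element_count(word) == 6,    "only the quoted form adds a terminator");

void run_checks() {
    assert(letters[4] == 'o');   // last element is the last item listed
    assert(word[5] == '\0');     // last element is the implicit terminator
}
```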

Drew Dormann
0

When you designate a double quote delimited set of adjacent characters (a string literal), it is assumed that what you want is a string. And a string in C means an array of characters that is null-terminated, because that's what the functions that operate on strings (printf, strcpy, etc...) expect. So the compiler automatically adds that null terminator for you.

When you provide a brace delimited, comma separated list of single quote delimited characters, it is assumed that you don't want a string, but you want an array of the exact characters you specified. So no null terminator is added.

C++ inherits this behavior.
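
One practical consequence: an array built from separate chars must not be passed to functions such as strlen or strcpy, because they would read past its end looking for a terminator. If you do need text from such an array, one option is to pass the length explicitly. The sketch below uses the `std::string(const char*, size_t)` constructor; `string_from_array` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly N elements, so it never searches
// for a terminator.
template <std::size_t N>
std::string string_from_array(const char (&arr)[N]) {
    return std::string(arr, N);
}

const char letters[] = { 'h', 'e', 'l', 'l', 'o' };  // no terminator
const char word[]    = "hello";                      // terminator included

void run_checks() {
    // std::strlen(letters) would be undefined behaviour; this is not.
    assert(string_from_array(letters) == "hello");
    assert(string_from_array(letters).size() == 5);

    // Copying all 6 elements of `word` keeps the '\0' as a real character.
    assert(string_from_array(word).size() == 6);
}
```

Note that the second case shows the flip side: when the array does hold a terminator, copying every element drags the `'\0'` into the `std::string`, so this helper only suits arrays that are known to be unterminated.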

Benjamin Lindley
  • 1
    Note though that in C, `"foo"` is a `char[4]` type although it's UB to try to modify it. Also note that `'h'` is an `int` type in C. In other words, the languages diverge so much in this area, I avoided making the comparison. – Bathsheba Apr 06 '18 at 15:21