
What is the difference between char* and int*? Sure, they are different types, but how is it that I can write

char* s1 = "hello world";

even though "hello world" is not a single character but an array of characters, while I cannot write

char* s1 = {'h','e','l','l','o',' ','w','o','r','l','d'};

or

int* a = {2,3,1,45,6};

What is the difference?

Shafik Yaghmour
  • Special treatment of string literals, i.e. the `".."` thing. In theory, I think you could allow it as well for other kinds of initializations (à la C99 compound literals). – dyp Jul 21 '14 at 20:18
  • Very related: [What is going on in these five different ways of declaring a cString?](http://stackoverflow.com/questions/24615859/what-is-going-on-in-these-five-different-ways-of-declaring-a-cstring) – Mooing Duck Jul 21 '14 at 20:20
  • the `{...}` thing actually is an object with a type, it's an `initializer_list` – user2485710 Jul 21 '14 at 20:20
  • @user2485710 No, the `{..}` is not an object, not even an expression, and does not have a type. It's a *braced-init-list*. – dyp Jul 21 '14 at 20:21
  • @dyp it's a template basically, yes it has a type. – user2485710 Jul 21 '14 at 20:22
  • @user2485710 See, for example, http://stackoverflow.com/q/18009628/420683 A braced-init-list can be used to construct an `initializer_list` in certain contexts, but it is a more general construct. – dyp Jul 21 '14 at 20:23
  • You can do what you want, with slightly different syntax: `char s1[]={'h','e','l','l','o',' ','w','o','r','l','d'};` and `int a[]={2,3,1,45,6};`. But I can't explain why one is allowed and the other is not. – Moby Disk Jul 21 '14 at 20:26
  • As I thought, you can do this kind of stuff in C (>=C99) with compound literals. For example, `int *a = (int[]){1, 2, 3, 4};` – dyp Jul 21 '14 at 20:26
  • @MobyDisk: Check my answer: The difference is that the array brings its own memory, the pointer does not, so there is no memory where the char/int list could be stored. – gexicide Jul 21 '14 at 20:33
  • Actually, your first line isn't allowed either. C++ used to support non-const pointers to string literals for compatibility with C, but no longer. Use **`const`** `char *s1 = "hello world";` instead. – Ben Voigt Jul 21 '14 at 20:36
  • @dyp: What's the lifetime of the array created from a C99 compound literal? Temporary? Automatic in the local scope? The [GCC docs](https://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Compound-Literals.html) don't say :( – Ben Voigt Jul 21 '14 at 20:39
  • @BenVoigt IIRC, static at global scope, and automatic inside block scope. Will look it up. C11 6.5.2.5/5, rules are as I remembered them. – dyp Jul 21 '14 at 20:40

2 Answers


It is quite simple: A string literal, e.g. "foobar", is compiled to a null-terminated array of chars that is stored in the static section of your program (i.e., where all constants are stored). Initializing a pointer variable with the literal then simply stores the address of this memory in the variable. E.g., const char* a = "foo"; will assign the address where "foo" is stored to a.

In short, a string constant already brings the memory where it is to be stored with it.
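As a minimal sketch of what "brings its own memory" means in practice, the following shows that a pointer to a literal can safely outlive the function that created it, and that the literal's array includes the terminating null character. The names `greeting` and `literal_size` are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: hands out the address of the literal's storage.
const char* greeting() {
    const char* a = "foo";  // a receives the address of the 'f' in "foo"
    assert(*a == 'f');
    return a;               // safe: the literal has static storage duration
}

// Hypothetical helper: the literal occupies 'f', 'o', 'o' and '\0'.
std::size_t literal_size() {
    return sizeof("foo");   // size of the whole array, terminator included
}
```

Returning the address of a local array would dangle instead; the pointer stays valid here only because the literal's storage lives for the whole program.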

In contrast, initializing a pointer with an initializer list (i.e., a list of elements inside curly braces) is not defined. Informally, the problem with an initializer list, in contrast to a string literal, is that it does not "bring its own memory". Therefore, we must provide memory where the initializer list can store its chars. This is done by declaring an array instead of a pointer. This compiles fine:

char s1[11]={'h','e','l','l','o',' ','w','o','r','l','d'};

Now, we provided the space where the chars are to be stored by declaring s1 as an array.
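A small sketch of the consequence, assuming a hypothetical helper named `array_demo`: the array is ordinary modifiable storage owned by the enclosing function, and it holds exactly the eleven listed chars, with no null terminator appended.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper illustrating that the array owns its storage.
std::size_t array_demo() {
    char s1[11] = {'h','e','l','l','o',' ','w','o','r','l','d'};
    s1[0] = 'H';             // fine: s1 is a modifiable local array
    assert(s1[0] == 'H');
    return sizeof(s1);       // exactly the listed chars, no '\0' added
}
```

Because no terminator is stored, s1 in this form must not be passed to functions that expect a null-terminated string; by comparison, sizeof("hello world") counts one extra char for the '\0'.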

Note that you can use brace initialization of pointers, though, e.g.:

char* c2 = {nullptr};

However, while the syntax looks the same, this is something completely different, called uniform initialization, and it simply initializes c2 with nullptr.

gexicide
  • I think you could make `int *a = {1,2,3,4};` work similarly to compound literals. However, a string literal always has static storage duration; so there's not much benefit of using `int *a = {1,2,3,4};` instead of `int a[] = {1,2,3,4};` especially inside block scope (where it would need to have automatic storage duration due to mutability). OTOH, `int const* a = {1,2,3,4};` could have static storage duration again. I don't think you could easily extend that to class types, though. – dyp Jul 21 '14 at 20:35
  • *"In short, a string constant already brings the memory where it is to be stored with it."* As opposed to other literals, we don't store a string literal *by value*. Maybe because that would cost more memory? Or because you can't keep them in registers / encode them into assembly instructions? I think this leads to the *requirement* of being able to store them as pointers. – dyp Jul 21 '14 at 20:40
  • The second part of this answer is just wrong, brace-init-list can be used with pointers as I demonstrate in my answer and the whole concept of bringing its own memory does not make sense. The assignments fail because they violate a *shall* constraint. – Shafik Yaghmour Jul 22 '14 at 02:05
  • @ShafikYaghmour: Of course you can use uniform initialization syntax on pointers. But here we are talking about initializing a pointer to point to an array of elements, which is also mentioned "(i.e., a **list of elements** inside curly braces)" and you can't do that with pointers. So although the syntax is similar, the semantics are completely different. – gexicide Jul 22 '14 at 08:13
  • @ShafikYaghmour: "and the whole concept of bringing its own memory does not make sense." Why? Although it is quite informal, it does make perfect sense and it is the reason why it is not possible. Of course *the spec* does not say *why* something was specified the way it is, it simply uses *X shall...Y*. But everything in the spec happens for a reason, and this is it in this case. – gexicide Jul 22 '14 at 08:16

In your first case, the string literal decays to a pointer to const char. Although several compilers allow the non-const form as an extension, s1 really should be declared as const char *:

const char* s1 = "hello world" ;

A string literal is an array of const char; we can see this from the draft C++ standard section 2.14.5 String literals, which says (emphasis mine going forward):

Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration (3.7).

The conversion of an array to a pointer is covered in section 4.2 Array-to-pointer conversion, which says:

[...] an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue.[...]
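The two types involved can be checked at compile time. This is only a sketch, assuming a C++11 compiler; the function `points_to_first_element` is a hypothetical name used for illustration.

```cpp
#include <cassert>
#include <type_traits>

// The literal itself is an lvalue of type "array of 12 const char"
// (eleven visible characters plus the terminating '\0').
static_assert(
    std::is_same<decltype("hello world"), const char (&)[12]>::value,
    "a string literal is an array of const char");

// std::decay applies the array-to-pointer conversion to the type.
static_assert(
    std::is_same<std::decay<decltype("hello world")>::type,
                 const char*>::value,
    "the array type decays to pointer to const char");

// Hypothetical helper: the converted pointer refers to the first element.
bool points_to_first_element() {
    const char* s1 = "hello world";  // array-to-pointer conversion happens here
    return *s1 == 'h' && s1[11] == '\0';
}
```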

Your other cases do not work because a scalar (an arithmetic type, an enumeration type, or a pointer type) can only be initialized with a single element inside braces. This is covered in the draft C++ standard section 8.5.4 List-initialization paragraph 3, which says:

List-initialization of an object or reference of type T is defined as follows:

and then enumerates the different cases. The only one that applies to the right-hand side in this case is the following bullet:

Otherwise, if the initializer list has a single element of type E and either T is not a reference type or its referenced type is reference-related to E, the object or reference is initialized from that element; if a narrowing conversion (see below) is required to convert the element to T, the program is ill-formed.

which requires the list to have a single element; otherwise the final bullet applies:

Otherwise, the program is ill-formed.

In your two cases, even if you reduced the initializer to a single element, the types would be incorrect: 'h' is a char and 2 is an int, and neither converts to a pointer.
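The rule can be summarized in a short sketch. The ill-formed declarations are left as comments so that the block still compiles, and `single_element_demo` is a hypothetical name chosen for this example.

```cpp
#include <cassert>

// Hypothetical helper contrasting valid and ill-formed scalar initializers.
int single_element_demo() {
    int   x = {2};        // OK: one element of matching type
    int*  p = {&x};       // OK: one element that is already an int*
    char* q = {nullptr};  // OK: nullptr converts to any pointer type

    // int*  a  = {2, 3, 1, 45, 6};  // ill-formed: more than one element
    // char* s1 = {'h'};             // ill-formed: char does not convert to char*
    // int*  b  = {2};               // ill-formed: int does not convert to int*

    assert(q == nullptr);
    return *p;            // reads x through the pointer
}
```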

The initialization could be made to work by declaring arrays instead, such as the following:

  char s1[] = { 'h', 'e', 'l', 'l', 'o',' ', 'w', 'o', 'r', 'l', 'd' } ;
  int  a[]  = { 2, 3, 1, 45, 6 } ;

This would be covered in section 8.5.1 Aggregates which says:

An array of unknown size initialized with a brace-enclosed initializer-list containing n initializer-clauses, where n shall be greater than zero, is defined as having n elements (8.3.4). [ Example:

int x[] = { 1, 3, 5 };

declares and initializes x as a one-dimensional array that has three elements since no size was specified and there are three initializers. —end example ] An empty initializer list {} shall not be used as the initializer-clause for an array of unknown bound.
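The size deduction described in the quoted paragraph can be observed directly. The sketch below uses a hypothetical `element_count` template (not a standard library function) to read the deduced bound back from the array type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: recovers the bound N from a reference to an array.
template <typename T, std::size_t N>
constexpr std::size_t element_count(T (&)[N]) { return N; }

std::size_t s1_length() {
    char s1[] = { 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd' };
    return element_count(s1);  // bound deduced from eleven initializers
}

std::size_t a_length() {
    int a[] = { 2, 3, 1, 45, 6 };
    return element_count(a);   // bound deduced from five initializers
}
```

Passing a pointer to `element_count` would fail to compile, which is a convenient way to confirm that s1 and a are genuine arrays here rather than pointers.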

Note:

It is incorrect to say that a brace-init-list is not defined for pointers; it is perfectly usable for pointers:

int x   = 10 ;
int *ip =  &x ;
int *a  = {nullptr} ;
int *b  = {ip} ;
Shafik Yaghmour