I know similar questions, like this question, have been posted and answered here but those answers don't offer me the complete picture, hence I'm posting this as a new question. Hope that is ok.

See the following snippets:

char s[9] = "foobar";  //ok
s[1] = 'z';            //also ok

And

char s[9];
s = "foobar";  //doesn't work. Why?

But see the following cases:

char *s = "foobar";      //works
s[1] = 'z';              //doesn't work

char *s;
s = "foobar";            //unlike arrays, works here

It is a bit confusing. I have a vague understanding that we can't assign values to arrays, but we can modify them. In the case of char *s, it seems we can assign a value but can't modify what it points to, because the string is stored in read-only memory. But I still can't get the full picture.

What exactly is happening at a low level?

mayankkaizen
  • If it was possible...imagine `char s[9]; s = "123"; s[0] = 'a';`. – Adriano Repetti May 07 '20 at 10:10
  • It might help to imagine that, for every string literal (e.g. `"foobar"`) the compiler would automatically put a definition like `const char string_literal_001[] = {'f', 'o', 'o', 'b', 'a', 'r', '\0'};` somewhere else, and what you actually did was `s = string_literal_001;` for a simple assignment or `strcpy(s, string_literal_001)` for an initialization. – Felix G May 07 '20 at 10:54

4 Answers

char s[9] = "foobar"; This is initialization. An array of characters of size 9 is declared and its contents receive the string "foobar", with any remaining characters set to '\0'.

s = "foobar" is just invalid C. The statement parses, but an array is not assignable, so you cannot assign a string to a char array. To make s have the value foobar, use strcpy(s, "foobar");
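Since strcpy does not check the size of the destination, a bounded copy may be worth considering. The following is only a sketch; the helper name copy_string and its return convention are made up for illustration and are not part of the standard library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including the
   terminating '\0'. Returns 0 on success, -1 if dst is too small. */
int copy_string(char *dst, size_t cap, const char *src)
{
    size_t len = strlen(src);
    if (len + 1 > cap)
        return -1;              /* would overflow dst; leave it untouched */
    memcpy(dst, src, len + 1);  /* copy the characters and the '\0' */
    return 0;
}
```

With char s[9], a call such as copy_string(s, sizeof s, "foobar") succeeds, while a 9-letter string is refused because there is no room left for the terminator.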

char *s = "foobar"; is also initialization; however, this stores the address of the constant string foobar in the pointer variable s. Note that I say "constant string". On most platforms a string literal is placed in read-only memory, and modifying it is undefined behavior in any case. A better way of making this clear is to write const char *s = "foobar";

And indeed, your next assignment s[1] = 'z'; will not work, because the string that s points to is constant.

Paul Ogilvie
  • I guess I need to better understand what "initialization" and "assignment" mean. As for `s = "foobar"` being invalid syntax, the compiler actually complains that `error: array type 'char [9]' is not assignable s = "foobar"`. – mayankkaizen May 07 '20 at 10:25
  • Whenever you _declare_ a variable, you can _initialize_ it. That is compile-time. Any other use of `=` is _assignment_ which is run-time. – Paul Ogilvie May 07 '20 at 10:27
  • Can you or someone else please expand on "You cannot assign a string to char array." part? – mayankkaizen May 07 '20 at 10:29
  • That is helpful. I really didn't think about that part. – mayankkaizen May 07 '20 at 10:31
  • "Assigning a string to a char array" means that every char of the string must be assigned to the corresponding char of the array. All those instructions are _not_ automatically generated by the C compiler. However, to do just that, there is the library function `strcpy`. – Paul Ogilvie May 07 '20 at 10:50

You need to understand what the expressions are actually doing; then it might become clear to you.

  1. char s[9] = "foobar"; -> Initialize the char array s by the string literal "foobar". Correct.

  2. s[1] = 'z' -> Assign the character constant 'z' to the second element of char array s. Correct.

  3. char s[9]; s = "foobar"; -> Declare the char array s, then attempt to assign the string literal "foobar" to the char array. Not permissible. You can't actually assign arrays in C; you can only initialize an array of char with a string when defining the array itself. That's the difference. If you want to copy a string into an array of char, use strcpy(s, "foobar"); instead.

  4. char *s = "foobar"; -> Define the pointer to char s and initialize it to point to the string literal "foobar". Correct.

  5. s[1] = 'z'; -> Attempt to modify the string literal "foobar", to which s points. Not permissible. Modifying a string literal is undefined behavior, and the literal is typically stored in read-only memory.

  6. char *s; s = "foobar"; -> Declare the pointer to char s. Then assign the pointer to point to the string literal "foobar". Correct.
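As a rough sketch, the six cases can be put into one function that compiles. The function name demo_six_cases is hypothetical, the variables are renamed so they can coexist, and const is added to the pointers (an addition not in the question) so that case 5 becomes a compile-time error instead of undefined behavior.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if every step behaves as the list describes. */
int demo_six_cases(void)
{
    char a[9] = "foobar";       /* 1: initialize the array from the literal */
    a[1] = 'z';                 /* 2: modify an element of the array */

    char b[9];
    /* 3: b = "foobar"; does not compile, so copy instead */
    strcpy(b, "foobar");

    const char *p = "foobar";   /* 4: pointer initialized to point at the literal */
    /* 5: p[1] = 'z'; is rejected by the compiler because of const */

    const char *q;
    q = "foobar";               /* 6: assigning to a pointer is fine */

    return strcmp(a, "fzobar") == 0
        && strcmp(b, "foobar") == 0
        && strcmp(p, q) == 0;
}
```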

This declares array s with an initializer:

char s[9] = "foobar";  //ok

But this is an invalid assignment expression with array s on the left:

s = "foobar";   //doesn't work. Why?

Assignment expressions and declarations with initializers are not the same thing syntactically, although they both use an = in their syntax.

The reason that the assignment to the array s doesn't work is that the array decays to a pointer to its first element in the expression, so the assignment is equivalent to:

&(s[0]) = "foobar";

The assignment expression requires an lvalue on the left hand side, but the result of the & address operator is not an lvalue. Although the array s itself is an lvalue, the expression converts it to something that isn't an lvalue. Therefore, an array cannot be used on the left hand side of an assignment expression.
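A small sketch of the decay itself may help here. The function name demo_decay is made up for illustration; it only checks that the array, when used in an expression, yields the address of its first element, while sizeof (one of the contexts where no conversion happens) still sees the whole array.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the array and the pointer it decays to
   behave as described. */
int demo_decay(void)
{
    char s[9] = "foobar";
    char *p = s;                               /* s decays to &s[0]; nothing is copied */

    int same_address = (p == &s[0]);
    int array_size_kept = (sizeof s == 9);     /* no decay as the operand of sizeof */

    p[1] = 'z';                                /* a write through p lands in the array */
    return same_address && array_size_kept && s[1] == 'z';
}
```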


For the following:

char *s = "foobar";      //works

The string literal "foobar" is stored as an anonymous array of char and as an initializer it decays to a pointer to its first element. So the above is equivalent to:

char *s = &(("foobar")[0]);      //works

The initializer has the same type as s (char *) so it is fine.

For the subsequent assignment:

s[1] = 'z';              //doesn't work

It is syntactically correct and will usually compile, but its behavior is undefined. The anonymous arrays created by string literals must not be modified: the C standard says that attempting to modify such an array is undefined behavior. Assignment to an element of such an array is a modification and not allowed.

The subsequent assignment:

s = "foobar";            //unlike arrays, works here

is equivalent to:

s = &(("foobar")[0]);            //unlike arrays, works here

It is assigning a char * value to a variable of type char *, so it is fine.


Contrast the following use of the initializer "foobar":

char *s = "foobar";      //works

with its use in the earlier declaration:

char s[9] = "foobar";  //ok

There is a special initialization rule that allows an array of char to be initialized by a string literal optionally enclosed by braces. That initialization rule is being used to initialize char s[9].

The string literal used to initialize the array also creates an anonymous array of char (at least notionally) but there is no way to access that anonymous array of char, so it may get omitted from the output of the compiler. This is in contrast with the anonymous array of char created by the string literal used to initialize char *s which can be accessed via s.
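To illustrate that an array initialized this way holds its own copy of the characters, here is a minimal sketch. The function name demo_independent_copies is hypothetical. It also checks that the elements after the terminator are zero, which is what initialization of a larger array from a shorter literal gives you.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if two arrays initialized from the same
   literal are independent and zero-filled past the string. */
int demo_independent_copies(void)
{
    char a[9] = "foobar";
    char b[9] = "foobar";

    a[1] = 'z';                 /* changes a only; b has its own storage */

    return strcmp(a, "fzobar") == 0
        && strcmp(b, "foobar") == 0
        && a[6] == '\0' && a[7] == '\0' && a[8] == '\0';
}
```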

Ian Abbott

It may help to think of C as not allowing you to do anything with arrays except for assisting in a few special cases. C originated when programming languages did little more than help you move individual bytes and “words” (2 or maybe 4 bytes) around and do simple arithmetic and operations with them. With that in mind, let’s look at your examples:

char s[9] = "foobar"; //ok

This is one of the special cases: When you define an array of characters, the compiler will help you initialize it. In a definition, you may provide a string literal, which represents an array of characters, and the compiler will initialize your array with the contents of the string literal.

s[1] = 'z' //also ok

Yes, this just moves the value of one character into one array element.

char s[9]; s = "foobar" //doesn't work. Why?

This does not work because there is no assistance here. s and "foobar" are both arrays, but C has no provision for handling an array as one whole object.

However, although C does not handle an array as a whole object, it does provide some assistance for working with arrays. Since the compiler would not work with whole arrays, programmers needed some other ways to work with arrays. So C was given a feature that, when you used an array in an expression, the compiler would automatically convert it to a pointer to the first element of the array, and that would help the programmer write code to work with elements of the array. We see that in your next example:

char *s = "foobar"; //works

char *s declares s to be a pointer to char. Next, the string literal "foobar" represents an array. Above, we saw that using a string literal to initialize an array was a special case. However, here the string literal is not used to initialize an array. It is used to initialize a pointer, so the special case rules do not apply. In this case, the array represented by the string literal is automatically converted to a pointer to its first element. So s is initialized to be a pointer to the first element of the array containing “f”, “o”, “o”, “b”, “a”, “r”, and a null character.

s[1] = 'z'; //doesn't work

The arrays defined by string literals are intended to be constants. They are “read-only” in the sense that the C standard does not define what happens when you try to modify them. In many C implementations, they are assigned to memory that is read-only because the operating system and the computer hardware do not allow writing to it by normal program means. So s[1] = 'z'; may get an exception (trap) or a warning or error message from the compiler. (Ideally, char *s = "foobar"; would be disallowed because "foobar", being a constant, would have type const char [7]. However, because const did not exist in early C, the types of string literals do not have const.)
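A hedged sketch of how one might work with this in practice: declare the pointer as const char * so the compiler rejects writes, and copy the characters into an array when a modifiable string is needed. The names literal_array_size and modified_copy are invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: size in bytes of the array the literal denotes. */
size_t literal_array_size(void)
{
    return sizeof "foobar";     /* six letters plus the terminating '\0' */
}

/* Hypothetical helper: writes a modified copy of the literal into out,
   which must have room for at least 7 bytes. */
void modified_copy(char *out)
{
    const char *s = "foobar";   /* with const, s[1] = 'z' would not compile */
    strcpy(out, s);             /* copy into writable storage first */
    out[1] = 'z';               /* modifying the copy is fine */
}
```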

char *s; s = "foobar"; //unlike arrays, works here

Here s is a char *, and the string literal "foobar" is automatically converted to a pointer to its first element, and that pointer is a char *, so the assignment is fine.

Eric Postpischil