I was reading a book on C today, and it mentioned that the following was true. I was curious enough about why that I wrote this program to verify it, and I am posting it here so that someone smarter than me can teach me why these two cases differ at runtime.
Specifically, the question is about how a (char *) is handled at runtime depending on whether it points to a string created as a literal or to one created with malloc and populated manually.
Why is the memory used by the literal more protected like this? Also, does the answer explain the meaning of "bus error"?
Here is a program I wrote which asks the user whether they would like to crash or not. It illustrates that the program compiles fine, and it highlights that, in my head, the code in both branches is conceptually identical. That is why I'm here: to understand why they are not.
// demonstrate the difference between initializing a (char *)
// with a literal, vs malloc
// and the mutability of the contents thereafter
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char cause_crash;
    char *myString;

    printf("Cause crash? ");
    if (scanf("%c", &cause_crash) != 1) {
        return 1; // no input was read
    }
    if (cause_crash == 'y') {
        myString = "ab";
        printf("%s\n", myString); // ab
        *myString = 'x'; // CRASH!
        printf("%s\n", myString);
    } else {
        myString = malloc(3 * sizeof(char));
        if (myString == NULL) {
            return 1; // allocation failed
        }
        myString[0] = 'a';
        myString[1] = 'b';
        myString[2] = '\0';
        printf("%s\n", myString); // ab
        *myString = 'x';
        printf("%s\n", myString); // xb
        free(myString);
    }
    return 0;
}
edit: conclusions
There are several good answers below, but I want to summarize what I have come to understand succinctly here.
The basic answer seems to be this:
When a compiler sees a "string literal" being assigned to a (char *) variable, the pointer will point to memory which is static (perhaps actually part of the binary, and usually enforced as read-only by a lower-level system than your runtime). In other words, the memory is probably not dynamically allocated at that point in the program; instead, the pointer is simply set to point to an area of static memory which houses the contents of your literal.
There are a few things I want to call out about this resolution:
1. Optimization may be a possible motive: With my compiler, two different (char *) variables initialized with the same string literal actually point to the same address:
char *myString = "hello";
char *mySecond = "hello"; // the pointers are identical! This is a cool optimization.
2. Interestingly, if the variable is actually an array of chars (instead of a (char *)), this (#1) is not true. This was interesting to me because I was under the impression that (post-compilation) arrays were identical to pointers-to-chars.
char myArString[] = "hello";
char myArSecond[] = "hello"; // the two arrays do NOT share an address
3. To summarize what several answers hinted at: char *myString = "Hello, World!"; does not allocate new memory. It just sets myString to point to memory which already existed, perhaps in the binary, perhaps in a special read-only block of memory, etc.
4. I found through testing that char myString[] = "Hello, World!"; does allocate new memory, I think. What I do know is that the string is mutable when created this way.