I believe you ask about the reason why literals are placed in
read-only memory, not about technical details of linker doing this and
that or legal details of a standard forbidding such and such.
When modification of string literals works, it leads to subtle bugs
even in the absence of string merging (which we have reasons to
disallow if we decided to permit modification). When you see code like
char *str="Hello";
.../* some code, but str and str[...] are not modified */
printf("%s world\n", str);
it's a natural conclusion that you know what's going to be printed,
because str
(and its content) were not modified in a particular
place, between initialization and use.
However, if string literals are writable, you don't know it any
more: str[0] could be overwritten later, in this code or inside a
deeply nested function call, and when the code is run again,
char *str="Hello";
won't guarantee anything about the content of str
anymore. As we
expect, this initialization is implemented as moving the address known
in link time into a place for str
. It does not check that str
contains "Hello" and it does not allocate a new copy of it. However,
we understand this code as resetting str
to "Hello". It's hard to
overcome this natural understanding, and it's hard to reason about the
code where it is not guaranteed. When you see an expression like
x+14
, what if you had to think about 14 being possibly overwritten
in other code, so it became 42? The same problem with strings.
That's the reason to disallow modification of string literals, both in
the standard (with no requirement to detect the failure early) and in
actual target platforms (providing the bonus of detecting potential
bugs).
I believe that many attempts to explain this thing suffer from the
worst kind of circular reasoning. The standard forbids writing to
literals because the compiler can merge strings, or they can be placed
in read-only memory. They are placed in read-only memory to catch the
violation of the standard. And it's valid to merge literals because
the standard forbids... is it a kind of explanation you asked for?
Let's look at other
languages. Common Lisp standard
makes modification of literals undefined behaviour, even though the
history of preceding Lisps is very different with the history of C
implementations. That's because writable literals are logically
dangerous. Language standards and memory layouts only reflect that
fact.
Python language has exactly one place where something resembling
"writing to literals" can happen: parameter default values, and this
fact confuses people all the time.
Your question is tagged C++
, and I'm unsure of its current state
with respect to implicit conversion to non-const char*
: if it's a
conversion, is it deprecated? I expect other answers to provide a
complete enlightenment on this point. As we talk of other languages
here, let me mention plain C. Here, string literals are not const,
and an equivalent question to ask would be why can't I modify string
literals (and people with more experience ask instead, why are
string literals non-const if I can't modify them?). However, the
reasoning above is fully applicable to C, despite this difference.