5

Is it possible to create a modifiable string literal in C++? For example:

char* foo[] = {
    "foo",
    "foo"
};
char* afoo = foo[0];
afoo[2] = 'g'; // access violation

This produces an access violation because the "foo"s are allocated in read-only memory (the .rdata section, I believe). Is there any way to force the "foo"s into writable memory (the .data section)? Even a pragma would be acceptable! (Visual Studio compiler)

I know I can do strdup and a number of other things to get around the problem, but I want to know specifically if I can do as I have asked. :)

Anne
  • 1
    No, according to the Standard, modifying a string literal invokes undefined behavior. The data itself is `const`, even if you have a non-`const` pointer to it. If you do this, you risk problems that vary from run to run, or that appear when you change or even patch your compiler. – John Dibling Jun 16 '10 at 16:51
  • It doesn't make sense to mutate a string **literal**. Maybe it's more clear with int: how do I make `5 = 7` compile? – fredoverflow Jun 16 '10 at 16:59
  • Anne, this is not something you should do, but it is possible in practice, contrary to what the C++ standard says. I wrote an article explaining how to do this on UNIX-like systems that expose a memory protection interface to user-space programs, [check it out here](http://lazarenko.me/2013/05/01/how-constant-is-a-constant/). –  May 02 '13 at 01:16
  • See also: [c - Why do I get a segmentation fault when writing to a "char *s" initialized with a string literal, but not "char s\[\]"? - Stack Overflow](https://stackoverflow.com/questions/164194/why-do-i-get-a-segmentation-fault-when-writing-to-a-char-s-initialized-with-a). The lazarenko.me link is broken; use the web archive: https://web.archive.org/web/20130806024514/http://lazarenko.me/2013/05/01/how-constant-is-a-constant. Side note: attempting to modify the literals as above might cause trouble because the identical strings may share the same address. – user202729 Jun 18 '22 at 10:50

6 Answers

9

Since this is C++, the "best" answer would be to use a string class (std::string, QString, CString, etc. depending on your environment).

To answer your question directly, you're not supposed to modify string literals. The standard says this is undefined behavior. You really do need to duplicate the string one way or another, otherwise you're writing incorrect C++.

Cogwheel
  • 2
    To elaborate, if you're using a framework that has its own string class, you're just going to make your life more difficult by trying to fit `std::string` into the mix. Hence "depending on your environment" – Cogwheel Jun 16 '10 at 17:00
5

I think the closest you can come is to initialize a plain char[] (not a char*[]) with a literal:

char foo[] = "foo";

That'll still perform a copy at some point though.

The only other way around that would be to use system level calls to mark the page that a string literal resides in as writeable. At that point you're not really talking about C or C++, you're really talking about Windows (or whatever system you're running on). It's probably possible on most systems (unless the data is really in ROM, which might be the case on an embedded system for example), but I sure don't know the details.

Oh, and don't forget that in your example:

char* foo[] = {
    "foo",
    "foo"
};

Since the standard (C99 6.4.5/6 "String literals") says:

It is unspecified whether these arrays are distinct provided their elements have the appropriate values.

There's no certainty about whether the 2 pointers in that array will point to the same or separate objects. Nearly all compilers will have those pointers point to the same object at the same address, but they don't have to and some more complicated situations of pointers to string literals might have the compiler coming up with 2 separate identical strings.

You could even have a scenario where one string literal exists 'inside' another:

char* p1 = "some string";
char* p2 = "string";

p2 may well be pointing at the tail end of the string pointed to by p1.

So if you start changing string literals by some hack you can perform on a system, you may end up modifying some 'other' strings unintentionally. That's one of the things that undefined behavior can bring along.

Michael Burr
2

If you store your string in an array you can change it.

There's no way to 'correctly' write to read-only memory.

You could, of course, stop using C-strings.

Edward Strange
  • Correctly is subjective. Correctly according to what? According to hardware, there is a pretty correct and straightforward way to do this, considering there is no such thing as read-only memory in pretty much all of the commodity hardware, really. –  May 02 '13 at 01:19
1

I would not do this, so all I can offer is a nasty, ugly hack you could try: find the page where your constant literal resides and unprotect that page. See the VirtualProtect() function for Win32. However, even if this works, it does not guarantee correct behavior all the time. It's better not to do it.

zerm
1

You could create a multidimensional array of chars:

#include <iostream>

int main(int argc, char** argv)
{
    char foo[][4] = {
        "foo",
        "bar"
    };
    char* afoo = foo[0];
    afoo[2] = 'g';
    std::cout << afoo << std::endl;
}

More verbose way to define the array:

char foo[][4] = {
    {'f', 'o', 'o', '\0'},
    {'b', 'a', 'r', '\0'}
};

Ferdinand Beyer
  • 2
    You can still use string literals, you don't have to specify all the chars as char literals. – Edward Strange Jun 16 '10 at 16:31
  • 2
    Also, Anne should keep in mind that if she does it this way the 4 has to be the size of the longest string +1. Using a string class is really the better option but this is as close to what she seems to want as she'll get. – Edward Strange Jun 16 '10 at 16:40
  • I marked this as correct because it answers what I am trying to do (I realise I shouldn't be trying to do it but, heh). Why it works is another matter... – Anne Jun 16 '10 at 16:47
  • Doing this evokes undefined behavior. You cannot modify string literals. – John Dibling Jun 16 '10 at 16:48
  • It actually doesn't do what you're trying to do. It's still copying the string, but that fact is obscured by the array initialization syntax. – Cogwheel Jun 16 '10 at 16:53
  • 1
    @John: The literal is just used as a short notation to initialize the `char[4]`. It is not the literal that is modified, but the `char[2][4]` array on the stack, which is perfectly valid. – Ferdinand Beyer Jun 16 '10 at 16:53
  • @Cogwheel: That depends on the compiler. The assembly code generated by GCC uses one `movl` instruction to copy the 4-byte string onto the stack for the literal version, whereas it needs 4 `movb` instructions for the verbose version. – Ferdinand Beyer Jun 16 '10 at 17:06
  • Ok, I'm a bit confused. This solution does enable me to update the elements of foo, so it solves my problem. However, in my real application, there are hundreds of elements in foo. So, am I right in thinking (from these comments) that the "foo" and "bar" literals are created, plus the array? (The array is global in my application so it won't be created on the stack, meaning that the solution is using twice as much memory as I'd hoped.) – Anne Jun 16 '10 at 17:17
  • 1
    That will probably depend on the compiler. As Ferdinand pointed out, GCC emits code that directly loads the values into the array rather than copying from a string literal. This means having extra storage for the literal would be unnecessary unless you initialize more than one array with the same literals. – Cogwheel Jun 16 '10 at 17:22
-3

Yes.

   (char[]){"foo"}
John