-3

To create a string that I can modify, I can do something like this:

// Creates a variable string via array
char string2[] = "Hello";
string2[0] = 'a'; // this is ok

And to create a constant string that cannot be modified:

// Creates a constant string via a pointer
char *string1 = "Hello";
string1[0] = 'a'; // This will give a bus error

My question then is how would one modify a constant string (for example, by casting)? And, is that considered bad practice, or is it something that is commonly done in C programming?

Rob
  • If I'm not mistaken, `string2` initialized to "Hello" is `const`. For sure not `const` if initialized through `strcpy` or `strdup`. – 1737973 Sep 06 '19 at 22:00
  • @john You are mistaken. `string2` is not const. – Christian Gibbons Sep 06 '19 at 22:01
  • Shared, you cannot do what you are proposing. String literals are immutable. – Christian Gibbons Sep 06 '19 at 22:02
  • @john: `char string2[] = "Hello";` initializes `string2` (in the C model of computing) by copying the contents of `"Hello"` into `string2`. This does not make `string2` `const`. – Eric Postpischil Sep 06 '19 at 22:02
  • Ugh. I think I meant `"Hello"`. `"Hello"` is `const`, isn't it? – 1737973 Sep 06 '19 at 22:03
  • XY problem? What are you trying to accomplish with this or is this a language lawyer idle thought? I used self-modifying 6502 assembly code for faster line drawing a long time ago but in a modern OS constant program and data areas are not writable. – Dave S Sep 06 '19 at 22:05
  • @Shared: You should not “modify a constant string (for example, by casting).” While the C standard does not prohibit you from attempting this (it is a **voluntary** standard), it does not define the behavior of attempting to do so. Regardless of casting, attempting to write a value to any element of a string literal has behavior not defined by the C standard. – Eric Postpischil Sep 06 '19 at 22:07
  • What's the meaning of the word "constant" if you can modify it anyway? – S.S. Anne Sep 06 '19 at 23:10

4 Answers

4

By definition, you cannot modify a constant. If you want to get the same effect, make a non-constant copy of the constant and modify that.
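As a minimal sketch of "make a non-constant copy and modify that", the helper below allocates a writable buffer and copies the string into it. The name `copy_string` is hypothetical and used only for illustration; it uses `malloc` and `memcpy` rather than `strdup` on the assumption that only the standard C library is available.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a standard function): returns a writable,
   heap-allocated copy of src, or NULL if allocation fails.
   The caller owns the result and must free() it. */
char *copy_string(const char *src)
{
    size_t size = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, src, size);     /* copies the terminator as well */
    return copy;
}
```

A caller could write `char *s = copy_string("Hello");`, check the result against `NULL`, assign `s[0] = 'a';`, and call `free(s);` when done. The original literal is never written to.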

David Schwartz
  • thanks for this. Could you please show a code example of what you mean by `make a non-constant copy of the constant and modify that` ? –  Sep 06 '19 at 22:05
  • 2
    It can be as simple as `string1 = strdup(string1); string1[0] = 'a';`. Don't forget to `free` it when you're done. – David Schwartz Sep 06 '19 at 22:06
  • Or if you don't want to use dynamic memory, and if you don't mind fixed-size buffers and the possibility of overflow, simply `char string2[20]; strcpy(string2, string1); string2[0] = 'a';`. – Steve Summit Sep 06 '19 at 22:31
  • @SteveSummit It would be better to use `strncpy` there. – S.S. Anne Sep 06 '19 at 23:15
  • 1
    @JL2210 Not really. If you're not worried about buffer overflow, `strcpy` is fine. If you are worried about buffer overflow, you really want to use something else. `strncpy` is basically useless. (Yes, you *can* use it, but it's just too much trouble.) – Steve Summit Sep 06 '19 at 23:20
2

You cannot modify the contents of a string literal in a safe or reliable manner in C; it results in undefined behavior. From the C11 standard draft section 6.4.5 p7 concerning string literals:

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
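The first sentence of that passage matters in practice: two identical literals may or may not share storage, so a write to one could, on some implementations, appear to change the other. The sketch below only observes whether sharing happens and never writes to a literal. The function name `literals_share_storage` is made up for this example, and the result is implementation-dependent.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether this implementation stored two
   identical string literals at the same address. The standard leaves
   this unspecified, so either answer is valid. */
bool literals_share_storage(void)
{
    const char *first = "Hello";
    const char *second = "Hello";

    return first == second;   /* compares addresses, not contents */
}
```

Declaring the pointers as `const char *` is a deliberate choice here: it lets the compiler reject accidental writes through them.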

Christian Gibbons
  • 2
    Sometimes you can modify string literals. The C standard merely says **it** does not define the behavior. It does not say you **cannot** or that an implementation is required to stop you. People should understand this, in case, among other reasons, they are ever in a situation where a string literal is modified (incorrectly), and they need to figure out what happened in order to debug the program. – Eric Postpischil Sep 06 '19 at 22:09
  • should be: cannot "safely" or "reliably" perhaps? – Dave S Sep 06 '19 at 22:10
  • @EricPostpischil Would it be better if I stressed that it cannot be *legally* done? – Christian Gibbons Sep 06 '19 at 22:10
  • @ChristianGibbons: It is not illegal; nothing in the C standard prohibits you from trying or prohibits an implementation from allowing it—or even supporting it as a defined extension. It is simply not defined by the standard. It is correct to say it is not portable or that it is not reliable without some guarantee from the C implementation. – Eric Postpischil Sep 06 '19 at 22:13
  • @EricPostpischil Do you not consider undefined behavior to be illegal? Considering that undefined behavior invalidates the entire program, I'd consider it to be illegal. – Christian Gibbons Sep 06 '19 at 22:14
  • @ChristianGibbons: Absolutely undefined behavior is not illegal. Almost all commercial programs use behavior not defined by the C standard—they use services from the operating system (not defined by the standard), communicate with devices (not defined by the standard), use compiler extensions (not defined by the standard), link with non-C libraries (not defined by the C standard), and so on. All you can do in a strictly conforming C program is abstract computing. – Eric Postpischil Sep 06 '19 at 22:16
  • Actually, I would say that undefined behavior is not illegal *by definition*. Illegal behavior is stuff the compiler is obligated to warn you about. Undefined behavior is, by definition, the stuff the compiler is not obligated to warn you about. Yes, you should do everything you can to avoid most kinds of undefined behavior, but that's because of your own programming guidelines, not because of what the C Standard says is or isn't legal. – Steve Summit Sep 06 '19 at 22:25
  • @DaveS I believe your wording choice may have been better. Updated. – Christian Gibbons Sep 06 '19 at 22:30
  • @ChristianGibbons: The C standard is somewhat like a city that offers bus service, a water supply, public streets, and so on as long as you are in the city. It is in no way illegal to leave the city, but they do not offer any services outside the city. – Eric Postpischil Sep 06 '19 at 23:06
2

how would one modify a constant string (for example, by casting)?

If by this you mean, how would one attempt to modify it, you don't even need a cast. Your sample code was:

char *string1 = "Hello";
string1[0] = 'a';         // This will give a bus error

If I compile and run it, I get a bus error, as expected, just as you did. But if I compile with -fwritable-strings, which causes the compiler to put string constants in read/write memory, it works just fine. (Not every compiler offers this option; GCC, for example, removed it in version 4.0.)

I suspect you were thinking of a slightly different case. If you write

const char *string1 = "Hello";
string1[0] = 'a';         // This will give a compilation error

the situation changes: you can't even compile the code. You don't get a Bus Error at run time; you get a fatal error along the lines of "read-only variable is not assignable" at compile time.

Having written the code this way, one can attempt to get around the const-ness with an explicit cast:

((char *)string1)[0] = 'a';

Now the code compiles, and we're back to getting a Bus Error. (Or, with -fwritable-strings, it works again.)

is that considered bad practice, or is it something that is commonly done in C programming

I would say it is considered bad practice, and it is not something that is commonly done.

I'm still not sure quite what you're asking, though, or if I've answered your question. There's often confusion in this area, because there are typically two different kinds of "constness" that we're worried about:

  1. whether an object is stored in read-only memory

  2. whether a variable is not supposed to be modified, due to the constraints of a program's architecture

The first of these is enforced by the OS and by the MMU hardware. It doesn't matter what programming-language constructs you did or didn't use -- if you attempt to write to a read-only location, it's going to fail.

The second of these has everything to do with software engineering and programming style. If a piece of code promises not to modify something, that promise may let you make useful guarantees about the rest of the program. For example, the strlen function promises not to modify the string you hand it; all it does is inspect the string in order to compute its length.

Confusingly, in C at least, the const keyword has mostly to do with the second category. When you declare something as const, it doesn't necessarily (and in fact generally does not) cause the compiler to put that object into read-only memory. All it does is let the compiler give you warnings and errors if you break your promise -- if you accidentally attempt to modify something that elsewhere you declared as const. (And because it's a compile-time thing, you can also readily "cheat" and turn off this kind of constness with a cast.)

But there is read-only memory, and these days, compilers typically do put string constants there, even though (equally confusingly, but for historical reasons) string constants do not have the type const char [] in C. But since read-only memory is a hardware thing, you can't "turn it off" with a cast.
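To illustrate the second kind of constness, here is a small sketch in which `const` acts purely as a compile-time promise about one access path while the underlying object stays writable. The function names `count_char` and `demo_const_view` are hypothetical and exist only for this example.

```c
#include <stddef.h>

/* Hypothetical helper: the const in the parameter type is a promise
   that this function only reads the string, much like strlen. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (*s == c)
            n++;
    return n;
}

/* Shows that const restricts an access path, not the storage itself. */
char demo_const_view(void)
{
    char buffer[] = "Hello";     /* modifiable array, initialized by copy */
    const char *view = buffer;   /* read-only view of the same storage */

    /* view[0] = 'a';   would be rejected at compile time */
    buffer[0] = 'a';             /* fine: the array itself is not const */

    return view[0];              /* the change is visible through the view */
}
```

Nothing in this sketch touches a string literal, so no read-only memory is involved; the only protection at work is the compiler checking the `const` qualifier.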

Steve Summit
  • I don't think the immutability of string literals has anything to do with some kind of read-only hardware. I think it has to do with the values simply put in a memory location that isn't supposed to be written to during execution of the program. The .rodata section of an ELF, for example. – Christian Gibbons Sep 06 '19 at 22:39
  • bus error? I would expect a segmentation fault. I also think `-fwritable-strings` was removed from GCC. – S.S. Anne Sep 06 '19 at 23:11
  • @ChristianGibbons My wording was a little vague, but the point is that it's typically the MMU hardware that actually detects the illegal write. But you're right, it's not hardware ROM. – Steve Summit Sep 06 '19 at 23:17
  • @JL2210 I don't even know what to expect any more. "Bus error" is what the OP reported, and yes, it's also (specifically: "Bus error: 10") what I got. – Steve Summit Sep 06 '19 at 23:18
0

Attempting to modify a string literal is undefined behavior. You may get a bus error, as in your case, the write may fail silently, or it may even appear to succeed. That is what undefined behavior means: the language makes no promises about what happens.

You could reassign the pointer (losing your reference to the string "Hello"):

char *s1 = "Hello";
printf("%s ", s1);
s1 = "World";
printf("%s\n", s1);
Govind Parmar
  • Sometimes you can modify string literals. The C standard merely says **it** does not define the behavior. It does not say you **cannot** or that an implementation is required to stop you. People should understand this, in case, among other reasons, they are ever in a situation where a string literal is modified (incorrectly), and they need to figure out what happened in order to debug the program. – Eric Postpischil Sep 06 '19 at 22:09
  • Or in other words, "You may get a bus error, or the program may not even indicate that the write failed, or the write may succeed." Once upon a time, string literals typically *were* modifiable, and even today, under many compilers you can still request this legacy behavior with `-fwritable-strings` if you want to. – Steve Summit Sep 06 '19 at 22:29