0

I am new to C and I want to perform this task: declare and initialize a string, then reassign each of its elements to a new value.

Writing the code in this way:

char *str = "geeksforgeeks\0";

for (int i = 0; str[i] != '\0'; ++i) {
    str[i] = 'a';
}

throws a segmentation fault.

But if I write the code in this manner:

char string[] = "geeksforgeeks\0";
char *str = string;

for (int i = 0; str[i] != '\0'; ++i) {
    str[i] = 'a';
}

the program behaves correctly.

Also this code:

char str[] = "geeksforgeeks\0";

for (int i = 0; str[i] != '\0'; ++i) {
    str[i] = 'a';
}

behaves correctly.

What is the difference between these versions? Shouldn't they be equivalent?

Alex

4 Answers

6

char *str = "geeksforgeeks\0";

This string is stored in read-only* memory, so you can't modify it. Also, the explicit null terminator there is redundant, because the compiler already appends one to every string literal.

That is not the case with the array you defined, which is why it works. With an array, the string literal is copied into the memory where the array resides, and you can modify the contents of that array. So by using this

char *str = string;

you point to the first element of the array, which, as mentioned, is modifiable (as are all the other elements of the array).

*String literals may not actually be stored in read-only memory; that depends on the platform. Either way, you are not allowed to modify them.

Giorgi Moniava
  • But from what I know this should be a read-only string: const char *str = "geeksforgeeks\0"; – Alex Dec 01 '15 at 19:40
  • @Alex: It is but it gets copied to array and you can modify the array (2nd case). – Giorgi Moniava Dec 01 '15 at 19:41
  • I think I am starting to understand. – Alex Dec 01 '15 at 19:53
  • @Georgi But if in the second case it gets copied to another memory location from where you can access it using an array, then we have two copies of the same string in memory: the string literal from the read-only zone and the copy that you can access it using an array. Is that true? – Alex Dec 01 '15 at 20:02
  • @Alex: I am not entirely sure. btw. I didn't mean in the second case - when assigned to array, it is copied from read only memory. It might be in second case it is not stored in read only memory at all, and directly stored in array. Point is in second case, you are just modifying array content. Please correct that in your question too. – Giorgi Moniava Dec 01 '15 at 20:10
  • @Georgi: Thanks, I have corrected it. – Alex Dec 01 '15 at 20:16
  • @Alex Just think of it in a way that in second case, you are really modifying array elements - which is allowed. – Giorgi Moniava Dec 01 '15 at 20:18
6

If you have:

char *str = "geeksforgeeks\0";

the string is (usually) stored in read-only memory and you get a segmentation fault when you try to modify it. (The \0 is really not needed; you have two null bytes at the end of the string.)

The simplest fix is to use an array instead of a constant string (which is basically what you do in the second working case):

char str[] = "geeksforgeeks";

Note that if you do point at a string literal, you should really declare the pointer like this, since the literal is not modifiable:

const char *str = "geeksforgeeks";
Jonathan Leffler
  • Nit - string literals don't *have* to be stored in read-only memory. They are on common platforms, but the standard does not require it. Attempting to modify a string literal invokes undefined behavior. – John Bode Dec 01 '15 at 19:50
2

The reason is simple.

In the first example, you have a pointer to a static string. That's why you get a segmentation fault.

char *str = "Test";

This is practically a constant string. But in 2nd example, it is a variable that you change.

// You have a variable here
char str_array[] = "Test";
// Now you have a pointer to str_array
char *str = str_array;
Afshin
1

You’ve hit on a bit of ugly legacy baggage. When you write the literal "geeksforgeeks\0", the compiler stores it as an array of characters and gives you a pointer to its first element. If you later use the string "geeksforgeeks\0" again, the compiler is allowed to point both references to the same array. This only works if you can’t modify the array; otherwise, after changing the first character through one reference, fputs("geeksforgeeks\0", stdout); would print aeeksforgeeks. (Fortran can top this: on at least one compiler, you could pass the constant 1 by name to a function, set it equal to -1, and all your loops would then run backwards.) On the other hand, the C standard doesn’t say that modifying string literals won’t work, and there’s some old code that did. It’s undefined behavior.

When you allocate an array to hold the string, you’re creating a unique copy, and that can be modified without causing errors elsewhere.

So why aren’t string literals const char * instead of char *? Early versions of C didn’t have the const keyword, and the standards committee didn’t want to break that much old code. However, you can and should declare pointers to string literals as const char* s = "geeksforgeeks\0"; so the compiler will stop you from shooting yourself in the foot.

Davislor
  • So they try to save memory? If I create 10000 instances of "geeksforgeeks", is there basically only one copy in memory, with the 10000 variables pointing to the same thing? I am coming from Java and there you have something similar: strings are immutable objects. However, if you try to modify one of them you do not get a segmentation fault; you get another fresh immutable string. – Alex Dec 01 '15 at 20:11
  • The compiler is allowed, but not required, to make every copy of the string literal point to the same place. C is too low-level a language to do that kind of copy on modification. The OS might be able to do that for pages of zeroes on a hardware level. You pretty much get all the disadvantages of both with none of the advantages of either: you can’t be sure memory will be shared or not, or that comparing "Hello" == "Hello" will work or not, or that it can or can’t be modified. But you also aren’t guaranteed to get a warning you’re doing something unsafe. – Davislor Dec 01 '15 at 20:56
  • This is where C gets really crazy...if you make a change instead of expecting two straightforward outcomes: works/does not work, there are three outcomes: works/does not work/undefined behavior. – Alex Dec 01 '15 at 21:07
  • Well, every non-trivial, portable language has undefined behavior. In this case, it’s cruft. C could have specified to begin with that modifying a string literal does not work, but K&R didn’t forbid it at first, and you gave a good example of how programmers did it and expected it to work. So there’s code out in the wild that does, and also compilers that won’t let you, and the language standard allows both. – Davislor Dec 01 '15 at 22:01