30

While coding a simple function to remove a particular character from a string, I ran into this strange issue:

#include <stdio.h>

void str_remove_chars(char *str, char to_remove)
{
    if(str && to_remove)
    {
       char *ptr = str;
       char *cur = str;
       while(*ptr != '\0')
       {
           if(*ptr != to_remove)
           {
               if(ptr != cur)
               {
                   cur[0] = ptr[0];
               }
               cur++;
           }
           ptr++;
       }
       cur[0] = '\0';
    }
}

int main(void)
{
    setbuf(stdout, NULL);
    {
        char test[] = "string test"; // stack allocation?
        printf("Test: %s\n", test);
        str_remove_chars(test, ' '); // works
        printf("After: %s\n",test);
    }
    {
        char *test = "string test";  // non-writable?
        printf("Test: %s\n", test);
        str_remove_chars(test, ' '); // crash!!
        printf("After: %s\n",test);
    }

    return 0;
}

What I don't get is why the second test fails. To me, the notation char *ptr = "string"; looks equivalent to char ptr[] = "string";.

Isn't that the case?

codaddict
Gui13
  • Very good article about this topic: http://eli.thegreenplace.net/2009/10/21/are-pointers-and-arrays-equivalent-in-c/ – jyz Apr 10 '13 at 17:03
  • Read: [Difference between `char *str` and `char str[]` and how both stores in memory?](http://stackoverflow.com/questions/15177420/what-does-sizeofarray-return/15177499#15177499) – Grijesh Chauhan Sep 06 '13 at 05:36

6 Answers

52

The two declarations are not the same.

char ptr[] = "string"; declares a char array of size 7 and initializes it with the characters s, t, r, i, n, g and \0. You are allowed to modify the contents of this array.

char *ptr = "string"; declares ptr as a char pointer and initializes it with the address of the string literal "string", which must be treated as read-only. Modifying a string literal is undefined behavior. What you saw (a segfault) is one manifestation of that undefined behavior.
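
The difference can be sketched in a few lines. This is only an illustration: set_first and array_vs_pointer are hypothetical names, not part of the question's code. Declaring the pointer as const char * is an added suggestion that lets the compiler catch accidental writes to the literal.

```c
/* Hypothetical helper (not from the question): overwrites the first
   character of a writable, non-empty buffer. */
void set_first(char *buf, char c)
{
    if (buf && buf[0] != '\0')
        buf[0] = c;
}

/* Returns 1 if the write went to the array and the pointer could be
   re-pointed at it afterwards. */
int array_vs_pointer(void)
{
    char arr[] = "string";       /* 7 chars of writable storage, initialized from the literal */
    const char *lit = "string";  /* points at the literal itself; const documents that it is read-only */

    set_first(arr, 'S');         /* fine: arr is an ordinary local array */
    /* set_first(lit, 'S');         diagnosed by the compiler: it discards const */

    lit = arr;                   /* the pointer itself may be re-pointed */
    return arr[0] == 'S' && lit[0] == 'S';
}
```

Calling array_vs_pointer() exercises only the safe path; the commented-out call is the one that corresponds to the crash in the question.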

codaddict
  • And a sizeof(ptr) will give different results too for the different declarations. The first one will return the length of the array including the terminating null character. The second will return the length of a pointer, usually 4 or 8. – Prof. Falken Oct 05 '10 at 12:53
  • It's also true in the second case that the contents of ptr can be changed. But the contents are the pointer to the literal, not the characters. – Darron Oct 05 '10 at 13:10
  • 3
    +1, great answer. It is also true and important to understand that with `char *ptr = "string";` the `ptr` can be pointed at something else and can therefore be 'changed' in what it is pointing at, but the characters of `"string"` are a literal and cannot change. – dawg Oct 05 '10 at 16:07
  • It would also be worth mentioning the performance issues. Declaring an initialized automatic array variable will fill the entire array contents every time the variable comes into scope. Declaring an initialized automatic pointer variable will simply assign the pointer (a single word write) when the variable comes into scope. If the string is long or the block is entered often (like each iteration of a loop), the difference could be very significant! – R.. GitHub STOP HELPING ICE Oct 06 '10 at 06:27
  • @AmigableClarkKant, actually, `sizeof(ptr)` is not the length of the array unless `ptr` is declared as a char array. If `ptr` is defined as an int array with 3 elements, `sizeof(ptr)` will return `3 * sizeof(int)`. – 爱国者 Jan 24 '12 at 08:20
  • @爱国者 Yes, patriot I know. :-) I use that property extensively in my own code. – Prof. Falken Jan 24 '12 at 08:48
6

Strictly speaking, a declaration of char *ptr only guarantees you a pointer to the character type. It is not unusual for the string literal to be placed in a read-only part of the compiled application (such as the code segment), which some operating systems protect against writes. The problem is that you are assuming the pre-defined string is writable when, in fact, you never explicitly created memory for that string yourself. Some combinations of compiler and operating system may allow what you attempted, but you cannot rely on it.

On the other hand, the declaration char test[] by definition allocates readable and writable memory for the entire array of characters, on the stack in this case.
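
One way to create the memory explicitly is to copy the literal into a buffer you own before modifying it. The sketch below assumes a hypothetical wrapper, remove_chars_copy, around a condensed version of the question's str_remove_chars; the buffer size and return codes are illustrative choices.

```c
#include <stddef.h>
#include <string.h>

/* Same algorithm as str_remove_chars in the question, condensed. */
void str_remove_chars(char *str, char to_remove)
{
    if (!str || !to_remove)
        return;
    char *cur = str;
    for (char *ptr = str; *ptr != '\0'; ptr++) {
        if (*ptr != to_remove)
            *cur++ = *ptr;      /* keep this character */
    }
    *cur = '\0';
}

/* Hypothetical wrapper: copies text into caller-owned storage first,
   so the literal itself is never written to. Returns 0 on success,
   -1 if an argument is NULL or text does not fit in out. */
int remove_chars_copy(const char *text, char to_remove,
                      char *out, size_t out_size)
{
    if (!text || !out || strlen(text) >= out_size)
        return -1;
    strcpy(out, text);          /* safe: length was checked above */
    str_remove_chars(out, to_remove);
    return 0;
}
```

With char out[32]; a call such as remove_chars_copy("string test", ' ', out, sizeof out) leaves the literal alone and writes the result into out.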

PP.
3

As far as I remember

char ptr[] = "string";

creates a copy of "string" on the stack, so this one is mutable.

The form

char *ptr = "string";

is just backwards compatibility for

const char *ptr = "string";

and you are not allowed to modify its contents (doing so is undefined behavior). The compiler may place such strings in a read-only section of memory.

DerKuchen
3

char *test = "string test"; is wrong; it should have been const char *. This code compiles only for backward-compatibility reasons. The string literal it points to may live in read-only memory, and any attempt to write to it invokes undefined behavior. On the other hand, char test[] = "string test" creates a writable character array on the stack. It is like any other regular local variable that you can write to.

Naveen
  • 1
    I wouldn't go so far as to say it's wrong. You might want to later have `test` point to a modifiable string, and keep a flag (in another variable) indicating whether it's been replaced with something modifiable. Still, in most cases it's probably good practice to use `const` there. – R.. GitHub STOP HELPING ICE Oct 06 '10 at 06:24
0
char *str = strdup("test");
str[0] = 'r';

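is proper code and creates a mutable string. strdup allocates memory on the heap, copies "test" into it, and returns a pointer to that copy, which is stored in str. The memory should be released with free() when it is no longer needed.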
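
Note that strdup comes from POSIX and was only added to the C standard in C23, so it may be unavailable in strictly conforming older environments. A minimal stand-in, assuming a hypothetical name dup_string, could look like this:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical portable stand-in for strdup: returns a heap copy of s,
   or NULL if s is NULL or the allocation fails.
   The caller owns the result and must free() it. */
char *dup_string(const char *s)
{
    if (!s)
        return NULL;
    size_t n = strlen(s) + 1;   /* include the terminating '\0' */
    char *copy = malloc(n);
    if (copy)
        memcpy(copy, s, n);
    return copy;
}
```

The result is used like the strdup version above: check it for NULL, modify it freely, and pass it to free() when done.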

Ravi Chandra
0

Good answer @codaddict.

Also, sizeof(ptr) gives different results for the two declarations.

For the first one, the array declaration, it yields the size of the array in bytes, including the terminating null character.

For the second one, char *ptr = "a long text...";, it yields the size of a pointer, usually 4 or 8 bytes.
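
A small check of this, with the hypothetical function name sizeof_differs and a const-qualified pointer added for safety:

```c
#include <string.h>

/* Returns 1 when sizeof behaves as described: the array reports its
   full size, while the pointer reports only the size of a pointer. */
int sizeof_differs(void)
{
    char arr[] = "string";        /* 6 characters plus '\0' */
    const char *ptr = "string";   /* just an address */

    return sizeof arr == strlen(arr) + 1   /* 7 bytes */
        && sizeof ptr == sizeof(char *);   /* platform dependent, often 4 or 8 */
}
```

To get the length of the text behind a pointer, use strlen(ptr) rather than sizeof.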

Prof. Falken