-1

I have problems understanding how char* works.

In the example below, struses is called by main(). I created buf to hold a modifiable copy of the const string s1, and then I call sortString() on that copy.

This version makes sense to me as I understand that char[] can be modified:

#include "common.h"
#include <stdbool.h>
void sortString(char string[50]);

bool struses(const char *s1, const char *s2) 
{

    char buf[50];
    strcpy(buf, s1);  // <===== input = "perpetuity";
    sortString(buf);
    printf("%s\n", buf); // prints "eeipprttuy"
    return true;
}

void sortString(char string[50]) 
{
    char temp;
    int n = strlen(string);
    for (int i = 0; i < n - 1; i++)
    {
        for (int j = i + 1; j < n; j++)
        {
            if (string[i] > string[j])
            {
                temp = string[i];
                string[i] = string[j];
                string[j] = temp;
            }
        }
    }
}

However, in this version I deliberately changed the type to char* which is supposed to be read-only. Why do I still get the same result?

#include "common.h"
#include <stdbool.h>
void sortString(char *string);

bool struses(const char *s1, const char *s2)
{

    char buf[50];
    strcpy(buf, s1); 
    sortString(buf);
    printf("%s\n", buf);
    return true;
}

void sortString(char *string)  // <==== changed the type
{
    char temp;
    int n = strlen(string);
    for (int i = 0; i < n - 1; i++)
    {
        for (int j = i + 1; j < n; j++)
        {
            if (string[i] > string[j])
            {
                temp = string[i];
                string[i] = string[j];
                string[j] = temp;
            }
        }
    }
}

This is why I think char * is read-only. I get a bus error when I try to modify read[0]:

char * read = "Hello";
read[0]='B';// <=== Bus error
printf("%s\n", read); 
David
  • 1
    In a **function argument** `char x[N]` is 100% identical to `char *x`. Your `void sortString(char string[50]) { /*...*/ }` is 100% equal to `void sortString(char *string) { /*...*/ }` – pmg Nov 22 '21 at 17:48
  • 2
    Why do you think that `char*` is read-only? – 0x5453 Nov 22 '21 at 17:48
  • [What is array to pointer decay?](https://stackoverflow.com/a/1461449) – 001 Nov 22 '21 at 17:50
  • Updated the post for why I think char * is read only – David Nov 22 '21 at 17:51
  • 2
    I think you are thinking of string literals: `char *p = "Hello";` – 001 Nov 22 '21 at 17:52
  • 1
    `char *string` is not read-only. `const char *string` would be. – Barmar Nov 22 '21 at 17:52
  • Did your last code fragment really print "Hello"?? I would have expected it to either print "Bello", or to crash and print nothing, except perhaps the words "Segmentation fault" or "Bus error" or "[BSOD](https://en.wikipedia.org/wiki/Blue_screen_of_death)". :-) – Steve Summit Nov 22 '21 at 17:55
  • @SteveSummit had problem with my makeFile, should be bus error – David Nov 22 '21 at 17:58
  • But the conclusion is still the same, char* is not modifiable in this case – David Nov 22 '21 at 17:58
  • The bus error surely comes from the assignment, not the `printf` – ikegami Nov 22 '21 at 18:05
  • 1
    Stack Overflow is not supposed to be interactive technical support, but you have now changed what it says about crashing/not crashing multiple times, changing the applicability of various answers. Code that attempts to modify a string literal could crash because the string literal is in read-only memory or could change the memory because the string literal is in modifiable memory or could neither crash nor change the memory because the compiler optimizes the code using the assumption that the contents of the string literal are always the same… – Eric Postpischil Nov 22 '21 at 19:55
  • … Those options might be interesting in a question about the designs and implementations of various C implementations, but they are largely irrelevant to your issue of whether a `char *` is, by itself, modifiable. And answering why your particular C implementation does what it does would require knowing the compiler version you are using and the switches you are using to compile. – Eric Postpischil Nov 22 '21 at 19:57

5 Answers

2

The compiler adjusts a parameter declared with an array type, as in this function declaration

void sortString(char string[50]);

to a pointer to the element type

void sortString(char *string);

So, for example, these function declarations are all equivalent and declare the same function

void sortString(char string[100]);
void sortString(char string[50]);
void sortString(char string[]);
void sortString(char *string);
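
A minimal sketch of that equivalence, assuming a hypothetical function set_first and helper first_after_edit that are not part of the question: the three declarations only compile together because they name the same type, and the single definition writes through the pointer it receives.

```c
#include <string.h>

/* Three declarations of the same function: the array forms are
   adjusted to char *, so the compiler accepts them as compatible. */
void set_first(char string[50], char c);
void set_first(char string[], char c);
void set_first(char *string, char c);

/* The single definition; it writes through the pointer it receives. */
void set_first(char *string, char c)
{
    if (string[0] != '\0')
        string[0] = c;
}

/* Hypothetical helper: copy src into a local array and modify the copy,
   as struses does with buf. */
int first_after_edit(const char *src)
{
    char buf[50];
    strncpy(buf, src, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';   /* strncpy may leave buf unterminated */
    set_first(buf, 'B');          /* buf decays to char * at the call */
    return buf[0];
}
```

Calling first_after_edit("Hello") is safe even though the argument is a string literal, because only the local copy in buf is written to.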

This function

void sortString(char *string)

is called with the character array buf, which holds a copy of the passed array (or of the string literal that s1 points to)

char buf[50];
strcpy(buf, s1);
sortString(buf);

So there is no problem. s1 can be a pointer to a string literal, but the contents of the literal are copied into the character array buf, and it is buf that gets modified.

As for this code snippet

char * read = "Hello";
read[0]='B';
printf("%s\n", read); // <=== still prints "Hello"

it has undefined behavior, because a string literal must not be modified.

From the C Standard (6.4.5 String literals)

7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

Note that in C++, unlike in C, string literals have types of constant character arrays. In C it is also advisable to declare pointers to string literals with the const qualifier, so that an attempt to modify the literal is rejected by the compiler instead of causing undefined behavior, for example

const char * read = "Hello";

By the way, the function sortString performs redundant swaps of elements in the passed string. It is better to declare and define it the following way

// Selection sort
char * sortString( char *s ) 
{
    for ( size_t i = 0, n = strlen( s ); i != n; i++ )
    {
        size_t min_i = i;

        for ( size_t j = i + 1; j != n; j++ )
        {
            if ( s[j] < s[min_i] )
            {
                min_i = j;
            }
        }

        if ( i != min_i )
        {
            char c = s[i];
            s[i] = s[min_i];
            s[min_i] = c;
        }
    }

    return s;
}
Vlad from Moscow
2

char * does not mean read-only. char * simply means pointer to char.

You have likely been taught that string literals, such as "Hello", may not be modified. That is not quite true; a correct statement is that the C standard does not define what happens when you attempt to modify a string literal, and C implementations commonly place string literals in read-only memory.

We can define objects with the const qualifier to say we intend not to modify them and to allow the compiler to place them in read-only memory (although it is not obligated to). If we were defining the C language from scratch, we would specify that string literals are const-qualified, so the pointers that come from string literals would be const char *.

However, when C was first developed, there was no const, and string literals produced pointers that were just char *. The const qualifier came later, and it is too late to change string literals to be const-qualified because of all the old code using char *.

Because of this, it is possible that a char * points to characters in a string literal that should not be modified (because the behavior is not defined). But char * in general does not mean read-only.

Eric Postpischil
2

Your premise that the area pointed to by a char * isn't modifiable is false. This is perfectly fine:

char s[] = "abc";       // Same as: char s[4] = { 'a', 'b', 'c', 0 };
char *p = s;            // Same as: char *p = &(s[0]);
*p = 'A';
printf("%s\n", p);      // Abc

The reason you had a fault is because you tried to modify the string created by a string literal. This is undefined behaviour:

char *p = "abc";
*p = 'A';               // Undefined behaviour
printf("%s\n", p);

One would normally use a const char * for such strings.

const char *p = "abc";
*p = 'A';               // Compilation error.
printf("%s\n", p);

ikegami
1

Regarding

char * read = "Hello";
read[0]='B';
printf("%s\n", read);   // still prints "Hello"

you have tripped over a backward compatibility wart in the C specification.

String constants are read-only. char *, however, is a pointer to modifiable data. The type of a string constant ought to be const char [N] where N is the number of chars given by the contents of the constant, plus one. However, const did not exist in the original C language (prior to C89). So there was, in 1989, a whole lot of code that used char * to point to string constants. So the C committee made the type of string constants be char [N], even though they are read-only, to keep that code working.

Writing through a char * that points to a string constant triggers undefined behavior; anything can happen. I would have expected a crash, but the write getting discarded is not terribly surprising either.

In C++ the type of string constants is in fact const char [N] and the above fragment would have failed to compile. Some C compilers have an optional mode you can turn on that changes the type of string constants to const char [N]; for instance, GCC and clang have the -Wwrite-strings command line option. Using this mode for new programs is a good idea.
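
As a rough illustration of the type question, here is a sketch that asks the compiler how it types a string literal. It assumes a compiler with C11 _Generic support; the macro TYPE_NAME and the function literal_type are hypothetical names introduced for this example. Under default options a conforming C compiler should select the char * branch, and a mode that makes literals const (such as the -Wwrite-strings option mentioned above) is expected to select the const char * branch instead.

```c
/* Maps the type of an expression (after array-to-pointer decay)
   to a readable name. */
#define TYPE_NAME(x) _Generic((x),          \
        char *:       "char *",             \
        const char *: "const char *",       \
        default:      "something else")

/* Hypothetical helper: reports how this compiler types a string literal. */
const char *literal_type(void)
{
    return TYPE_NAME("Hello");
}
```

Printing literal_type() with and without the option is a quick way to check which behavior a given build is using.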

zwol
  • OP edited their post. There is now no assertion that any code that attempts to modify a string literal crashes. – Eric Postpischil Nov 22 '21 at 18:07
  • @EricPostpischil The version of the post that I saw, asserted that attempts to modify a string literal were silently ignored. This appears to also be what it says now. Crashing was _my_ expectation. – zwol Nov 22 '21 at 19:28
1

Your long examples can be reduced to your last question.

This is why I think char * is read only, get bus error after attempt to modify read[0]

char * read = "Hello";
read[0]='B';
printf("%s\n", read); // <=== Bus error

"Hello" is a string literal. The attempt to modify it manifested itself as the bus error.

Your pointer references memory that should not be modified.

How to sort it out? You need to define a pointer that references a modifiable object

char * read = (char []){"Hello"};
read[0]='B';
printf("%s\n", read); 

So, as you can see, declaring a pointer as char * does not by itself make the memory it points to modifiable.
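
A small sketch of two ways to get a modifiable object to point at. The function names edit_compound_literal and editable_copy are hypothetical and only for illustration. A compound literal inside a function only lives until the enclosing block ends, so a heap copy is one option when the string has to outlive that block.

```c
#include <stdlib.h>
#include <string.h>

/* A compound literal is an unnamed, modifiable array. Inside a function it
   lives only until the enclosing block ends, so it is used locally here. */
int edit_compound_literal(void)
{
    char *read = (char []){"Hello"};
    read[0] = 'B';
    return strcmp(read, "Bello") == 0;   /* 1 when the write took effect */
}

/* Hypothetical helper for when the copy must outlive the caller's block:
   returns a heap copy the caller may modify and must free(). */
char *editable_copy(const char *src)
{
    size_t len = strlen(src) + 1;        /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, src, len);
    return copy;
}
```

In both cases the literal "Hello" itself is only read, never written.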

0___________