I tried to google this topic, but no one explains it clearly. I tried the code below:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char * argv[]){

    char * p1 = "dddddd";
    const char * p2 = "dddddd";
    char p3[] = "dddddd";
    char * p4 =(char*)malloc(sizeof("dddddd")+1);
    strcpy(p4, "dddddd");
    //*(p1+2) = 'b'; // test_1
    //Output >>  Bus error: 10

    // *(p2+2) = 'b'; // test_2
    // Output >> char_point.c:11:13: error: read-only variable is not assignable
    *(p3+2) = 'b'; // test_3
    // Output >>
    //d
    //dddddd
    //dddddd
    //ddbddd

    *(p4+2) = 'k'; // test_4
    // Output >>
    //d
    //dddddd
    //dddddd
    //ddbddd
    //ddkddd

    printf("%c\n", *(p1+2));
    printf("%s\n", p1);
    printf("%s\n", p2);
    printf("%s\n", p3);
    printf("%s\n", p4);

    return 0;
}

I tried these tests, but only test_3 and test_4 pass. I know that const char *p2 is read-only, because it points to a constant value, but I don't know why p1 can't be modified. Which section of memory is it laid out in? BTW, I compiled it on my Mac with GCC.

I compiled it to assembly with gcc -S and got this:

.section    __TEXT,__text,regular,pure_instructions
.macosx_version_min 10, 13
.globl  _main
.p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## BB#0:
    pushq   %rbp
Lcfi0:
    .cfi_def_cfa_offset 16
Lcfi1:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Lcfi2:
    .cfi_def_cfa_register %rbp
    subq    $48, %rsp
    movl    $8, %eax
    movl    %eax, %ecx
    leaq    L_.str(%rip), %rdx
    movl    $0, -4(%rbp)
    movl    %edi, -8(%rbp)
    movq    %rsi, -16(%rbp)
    movq    %rdx, -24(%rbp)
    movq    %rdx, -32(%rbp)
    movl    L_main.p3(%rip), %eax
    movl    %eax, -39(%rbp)
    movw    L_main.p3+4(%rip), %r8w
    movw    %r8w, -35(%rbp)
    movb    L_main.p3+6(%rip), %r9b
    movb    %r9b, -33(%rbp)
    movq    %rcx, %rdi
    callq   _malloc
    xorl    %r10d, %r10d
    movq    %rax, -48(%rbp)
    movl    %r10d, %eax
    addq    $48, %rsp
    popq    %rbp
    retq
    .cfi_endproc

    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "dddddd"

L_main.p3:                              ## @main.p3
    .asciz  "dddddd"


.subsections_via_symbols

For every pointer I declared, I want to know which section it is in.

Frank AK
  • Not to put too fine a point on it, it is undefined behavior to modify a string literal in C. – ad absurdum Jan 31 '18 at 01:54
  • But why can `test_3` and `test_4` be modified? – Frank AK Jan 31 '18 at 01:57
  • `p3[]` is an array of `char`, which you _can_ modify (here, `char p3[] = "dddddd";` _initializes_ `p3` with the contents of the string literal, but `p3` itself is not a string literal). `p4` points to a dynamic allocation of `char`, which you can also modify. Note that on some old systems string literals _could_ be modified, but the C Standard specifies that this is undefined behavior. – ad absurdum Jan 31 '18 at 02:00
  • `sizeof("dddddd")` is not the same as `strlen("dddddd")`: it yields 7, the size of the array including the terminating `'\0'`, so the `+1` is not needed. – nemequ Jan 31 '18 at 02:03
  • `p1` is pointing to read-only memory, even if you didn't use `const`. All string literals are read-only. – Pablo Jan 31 '18 at 02:09
  • But this answer doesn't make clear why, or which section of the memory layout it is in. @Pablo – Frank AK Jan 31 '18 at 02:10
  • @FrankAK what? I don't understand you – Pablo Jan 31 '18 at 02:11
  • @Pablo I have updated my question with some asm code, output from `gcc -S`. I want to know where each pointer sits in the memory layout, as my asm code shows. – Frank AK Jan 31 '18 at 02:27

3 Answers


"Why p1 can't be modified?"

Roughly speaking, p1 points to a string literal, and attempts to modify string literals cause undefined behavior in C.

More specifically, according to §6.4.5 ¶6 of the C11 Standard, string literals are:

used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char....

Concerning objects with static storage duration, §5.1.2 ¶1 states that

All objects with static storage duration shall be initialized (set to their initial values) before program startup. The manner and timing of such initialization are otherwise unspecified.

"Which section of memory it's layout?"

The Standard, however, does not specify any particular memory layout that an implementation must follow.

What the Standard does say about the arrays of char which are created from string literals is that (§6.4.5 ¶7):

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
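
As a rough illustration of the three kinds of storage in the question (only a sketch; `copies_are_independent` is a made-up helper name, not anything from the Standard): the literal's array has static storage duration, `arr` is an automatic array holding its own copy, and `heap` is an allocated copy. In the assembly from the question, the literals are emitted in the `__TEXT,__cstring` section and the bytes for `p3` are copied from there into the stack frame, but that is a detail of that compiler and platform, not something the Standard promises.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if writing to an array copy and to a heap
   copy leaves the string literal untouched, 0 if not, -1 if malloc fails. */
int copies_are_independent(void)
{
    const char *lit = "dddddd";           /* points at the literal's static array; never written */
    char arr[] = "dddddd";                /* automatic array initialized with a copy of the literal */
    char *heap = malloc(sizeof "dddddd"); /* 7 bytes: six characters plus '\0' */

    if (heap == NULL)
        return -1;
    strcpy(heap, "dddddd");

    arr[2] = 'b';   /* fine: arr is a modifiable array */
    heap[2] = 'k';  /* fine: allocated memory is modifiable */

    int ok = strcmp(lit, "dddddd") == 0
          && strcmp(arr, "ddbddd") == 0
          && strcmp(heap, "ddkddd") == 0;

    free(heap);
    return ok;
}
```

Calling `copies_are_independent()` from `main` should return 1 on a conforming implementation, since neither write touches the literal itself.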

ad absurdum

The declaration

char * p1 = "dddddd";

should be

const char * p1 = "dddddd";

String literals (the ones in quotes) typically reside in read-only memory, and modifying them is undefined behavior in any case. Even if you don't use the const keyword in the declaration of the variable, p1 still points to memory you must not write to. So

*(p1+2) = 'b'; // test_1

compiles, but is likely to fail at run time, as the bus error shows.

Here

*(p2+2) = 'b'; // test_2
// Output >> char_point.c:11:13: error: read-only variable is not assignable

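the compiler tells you that you cannot do that, because you declared p2 as a pointer to const. The difference between the first test and this one is when the failure happens: the first test compiles and fails at run time when it tries to modify a character, while this one is rejected at compile time.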
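
If the goal is a modified version of a literal, one option is to copy it into writable memory first. A minimal sketch, assuming a helper of my own invention called `copy_with_char` (it is not part of any library):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src with the
   character at index idx replaced by c. Returns NULL if idx is out of
   range or if malloc fails. The caller must free() the result. */
char *copy_with_char(const char *src, size_t idx, char c)
{
    assert(src != NULL);              /* precondition: src is a valid C string */

    size_t len = strlen(src);
    if (idx >= len)
        return NULL;

    char *copy = malloc(len + 1);     /* +1 for the terminating '\0' */
    if (copy == NULL)
        return NULL;

    memcpy(copy, src, len + 1);       /* copies the '\0' as well */
    copy[idx] = c;                    /* writes to the copy, never to src */
    return copy;
}
```

With this, `copy_with_char("dddddd", 2, 'b')` gives back `"ddbddd"` without ever writing to the literal, and taking `const char *` as the parameter lets the compiler catch accidental writes through `src`.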

Now this:

char * p4 =(char*)malloc(sizeof("dddddd")+1);

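First, do not cast malloc & friends. Second: the sizeof operator yields the number of bytes needed to store its operand. "dddddd" is a string literal, which has the array type char[7] (six characters plus the terminating '\0'), and an array used as the operand of sizeof does not decay to a pointer, so sizeof("dddddd") is 7. With the +1 you allocate 8 bytes, one more than needed, which matches the movl $8 before the call to _malloc in your assembly. That is harmless, but probably not what you meant.

If you want the length of the string, the function to use is strlen: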

char * p4 = malloc(strlen("dddddd")+1);

Note that in this case

char txt[] = "Hello world";
printf("%zu\n", sizeof(txt));

will print 12 and not 11. C strings are '\0'-terminated, which means that txt holds all 11 characters plus the terminating '\0' byte. Here sizeof yields the size of the whole array, not the size of a pointer, because txt is an array.

void foo(char *txt)
{
    printf("%zu\n", sizeof(txt));
}

void bar(void)
{
    char txt[] = "Hello world";
    foo(txt);
}

Here you won't get 12 like before, most probably 8 (today's common size for a pointer). Even though txt in bar is an array, the txt in foo is a pointer.
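
To check these sizes on your own machine, something like the following can be used. It is just a sketch; the function names are made up for illustration. A string literal is an array, so `sizeof` applied to it counts the characters plus the terminating '\0', while a pointer parameter only ever reports the size of the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size of the literal's array: six characters plus the terminating '\0'. */
size_t literal_size(void)   { return sizeof "dddddd"; }

/* Length of the string: the terminating '\0' is not counted. */
size_t literal_length(void) { return strlen("dddddd"); }

/* Inside a function, txt is a pointer, so this is sizeof(char *). */
size_t size_through_pointer(char *txt) { return sizeof txt; }

/* Hypothetical self-check that ties the cases together. */
void check_sizes(void)
{
    char txt[] = "Hello world";       /* 11 characters plus '\0' */

    assert(sizeof txt == 12);
    assert(literal_size() == 7);
    assert(literal_length() == 6);
    assert(size_through_pointer(txt) == sizeof(char *));
}
```

Calling `check_sizes()` from `main` should pass silently; the pointer size itself depends on the platform, which is why the last check compares against `sizeof(char *)` instead of a fixed number.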

Pablo

An array is not a pointer, but an array name behaves a bit like a constant one: the array sits at a fixed address and you can't make its name refer to anything else. You can change the elements in it.

A pointer, by contrast, can be changed to point somewhere else. Whether you may change what it points to depends on the target: the elements of an ordinary array are modifiable, while a string literal must not be modified.

For example, consider this code:

int main(void){
    int a[] = {1,2,3};
    int *ptr = a; // ptr points to the first element of a

    // a[0] == *(a+0)
    // a[1] == *(a+1)

    // a += 1; // error if uncommented: an array name cannot be assigned to
    ptr += 1; // OK: ptr now points to the next element, which is 2

    a[0] += 2; // OK: a[0] becomes 3
    *ptr += 2; // OK: ptr points into a modifiable array, so a[1] becomes 4
    return 0;
}
Hamza Jadid