I have a question about how computer memory works. I have tried to find the answer on my own, but I cannot figure out exactly what is going on. Imagine that a person declares a pointer to a future string, but the initialization comes a bit later:

char *str; 

After that, he or she wants to declare another variable.

char her;  

Right after that, both variables are initialized and their addresses and values are printed to STDOUT. The whole program looks like this:

#include <stdio.h>

int     main(void)
{
    char *str;
    char her;

    her = 'Y';
    str = "HelloMuraMana";
    printf("%p\n", (void *)&str);
    printf("%p\n", (void *)&(str[1]));
    printf("%p\n", (void *)&(str[2]));
    printf("%p\n", (void *)&(str[3]));
    printf("%p\n", (void *)&(str[4]));
    printf("%p\n", (void *)&her);
    return (0);
}    

Now, my question: how did the computer allocate memory for both variables (especially the string characters)? Here is a picture of what my macOS machine showed me as a result:

[image: program output on macOS]

edit:
I am specifically interested in how the memory works here. Also, please note that str[0] has one address, while str[1], str[2], str[3], and str[4] have other addresses that are not contiguous with that of the first element.

kjioyh
  • Way too broad. Ask the compiler/linker to generate a map file for you and take a look. – Eugene Sh. Jul 17 '17 at 16:02
  • This might be helpful: [What and where are the stack and heap?](https://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap?rq=1) –  Jul 17 '17 at 16:04
  • Don't post links to images - include the results in the text of your question. Details of memory allocation vary with architecture and compiler; you'll really need to narrow down your question to get a useful answer. – John Bode Jul 17 '17 at 16:05
  • Read about [linkers](https://en.wikipedia.org/wiki/Linker_(computing)) – Basile Starynkevitch Jul 17 '17 at 16:21
  • "Separately declared and initialized variables" don't exist. Variables can only be initialized upon declaration. Any other statement is not an initialization. – Andrew Henle Jul 17 '17 at 16:39

3 Answers

Conceptually when a program is loaded in memory it has 3 areas (segments):

  • code segment: the text of your program is stored here (it is a read-only area)
  • data segment: contains any global or static variables which have a pre-defined value and can be modified
  • stack segment: a stack frame is pushed here for every function call; it holds the return address of the call and the function's local variables.

In your case the char her variable is a local variable of the main() function and is assigned the value 'Y'. Thus, it is stored on the stack and can be modified.

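The char *str variable is itself also a local variable on the stack, but its value is the address of "HelloMuraMana\0", a constant that is typically placed in read-only memory (the code segment or a read-only data section next to it, depending on the platform). Because that area is read-only, you must not modify its contents; attempting to do so is undefined behavior.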
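As a minimal sketch of the difference, assuming you want a modifiable version of the text: initializing a char array from the literal gives you a writable copy with automatic storage, while the literal itself stays untouched. The function name make_modified_copy is made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: builds a writable copy of the literal, edits it,
   and writes the result into the caller's buffer. */
void make_modified_copy(char *dst, size_t dst_size)
{
    const char *str = "HelloMuraMana"; /* pointer to the literal: treat as read-only */
    char copy[] = "HelloMuraMana";     /* array initialized from the literal: writable */

    (void)str;        /* only here to contrast the two declarations */
    copy[0] = 'J';    /* fine: copy is ordinary automatic storage */
    /* str[0] = 'J';     rejected by the compiler because of const;
                         without const it would be undefined behavior */
    snprintf(dst, dst_size, "%s", copy);
}
```

Calling it with a 32-byte buffer leaves "JelloMuraMana" in the buffer.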

Like I said in my comment, the details of how things are materialized in memory will vary based on platform. But, here's a very high-level view that should give some flavor of how things work.

First, in the context of the C programming language, objects can have one of several storage durations: static, automatic, allocated, and thread local. Objects with static storage duration are allocated as soon as the program starts and released when the program exits. Objects with automatic storage duration are allocated upon entry to their enclosing scope (function or block), and are released as soon as that scope exits. Objects with allocated storage duration are allocated with a call to malloc/calloc/realloc, and are released with a call to free. I'm not going to get into thread local because it's not really relevant to this discussion.
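Here's a small sketch of the first three storage durations side by side. The names call_count, count_calls, and duplicate_string are invented for this example and don't come from your program.

```c
#include <stdlib.h>
#include <string.h>

static int call_count;  /* static duration: exists for the whole run, starts at 0 */

int count_calls(void)
{
    int step = 1;       /* automatic duration: created on entry, gone on return */

    call_count += step;
    return call_count;
}

char *duplicate_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);   /* allocated duration: lives until free() */

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;                /* the caller is responsible for free() */
}
```

The counter keeps its value between calls because it is not tied to any stack frame, whereas step is recreated on every call. The buffer returned by duplicate_string outlives the function that created it and stays valid until the caller frees it.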

When your program is loaded into memory, it's laid out something like this (assuming x86 or similar):

              +------------------------+
high address  | Command line arguments |   
              | and environment vars   |  
              +------------------------+
              |         stack          |  <-- str and her live here, but
              | - - - - - - - - - - -  |      only for the duration of main()
              |           |            |
              |           V            |
              |                        |
              |           ^            |
              |           |            |
              | - - - - - - - - - - -  |
              |          heap          |
              +------------------------+
              |    global and read-    | <-- "HelloMuraMana" lives here for 
              |       only data        |     duration of the program
              +------------------------+
              |     program text       |
 low address  |    (machine code)      |
              +------------------------+   

The exact picture will depend on your system. Note that this is how things look in the virtual address space, not physical memory.

As your program runs, storage for auto variables (function arguments and variables local to a function or block like her and str) is allocated from the region labeled stack. Storage for allocated objects is allocated from the region labeled heap.

Storage for static objects as well as storage for string literals like "HelloMuraMana" is taken from a different segment; in the picture above, it'll be the segment labeled global and read-only data. Depending on your system, string literals may be stored in a read-only segment (such as .rodata or .rdata) or they may be in a writable segment. String literals are supposed to be immutable (hence the term "literal"), so attempting to modify a string literal will result in undefined behavior.

In the layout above, global data objects will have lower addresses than stack or heap objects, which is shown by your output. The variable str is allocated from the stack when you enter main; its value is the address of the string literal, which is allocated from the global data segment when the program first starts.
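To make the distinction between the pointer variable and the array it points to concrete, here's a sketch that prints both. The helper names offset_of_element and print_addresses are made up for this example, and the actual addresses will differ between systems and runs.

```c
#include <stddef.h>
#include <stdio.h>

/* Distance in bytes from the first character to character i. */
ptrdiff_t offset_of_element(const char *str, size_t i)
{
    return &str[i] - &str[0];
}

void print_addresses(void)
{
    const char *str = "HelloMuraMana";
    char her = 'Y';

    printf("&str    = %p  (the pointer variable, on the stack)\n", (void *)&str);
    printf("&her    = %p  (also on the stack)\n", (void *)&her);
    printf("&str[0] = %p  (first character of the literal)\n", (void *)&str[0]);
    printf("&str[1] = %p  (one byte after &str[0])\n", (void *)&str[1]);
}
```

On a typical system &str and &her come out close together, while &str[0] and &str[1] sit in a different region and differ by exactly one byte. The characters of the array are always contiguous; only the pointer variable lives elsewhere.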

John Bode
  • Thank you for your very detailed answer. However, my main question still remains: how is it possible for the first element of the array to have an address that is not contiguous with the rest of the array (being allocated on the stack)? I thought that the whole array should be located in contiguous cells of memory. – kjioyh Jul 21 '17 at 15:51
  • @kjioyh: You didn't print the address of the first element of the array (`&str[0]`) - you printed the address of the *pointer variable* `str`, which is a different object from the array. – John Bode Jul 21 '17 at 15:54

You should pick a slightly simpler example:

#include <stdio.h>

int     main(void)
{
    char *str;
    char her;

    her = 'Y';
    str = "HelloMuraMana";
    puts(str);
    putchar(her);
    return (0);
}    

and compile it into assembly and inspect that:

.LC0:
        .string "HelloMuraMana"
main:
        pushq   %rbp                     
        movq    %rsp, %rbp               //frame setup
        subq    $16, %rsp                //reserve 16 bytes (2 * 8-byte slots) for the 2 variables
        movb    $89, -1(%rbp)            //initialize her
        movq    $.LC0, -16(%rbp)         //initialize str
        movq    -16(%rbp), %rax          //prepare the argument to puts
        movq    %rax, %rdi
        call    puts                     //self-descriptive
        movsbl  -1(%rbp), %eax           //prepare the argument to putchar
        movl    %eax, %edi
        call    putchar                  //self-descriptive
        movl    $0, %eax                 //prepare the main return value
        leave
        ret

Now, as you can see from the generated assembly, the two stack variables are allocated by a stack pointer subtraction (subq $16, %rsp) (because stacks usually grow downwards). The string literal char array is an object with static storage duration, which goes into a segment that the program loader will allocate when the program loads.

Petr Skocik