I am trying to examine the use of data and text segments in memory via a simple program, named source1.cpp:

int main()
{
    const char* b="Hello everyone!";
    int a=100;
    return 0;
}

I generated the assembly by issuing gcc -S source1.cpp, and here is the output:

    .file   "source1.cpp"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $48, %rsp
    movq    %fs:40, %rax
    movq    %rax, -8(%rbp)
    xorl    %eax, %eax
    movabsq $8531260732055774536, %rax
    movq    %rax, -32(%rbp)
    movabsq $9400199222489701, %rax
    movq    %rax, -24(%rbp)
    movl    $100, -36(%rbp)
    movl    $0, %eax
    movq    -8(%rbp), %rdx
    xorq    %fs:40, %rdx
    je  .L3
    call    __stack_chk_fail
.L3:
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.2) 5.4.0 20160609"
    .section    .note.GNU-stack,"",@progbits

Could anyone tell me how to identify the text and data segments in this output, or point me to documentation that would help?

boxofchalk1
  • 2
    There is only a `.text` section in this code. You can ignore all the lines starting with `.cfi` and the last 3 lines. `Hello everyone!` is placed onto the stack with `movabsq $8531260732055774536, %rax; movq %rax, -32(%rbp); movabsq $9400199222489701, %rax; movq %rax, -24(%rbp)`. You may wish to review the optimized code (compile with `-O3`) – Michael Petch Nov 28 '16 at 04:57
  • 2
    @MichaelPetch I think I have a fundamental misunderstanding about this...I had thought `Hello everyone!` and `100` would be placed in the data segment, as both are *initialized data*. Secondly, if the assembly code never contains the data segment, where does the loader find the data segment to run the program? – boxofchalk1 Nov 28 '16 at 05:04
  • 1
    Constant strings often go into the `.rodata` section, but it can be placed on the stack since it is a local variable - that is up to the compiler. `int a=100;` inside the function is a local variable so will be on the stack. – Michael Petch Nov 28 '16 at 05:11
  • 1
    @MichaelPetch Thanks! I got the `.rodata` section when I made the string much longer. Out of curiosity, are the numbers `8531260732055774536` and `9400199222489701` ASCII-encoded? I found that the first operand of `movabsq` is an immediate. – boxofchalk1 Nov 28 '16 at 05:34
  • 1
    Correct, they are ASCII encoded, and because the CPU is little-endian the characters appear in reverse order within each value. Convert those values to hex and it becomes clearer. – Michael Petch Nov 28 '16 at 05:35
  • 3
    If you had enabled optimization (and looked at a function that returned a pointer to a string literal so it wouldn't optimize away), you would have seen gcc put that string in `.rodata`. Note that the `.text` section and `.rodata` section both go in the text *segment* after linking: http://stackoverflow.com/questions/14361248/whats-the-difference-of-section-and-segment-in-elf-file-format – Peter Cordes Nov 28 '16 at 07:29
