
I was reading about how array allocation behaves in C. I know in detail that the storage for a simple array like int array[10] is laid out at compile time. I'm trying to disassemble a simple program to see what happens behind the scenes when I allocate an array (inside a scope like main) whose size is given by another variable that is read some time before the array declaration. I'm using code like this:

#include <stdio.h>

int main(){
  int n; scanf("%d", &n);
  int arr[n], x[10];
  return 0;
}

Here, the array x is only there so I can see the difference in the .s file. I'm on OS X, running gcc -c test.c to get the file test.o and then objdump -d test.o to see the disassembly. Here is the output:

teste.o:    file format Mach-O 64-bit x86-64

Disassembly of section __TEXT,__text:
_main:
       0:   55  pushq   %rbp
       1:   48 89 e5    movq    %rsp, %rbp
       4:   48 83 ec 60     subq    $96, %rsp
       8:   48 8d 3d 88 00 00 00    leaq    136(%rip), %rdi
       f:   48 8d 75 c8     leaq    -56(%rbp), %rsi
      13:   48 8b 05 00 00 00 00    movq    (%rip), %rax
      1a:   48 8b 00    movq    (%rax), %rax
      1d:   48 89 45 f8     movq    %rax, -8(%rbp)
      21:   c7 45 cc 00 00 00 00    movl    $0, -52(%rbp)
      28:   b0 00   movb    $0, %al
      2a:   e8 00 00 00 00  callq   0 <_main+0x2F>
      2f:   b9 04 00 00 00  movl    $4, %ecx
      34:   89 cf   movl    %ecx, %edi
      36:   48 89 e6    movq    %rsp, %rsi
      39:   48 89 75 c0     movq    %rsi, -64(%rbp)
      3d:   89 45 b4    movl    %eax, -76(%rbp)
      40:   e8 00 00 00 00  callq   0 <_main+0x45>
      45:   48 8d 3d 4e 00 00 00    leaq    78(%rip), %rdi
      4c:   be 05 00 00 00  movl    $5, %esi
      51:   48 89 45 b8     movq    %rax, -72(%rbp)
      55:   b0 00   movb    $0, %al
      57:   e8 00 00 00 00  callq   0 <_main+0x5C>
      5c:   c7 45 cc 00 00 00 00    movl    $0, -52(%rbp)
      63:   48 8b 7d c0     movq    -64(%rbp), %rdi
      67:   48 89 fc    movq    %rdi, %rsp
      6a:   8b 4d cc    movl    -52(%rbp), %ecx
      6d:   48 8b 3d 00 00 00 00    movq    (%rip), %rdi
      74:   48 8b 3f    movq    (%rdi), %rdi
      77:   48 8b 55 f8     movq    -8(%rbp), %rdx
      7b:   48 39 d7    cmpq    %rdx, %rdi
      7e:   89 45 b0    movl    %eax, -80(%rbp)
      81:   89 4d ac    movl    %ecx, -84(%rbp)
      84:   0f 85 08 00 00 00   jne 8 <_main+0x92>
      8a:   8b 45 ac    movl    -84(%rbp), %eax
      8d:   48 89 ec    movq    %rbp, %rsp
      90:   5d  popq    %rbp
      91:   c3  retq
      92:   e8 00 00 00 00  callq   0 <_main+0x97>

Is there any evidence in this output that shows me what happens with the variable arr[n]? I don't know how to read it, despite having some notion of what is written.

Some of my references to this question: What does “Memory allocated at compile time” really mean?, Array allocation in compiler, Using GCC to produce readable assembly?, Array[n] vs Array[10] - Initializing array with variable vs real number

José Joaquim
  • We are not a "explain my code" site. Get the instruction set reference and have fun! – too honest for this site May 23 '17 at 13:41
  • Always compile with optimizations enabled. Without optimizations, the code emitted by the compiler is full of useless junk that hinders understanding. Also, it's “code,” not “a code.” Code is not countable. – fuz May 23 '17 at 13:44
  • The code was just to make my question clearer, since I don't even know how to ask it properly. I added some of the references that I read before asking. I don't want an explanation of the code, but an explanation of what happens, at compile time, in the situation I described. – José Joaquim May 23 '17 at 13:45
  • AFAIK, any memory allocation whose size is not fixed at compile time (in this case by a constant or a define giving the number of items) has to be handled with dynamic allocation (malloc or similar, using the heap). Even local variables allocated on the stack need known sizes, so the compiler knows how many addresses to skip (reserve) on the stack when the function is called. So I am curious to see how the compiler handles your code! EDIT: right, as Ped7g pointed out, it is just a matter of readjusting the stack offset to allocate more memory during execution. Learned something today! – Gustavo Laureano May 23 '17 at 13:52

1 Answer

   67:   48 89 fc    movq    %rdi, %rsp

This adjusts the stack pointer to move it further down (i.e. "allocate local stack memory by dynamic size"). (I just searched for a second rsp adjustment; I didn't bother to fully decipher the code, as the debug code makes me feel sick, but I'm 95% sure this is it.)
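
As a rough sketch of what "allocating by dynamic size" usually amounts to: the compiler computes the byte count at run time, rounds it up to the stack alignment, and subtracts the result from rsp. The helper below only reproduces that arithmetic. The name vla_stack_bytes and the 16-byte alignment (the usual x86-64 value) are assumptions for illustration, not something read off the listing above.

```c
#include <stddef.h>

/* Hypothetical helper: bytes a compiler might subtract from rsp for
   `int arr[n]`, assuming the stack is kept 16-byte aligned. */
size_t vla_stack_bytes(size_t n) {
    size_t raw = n * sizeof(int);        /* size known only at run time */
    return (raw + 15) & ~(size_t)15;     /* round up to a multiple of 16 */
}
```

With a 4-byte int, n = 10 needs 40 bytes of storage and would reserve 48 under this assumption.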

The local variables live in stack memory, so their "deallocation" is a simple restoration of the stack pointer, like here:

   8d:   48 89 ec    movq    %rbp, %rsp  ; *boom* all local memory released in single op.
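
To make the single-instruction release concrete, here is a toy model of the two registers in C. It is only a sketch: the Frame struct and the frame_* function names are made up for illustration, and saved_rbp stands in for the stack slot that pushq %rbp writes.

```c
#include <stdint.h>

/* Toy model of the registers involved; not real stack memory. */
typedef struct {
    uint64_t rsp;        /* stack pointer */
    uint64_t rbp;        /* frame pointer */
    uint64_t saved_rbp;  /* stands in for the slot written by pushq %rbp */
} Frame;

/* Function prologue: save the caller's rbp, then reserve fixed locals. */
void frame_enter(Frame *f, uint64_t fixed_locals) {
    f->rsp -= 8;                 /* pushq %rbp (stack grows downward) */
    f->saved_rbp = f->rbp;
    f->rbp = f->rsp;             /* movq %rsp, %rbp */
    f->rsp -= fixed_locals;      /* subq $96, %rsp */
}

/* Run-time sized allocation: just move rsp further down. */
void frame_alloc(Frame *f, uint64_t bytes) {
    f->rsp -= bytes;
}

/* Function epilogue: one assignment releases every local at once. */
void frame_leave(Frame *f) {
    f->rsp = f->rbp;             /* movq %rbp, %rsp */
    f->rbp = f->saved_rbp;       /* popq %rbp */
    f->rsp += 8;
}
```

However many bytes frame_alloc takes in between, frame_leave brings rsp and rbp back to the values they had before frame_enter.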

Actually, the much more interesting question in your case of a dynamic-size array is the rest of the generated code that works with the dynamic size. The local-memory allocation part is trivial compared to that. The main strength of fixed-size arrays is that the code calculating memory offsets can be optimized (especially for sizes that are a power of two). With a dynamic size, the code has to calculate everything in a generic, fully dynamic way.
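
A small C illustration of that difference, as a sketch only: the function names and the COLS constant are invented for this example, and whether a given compiler actually turns the fixed-size multiplication into a shift depends on the compiler and optimization level.

```c
#include <stddef.h>

/* Fixed row length, chosen as a power of two for the example. */
enum { COLS = 16 };

/* The row length is a compile-time constant, so the compiler is free
   to compute row * COLS with a shift instead of a multiply. */
int get_fixed(int table[][COLS], size_t row, size_t col) {
    return table[row][col];     /* offset: (row * 16 + col) * sizeof(int) */
}

/* The row length arrives at run time, so the offset needs a general
   multiplication by cols. */
int get_dynamic(size_t cols, int table[][cols], size_t row, size_t col) {
    return table[row][col];     /* offset: (row * cols + col) * sizeof(int) */
}

/* sizeof on a variable-length array is evaluated at run time. */
size_t vla_bytes(size_t n) {
    int arr[n];                 /* n must be greater than zero */
    return sizeof arr;          /* n * sizeof(int) */
}
```

Compiling both accessors with optimizations enabled and comparing the output is a quick way to see how much of the offset arithmetic survives in each case.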

Ped7g
  • Reading a bit more into that assembly, I'm actually in doubt whether that `movq %rdi, %rsp` is really what you are looking for, but as I wrote in the answer, that code is too inefficient to dig through without losing my good mood, and the principle used to implement this should match my answer. – Ped7g May 23 '17 at 13:53
  • There is no need to decipher the whole code; the phrase "allocate local stack memory by dynamic size" was perfect! I'm reading up right now on these adjustments (rsp, rdi, etc). One more thing: does this "deallocation" walk the stack position by position, freeing memory until it reaches the original position? – José Joaquim May 23 '17 at 13:58
  • @JoséJoaquim: No, it restores the original stack position in a single instruction. With this calling convention and the way this C is compiled, `rbp` works as the stack-frame pointer for the whole function lifetime, and it contains the original `rsp` from function entry (minus 8, as the caller's previous `rbp` is saved on the stack first). So just doing `rsp = rbp` and `pop rbp` is enough to restore everything (stack-wise, I mean). (Search for `rbp` in the code to see when it is set and how it is preserved and used; the pattern should be quite obvious, as it is set only at the beginning of the call.) – Ped7g May 23 '17 at 14:00
  • Yeah! I can see the meaning of the pattern now. The 'minus' thing and the deallocation are very clear to me! I also found a question, [Why are rpb and rsp called general purpose registers?](https://stackoverflow.com/questions/36529449/why-are-rpb-and-rsp-called-general-purpose-registers), which helped me understand the use of these registers. Just for the record, in case anyone is curious. – José Joaquim May 23 '17 at 14:21