Ok, this is gonna be a long question. I'm trying to understand how "buffer overflow" works. I am reading Smashing the stack for fun and profit by aleph1 and have just got the disassembly of the following code:

void function(int a, int b, int c) {
   char buffer1[5];
   char buffer2[10];
}

void main() {
  function(1,2,3);
}

The disassembly using the -S flag of GCC gives me:

    .file   "example1.c"
    .text
    .globl  function
    .type   function, @function
function:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $48, %rsp
    movl    %edi, -36(%rbp)
    movl    %esi, -40(%rbp)
    movl    %edx, -44(%rbp)
    movq    %fs:40, %rax
    movq    %rax, -8(%rbp)
    xorl    %eax, %eax
    movq    -8(%rbp), %rax
    xorq    %fs:40, %rax
    je  .L2
    call    __stack_chk_fail
.L2:
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   function, .-function
    .globl  main
    .type   main, @function
main:
.LFB1:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    $3, %edx
    movl    $2, %esi
    movl    $1, %edi
    call    function
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE1:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.8.2-19ubuntu1) 4.8.2"
    .section    .note.GNU-stack,"",@progbits

The .cfi directives are not in the paper by Aleph1, and I guess they were not used back then. I have read this question on SO and gather that GCC uses them for exception handling. I have also read another question on SO and understand that .LFB0, .LFE0, .LFE1 and .LFB1 are labels. However, I have the following doubts:

  1. I get that .cfi directives are used for exception handling however I don't understand what they mean. I have been here and I see some definitions like:

.cfi_def_cfa register, offset

.cfi_def_cfa defines a rule for computing CFA as: take address from register and add offset to it.

However, if you take a look at the disassembly that I have put above you don't find any register name (like EAX, EBX and so on) instead you find a number there (I have generally found '6') and I don't know how's that supposed to be a register. Especially, can anyone explain what .cfi_def_cfa_offset 16, .cfi_offset 6, -16, .cfi_def_cfa_register 6 and .cfi_def_cfa 7, 8 mean? Also, what does CFA mean? I am asking this because mostly in books/papers the procedure prolog is like :

 pushl %ebp
 movl %esp,%ebp
 subl $20,%esp

However, now I think the procedure prolog in modern computers is as follows:

    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $48, %rsp

Initially I thought that the CFI directives are used instead of sub mnemonic to set the offset but that's not the case; the sub command is still being used in spite of using the CFI directives.

  2. I understand that there are labels for each procedure. However, why are there multiple nested labels inside a procedure? In my case main has the .LFB1 and .LFE1 labels. What is the need for multiple labels? Similarly, the function procedure has the labels .LFB0, .L2 and .LFE0.

  3. The last 3 lines of both procedures seem to be used for some housekeeping (telling the size of the procedure, maybe?), but I am not sure what they mean. Can anyone explain what they mean and what their use is?

EDIT:

(adding one more question)

  4. Do the CFI directives take up any space? In the procedure "function", each int parameter takes up 4 bytes and there are 3 of them, so the parameters take 12 bytes in memory. Next, the first char array takes 8 bytes (5 bytes rounded up to 8), and the second char array takes 12 bytes (10 bytes rounded up to 12), so the char arrays take 20 bytes in total. Summing these, the parameters and local variables need only 12+20=32 bytes.

    But in the procedure "function", the compiler subtracts 48 bytes to store values. Why?

Pervy Sage
  • Those are macros that GAS recognizes and uses to emit unwind-related data. The Itanium ABI for C++ describes what that entails (GCC implements that ABI). The unwind data is stored somewhere else (look for `eh`-frame sections in the object file); it is located essentially by an associative lookup using the instruction pointer as the key. – Kerrek SB Jun 27 '14 at 23:38
  • Do you get different results if you compile as C vs. as C++? Also, never ever `void main()`. It's `int main()`. – aschepler Jun 27 '14 at 23:38
  • This page explains using cfi directives for Dwarf-2 debugging info, which avoids a frame pointer - http://www.logix.cz/michal/devel/gas-cfi/ – Rob11311 Jun 27 '14 at 23:49
  • You can compile without unwind tables to get rid of them (use `-fno-asynchronous-unwind-tables`). That said, they don't affect the generated machine code in any way, and they definitely won't "replace" any `sub` instructions. You can safely ignore them. – Jester Jun 28 '14 at 00:01
  • Related: [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) re: filtering these and other directives – Peter Cordes Oct 19 '22 at 07:30

3 Answers

CFI stands for call frame information. It's the way the compiler describes what happens to the stack frame in a function. It can be used by the debugger to present a call stack, by the linker to synthesise exception tables, for stack depth analysis and other things like that.

Effectively, it describes where resources such as processor registers are stored and where the return address is.

CFA stands for canonical frame address (often loosely called the call frame address): the value the stack pointer had in the calling function just before the call instruction executed. This is needed to pick up information about the next frame on the stack.
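
To make the numbers in the question's directives concrete: on x86-64 they are DWARF register numbers from the System V ABI, where 6 is %rbp and 7 is %rsp. The sketch below is only a toy model of how an unwinder might track the "CFA = register + offset" rule through the prologue; the struct and function names are made up for illustration and are not part of GAS or GCC.

```c
#include <assert.h>

/* DWARF register numbers for x86-64 (System V ABI);
   16 is the return-address column. */
const char *dwarf_reg_name(int n)
{
    static const char *const names[] = {
        "rax", "rdx", "rcx", "rbx", "rsi", "rdi", "rbp", "rsp",
        "r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15", "RA"
    };
    return (n >= 0 && n <= 16) ? names[n] : "?";
}

/* Hypothetical model of the current rule: CFA = value_of(reg) + offset. */
struct cfa_rule {
    int reg;
    int offset;
};

/* .cfi_def_cfa reg, offset: replace both parts of the rule. */
struct cfa_rule cfi_def_cfa(int reg, int offset)
{
    struct cfa_rule r = { reg, offset };
    assert(reg >= 0 && reg <= 16);
    return r;
}

/* .cfi_def_cfa_offset offset: keep the register, change the offset. */
struct cfa_rule cfi_def_cfa_offset(struct cfa_rule r, int offset)
{
    r.offset = offset;
    return r;
}

/* .cfi_def_cfa_register reg: keep the offset, change the register. */
struct cfa_rule cfi_def_cfa_register(struct cfa_rule r, int reg)
{
    assert(reg >= 0 && reg <= 16);
    r.reg = reg;
    return r;
}

/* Replays the prologue directives of `function` from the question. */
struct cfa_rule rule_after_prologue(void)
{
    /* At .cfi_startproc only the return address is on the stack:
       CFA = rsp + 8. */
    struct cfa_rule r = cfi_def_cfa(7, 8);
    /* pushq %rbp moved rsp down by 8, so now CFA = rsp + 16. */
    r = cfi_def_cfa_offset(r, 16);
    /* .cfi_offset 6, -16 records "the caller's rbp is saved at CFA - 16"
       (a register-save rule, not modeled here). */
    /* movq %rsp, %rbp: from here on the CFA is computed from rbp,
       CFA = rbp + 16. */
    r = cfi_def_cfa_register(r, 6);
    return r;
}
```

Calling rule_after_prologue() from a small main yields register 6 (rbp) with offset 16. The `.cfi_def_cfa 7, 8` after `leave` corresponds to cfi_def_cfa(7, 8): once rbp has been popped, the rule is back to CFA = rsp + 8.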

Lindydancer
  • Ok. But why are the .cfi_def_cfa_offset, .cfi_offset and .cfi_def_cfa_register the same in both procedures? – Pervy Sage Jun 28 '14 at 21:46
  • Also do the CFI directives take up any space? Because in the procedure "function" the 3 ints take up 4 bytes each and the char array should take 20 bytes which adds up to 12+20 = 32 bytes. So why subtract 48? – Pervy Sage Jun 28 '14 at 22:15
  • The cfi directives do not take up any space by themselves; they only provide meta information. However, if the linker synthesises exception tables, those will take up space, although they will be placed in a different location than the actual code. – Lindydancer Jun 29 '14 at 05:19
  • Ok. if you know about it, can you please explain why subtract 48 then? – Pervy Sage Jun 29 '14 at 05:46
  • Sorry, I can't. I've worked with many architectures over the past years, but I have never worked with x86. It comes down to how the char arrays are aligned and padded in memory, whether the function needs any temporary variables (e.g. for the stack check code) etc. Maybe you can post it as a separate SO question? – Lindydancer Jun 29 '14 at 06:02

Lindydancer answered what CFI and CFA mean (call frame information and canonical frame address).

.L<num> denotes a local label. As per various tidbits on Google, GCC on x64 names its compiler-generated labels so that they start with .L and end with a number, so .L1, .L2 and so on are all labels.

According to Google and some earlier SO answers, .LFB<num> marks the beginning of a function and .LFE<num> marks its end.

So .LFB0, .LFB1, ... and .LFE0, .LFE1, ...

denote where each function begins and ends, which the compiler probably requires for its own internal needs. You can forget about them for now unless you really need to dig into compiler internals.

The other label, .L2, exists as the target of the branch instruction je in your function:

je  .L2
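
That je/call pair is the tail of GCC's stack-protector check (the %fs:40 loads in the listing). A rough C model of the check is sketched below; struct frame_model and canary_survives are hypothetical names, and the real guard value lives in thread-local storage rather than being passed in as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a protected frame: the local buffer sits at the
   lower addresses and the canary sits right above it. */
struct frame_model {
    char buffer[16];
    unsigned long canary;
};

/* Returns 1 if the canary is intact after the copy (the `je .L2` path),
   0 if it was clobbered (the `call __stack_chk_fail` path). */
int canary_survives(const char *input, size_t len, unsigned long guard)
{
    struct frame_model f;

    assert(input != NULL);
    f.canary = guard;                 /* movq %fs:40, %rax ; movq %rax, -8(%rbp) */

    if (len > sizeof f)               /* keep the demo itself inside the struct */
        len = sizeof f;
    memcpy((char *)&f, input, len);   /* copy that ignores the size of buffer */

    return (f.canary ^ guard) == 0;   /* xorq %fs:40, %rax ; je .L2 */
}
```

With a 16-byte input the canary is untouched and the function returns 1; with an input that fills the whole struct the canary is overwritten and it returns 0, which is the case where the real code would call __stack_chk_fail.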

Also, every compiler aligns and pads arguments and locals to a certain boundary.

I can't be sure, but I think the default stack alignment for GCC on x64 is 16 bytes, so if you request an odd-sized reservation like

char foo[5] or
BYTE blah [10]

the sizes 5 and 10 are not aligned, even on x86.

For 5 an x86 compiler will probably assign 8 bytes, and for 10, 16 bytes.

Likewise, x64 GCC might assign 16 bytes for each of your requests.

You shouldn't worry too much about why the compiler does what it does.

When you are trying to understand the logic of assembly, just concentrate on addresses.

If the compiler decides to put x at rbp +/- X, it will access it at that same location throughout the scope or lifetime of that variable.

miken32
blabb
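The 48 is to skip over the spilled arguments, the locals and the stack-protector canary, plus alignment padding. In the listing the three int arguments are stored in 4-byte slots at -36(%rbp), -40(%rbp) and -44(%rbp) (not 8 bytes each), and the canary occupies the 8 bytes at -8(%rbp); the two arrays sit in between. The deepest offset in use is therefore 44, and GCC rounds the frame up to a multiple of 16 so that %rsp stays 16-byte aligned, which gives 48. You can see the layout in gdb just by asking for the address of each of those things.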
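
As a rough check of the arithmetic against the listing in the question: the deepest slot the function touches is -44(%rbp), and after `pushq %rbp` the stack pointer is 16-byte aligned, so subtracting a multiple of 16 keeps it aligned for any later call. The helper names below are hypothetical and only illustrate the rounding; they are not how GCC computes frame layouts internally.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Frame size for a function whose deepest local slot is at
   -deepest_offset(%rbp), assuming a 16-byte stack alignment requirement. */
size_t frame_size(size_t deepest_offset)
{
    return round_up(deepest_offset, 16);
}
```

Here frame_size(44) gives 48, matching the `subq $48, %rsp` in the listing, whereas the 32 bytes estimated in the question would not reach the argument slots at all.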

phorgan1