Clang's ASM output vs GCC's

Question

(I know almost nothing about assembly language yet.)

I'm trying to follow this tutorial.

The problem is that his compiler and my test setup (gcc on 32-bit Linux) produce completely different, and significantly shorter, output than my main setup (clang on 64-bit OS X).

Here are my outputs for int main() {}

gcc on Linux 32 bit

$ cat blank.c
int main() {}
$ gcc -S blank.c              
$ cat blank.s
    .file   "blank.c"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    popl    %ebp
    .cfi_def_cfa 4, 4
    .cfi_restore 5
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
    .section    .note.GNU-stack,"",@progbits

clang on Mac OSX 64 bit

$ cat blank.c
int main() {}
$ clang -S blank.c
$ cat blank.s
    .section    __TEXT,__text,regular,pure_instructions
    .globl  _main
    .align  4, 0x90
_main:                                  ## @main
Leh_func_begin0:
## BB#0:
    pushq   %rbp
Ltmp0:
    movq    %rsp, %rbp
Ltmp1:
    movl    $0, %eax
    popq    %rbp
    ret
Leh_func_end0:

    .section    __TEXT,__eh_frame,coalesced,no_toc+strip_static_syms+live_support
EH_frame0:
Lsection_eh_frame0:
Leh_frame_common0:
Lset0 = Leh_frame_common_end0-Leh_frame_common_begin0 ## Length of Common Information Entry
    .long   Lset0
Leh_frame_common_begin0:
    .long   0                       ## CIE Identifier Tag
    .byte   1                       ## DW_CIE_VERSION
    .asciz   "zR"                   ## CIE Augmentation
    .byte   1                       ## CIE Code Alignment Factor
    .byte   120                     ## CIE Data Alignment Factor
    .byte   16                      ## CIE Return Address Column
    .byte   1                       ## Augmentation Size
    .byte   16                      ## FDE Encoding = pcrel
    .byte   12                      ## DW_CFA_def_cfa
    .byte   7                       ## Register
    .byte   8                       ## Offset
    .byte   144                     ## DW_CFA_offset + Reg (16)
    .byte   1                       ## Offset
    .align  3
Leh_frame_common_end0:
    .globl  _main.eh
_main.eh:
Lset1 = Leh_frame_end0-Leh_frame_begin0 ## Length of Frame Information Entry
    .long   Lset1
Leh_frame_begin0:
Lset2 = Leh_frame_begin0-Leh_frame_common0 ## FDE CIE offset
    .long   Lset2
Ltmp2:                                  ## FDE initial location
Ltmp3 = Leh_func_begin0-Ltmp2
    .quad   Ltmp3
Lset3 = Leh_func_end0-Leh_func_begin0   ## FDE address range
    .quad   Lset3
    .byte   0                       ## Augmentation size
    .byte   4                       ## DW_CFA_advance_loc4
Lset4 = Ltmp0-Leh_func_begin0
    .long   Lset4
    .byte   14                      ## DW_CFA_def_cfa_offset
    .byte   16                      ## Offset
    .byte   134                     ## DW_CFA_offset + Reg (6)
    .byte   2                       ## Offset
    .byte   4                       ## DW_CFA_advance_loc4
Lset5 = Ltmp1-Ltmp0
    .long   Lset5
    .byte   13                      ## DW_CFA_def_cfa_register
    .byte   6                       ## Register
    .align  3
Leh_frame_end0:


.subsections_via_symbols

Is it possible to generate similar assembly output on my Mac so I can follow the tutorial, or is this assembly code platform-specific? If it is, which clang flags can I use to generate less verbose, less boilerplate-heavy code?

Note: there is no reason to expect two different compilers to produce the same output. That applies not just to gcc vs clang but also from one version of a compiler to the next, so the version used in the tutorial may differ from the version on your computer. – old_timer, Nov 17 '12 at 16:09

scottt · Answer 1 · 2012-11-17T18:01:22.557

6

Tell clang to generate 32-bit code with clang -m32 on 64-bit Mac OS X, and you basically don't have to worry about the other differences.

Both the .cfi_XXX directives in the gcc output and the lines after .section __TEXT,__eh_frame in the clang output are used to generate the .eh_frame section for stack unwinding. For details, see: http://blog.mozilla.org/respindola/2011/05/12/cfi-directives/

edited Nov 17 '12 at 18:01

answered Nov 17 '12 at 15:25


Sorry for the noob question, but is 64-bit assembly always more verbose than 32-bit? Can it be more concise? `clang -m32` gave me nice, useful output, but the program has to be compiled with `-m32` and would probably have the limits of a 32-bit program too. – user1527166 Nov 17 '12 at 15:47

@user1527166, "Is 64-bit assembly more verbose?" is really the wrong question to ask in your situation. The tutorial you're trying to follow generates 32-bit x86 code that would not work as-is on x86-64. E.g., in step 2 of the "Writing a compiler in Ruby" tutorial, he prints "Hello World" by calling the `puts()` function, and the assembly code used to call a C function differs between x86 and x86-64. Try reading chapters 3 and 4 of this book first: http://download.savannah.gnu.org/releases/pgubook/ Then read https://www.cs.washington.edu/education/courses/401/11au/lectures/M-x86-64-project.pdf – scottt Nov 17 '12 at 15:57

score 2 · Answer 2 · answered Nov 17 '12 at 15:21

Compile your program with gcc -fno-asynchronous-unwind-tables to suppress the various .cfi_XYZ directives, or just ignore them. For the clang case, just don't pay attention to the __eh_frame section. Bear in mind that it's rather uncommon for two different compilers to generate identical code, even from identical source.
