315

I was wondering how to use GCC on my C source file to dump a mnemonic version of the machine code so I could see what my code was being compiled into. You can do this with Java but I haven't been able to find a way with GCC.

I am trying to re-write a C method in assembly and seeing how GCC does it would be a big help.

Ryan Tenney
James
  • 31
    note that 'bytecode' typically means the code consumed by a VM, like JVM or .NET's CLR. The output of GCC is better called 'machine code', 'machine language', or 'assembly language' – Javier Aug 17 '09 at 19:27
  • 2
    I added an answer using godbolt since it is a very powerful tool for rapidly experimenting with how different options affect your code generation. – Shafik Yaghmour Sep 12 '14 at 02:35
  • http://stackoverflow.com/a/19083877/995714 – phuclv Nov 30 '14 at 06:33
  • Possible duplicate of [How do you get assembler output from C/C++ source in gcc?](http://stackoverflow.com/questions/137038/how-do-you-get-assembler-output-from-c-c-source-in-gcc) – Ciro Santilli OurBigBook.com Oct 15 '15 at 20:21
  • 1
    For more tips on making the asm output human readable, see also: [How to remove “noise” from GCC/clang assembly output?](http://stackoverflow.com/a/38552509/224132) – Peter Cordes Sep 05 '16 at 20:46
  • 1
    Answered here: https://stackoverflow.com/questions/137038/how-do-you-get-assembler-output-from-c-c-source-in-gcc Use the -S option to gcc (or g++). – knowledge_is_power Jul 26 '17 at 19:38

11 Answers

393

If you compile with debug symbols (add -g to your GCC command line, even if you're also using -O3; see footnote 1), you can use objdump -S to produce a more readable disassembly interleaved with C source.

>objdump --help
[...]
-S, --source             Intermix source code with disassembly
-l, --line-numbers       Include line numbers and filenames in output

objdump -drwC -Mintel is nice:

  • -r shows symbol names on relocations (so you'd see puts in the call instruction below)
  • -R shows dynamic-linking relocations / symbol names (useful on shared libraries)
  • -C demangles C++ symbol names
  • -w is "wide" mode: it doesn't line-wrap the machine-code bytes
  • -Mintel: use GAS/binutils MASM-like .intel_syntax noprefix syntax instead of AT&T
  • -S: interleave source lines with disassembly.

You could put something like alias disas="objdump -drwCS -Mintel" in your ~/.bashrc. If not on x86, or if you like AT&T syntax, omit -Mintel.


Example:

> gcc -g -c test.c
> objdump -d -M intel -S test.o

test.o:     file format elf32-i386


Disassembly of section .text:

00000000 <main>:
#include <stdio.h>

int main(void)
{
   0:   55                      push   ebp
   1:   89 e5                   mov    ebp,esp
   3:   83 e4 f0                and    esp,0xfffffff0
   6:   83 ec 10                sub    esp,0x10
    puts("test");
   9:   c7 04 24 00 00 00 00    mov    DWORD PTR [esp],0x0
  10:   e8 fc ff ff ff          call   11 <main+0x11>

    return 0;
  15:   b8 00 00 00 00          mov    eax,0x0
}
  1a:   c9                      leave  
  1b:   c3                      ret

Note that this isn't using -r, so the call rel32=-4 isn't annotated with the puts symbol name, and it looks like a broken call that jumps into the middle of the call instruction in main. Remember that the rel32 displacement in the call encoding is just a placeholder until the linker fills in a real offset (to a PLT stub in this case, unless you statically link libc).
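
To see what -r changes, a minimal sketch follows. It assumes gcc and GNU objdump are on your PATH; the file name hello.c and the work directory are made up for illustration. On x86 you can add -Mintel to both objdump calls if you prefer Intel syntax.

```shell
#!/bin/sh
# Sketch: disassemble an unlinked object file with and without -r.
# Assumes gcc and GNU objdump; hello.c and the directory are illustrative.
workdir="${TMPDIR:-/tmp}/objdump_r_demo"
mkdir -p "$workdir"

cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void)
{
    puts("test");
    return 0;
}
EOF

# -g records line info so that objdump -S can interleave the C source.
gcc -g -c "$workdir/hello.c" -o "$workdir/hello.o"

# Without -r: the call displacement is an unannotated placeholder.
objdump -d -S "$workdir/hello.o" > "$workdir/plain.txt"

# With -r: each relocation is printed below the instruction it patches.
objdump -drwC -S "$workdir/hello.o" > "$workdir/with_relocs.txt"

# Print the relocation entry that names puts.
grep 'R_.*puts' "$workdir/with_relocs.txt"
```

Comparing plain.txt with with_relocs.txt shows the same instructions, with the relocation lines being the only addition.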


Footnote 1: Interleaving source can be messy and not very helpful in optimized builds; for that, consider https://godbolt.org/ or other ways of visualizing which instructions go with which source lines. In optimized code there's not always a single source line that accounts for an instruction but the debug info will pick one source line for each asm instruction.

Peter Cordes
Bastien Léonard
128

If you give GCC the flag -fverbose-asm, it will

Put extra commentary information in the generated assembly code to make it more readable.

[...] The added comments include:

  • information on the compiler version and command-line options,
  • the source code lines associated with the assembly instructions, in the form FILENAME:LINENUMBER:CONTENT OF LINE,
  • hints on which high-level expressions correspond to the various assembly instruction operands.
Cristian Ciupitu
Kasper
  • But then I would lose all the switches used for `objdump` (`objdump -drwCS -Mintel`), so how can I use something like `verbose` with `objdump`, so that I have comments in the asm code, as `-fverbose-asm` does in gcc? – Herdsman Jan 10 '20 at 17:08
  • 5
    @Herdsman: you can't. The extra stuff `-fverbose-asm` adds is in the form of comments in the asm syntax of the output, not directives that will put anything extra in the `.o` file. It's all discarded at assemble time. Look at compiler asm output *instead* of disassembly, e.g. on https://godbolt.org/ where you can easily match it up with the source line via mouseover and color highlighting of corresponding source / asm lines. [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) – Peter Cordes May 09 '20 at 19:16
82

Use the -S (note: capital S) switch to GCC, and it will emit the assembly code to a file with a .s extension. For example, the following command:

gcc -O2 -S foo.c

will leave the generated assembly code in the file foo.s.

Ripped straight from http://www.delorie.com/djgpp/v2faq/faq8_20.html (but removing erroneous -c)

Community
Andrew Keeton
  • 37
    You shouldn't mix -c and -S, only use one of them. In this case, one is overriding the other, probably depending on the order in which they're used. – Adam Rosenfield Aug 17 '09 at 19:28
  • 4
    @AdamRosenfield Any reference for 'shouldn't mix -c and -S'? If it is true, we should probably remind the author and edit it. – Tony Aug 05 '14 at 11:55
  • 6
    @Tony: https://gcc.gnu.org/onlinedocs/gcc/Overall-Options.html#Overall-Options "You can use ... ***one*** of the options -c, -S, or -E to say where gcc is to stop." – Nate Eldredge Apr 10 '16 at 00:32
  • 2
    If you want all the intermediate outputs, use `gcc -march=native -O3 -save-temps`. You can still use `-c` to stop at object-file creation without trying to link, or whatever. – Peter Cordes Jun 02 '18 at 01:21
  • 2
    `-save-temps` is interesting as it dumps in one go the exact generated code, whereas the other option of calling the compiler with `-S` means compiling twice, and possibly with different options. **But** `-save-temps` dumps everything in the current directory, which is kind of messy. Looks like it is more intended as a debug option for GCC rather than a tool to inspect your code. – Stéphane Gourichon Jan 22 '20 at 18:16
  • 2
    @StéphaneGourichon: That's correct; more for debugging / creating compiler bug reports than for this use-case. I never use `-save-temps` for looking at how some source compiled to asm, either `-masm=intel -S -o- | less`, disassemble the `.o` or executable, or put it on https://godbolt.org/. – Peter Cordes May 09 '20 at 19:20
  • Use `-save-temps=obj` to have gcc write the generated `.i` and `.s` files to the same directory to where it writes the `.o` object file. – ndim May 09 '22 at 03:38
54

Using the -S switch to GCC on x86-based systems produces a dump in AT&T syntax by default; this can also be requested explicitly with the -masm=att switch, like so:

gcc -S -masm=att code.c

Whereas if you'd like to produce a dump in Intel syntax, you could use the -masm=intel switch, like so:

gcc -S -masm=intel code.c

(Both write the assembly for code.c, in the respective syntax, to the file code.s.)
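
To compare the two syntaxes directly, something like the following sketch can help. It assumes a gcc that targets x86, since -masm= is an x86 option; the load function, the output file names code_att.s and code_intel.s, and the work directory are invented for this example.

```shell
#!/bin/sh
# Sketch: emit the same function in AT&T and Intel syntax and diff them.
# Assumes gcc targeting x86 (-masm= is an x86 option); the function and
# the output file names are illustrative.
workdir="${TMPDIR:-/tmp}/masm_demo"
mkdir -p "$workdir"

cat > "$workdir/code.c" <<'EOF'
int load(const int *p)
{
    return p[1];
}
EOF

# Separate -o names keep the second run from overwriting the first.
gcc -O1 -S -masm=att   "$workdir/code.c" -o "$workdir/code_att.s"
gcc -O1 -S -masm=intel "$workdir/code.c" -o "$workdir/code_intel.s"

# diff exits non-zero when the files differ, which is expected here.
diff "$workdir/code_att.s" "$workdir/code_intel.s" || true
```

The Intel-syntax file starts with an .intel_syntax noprefix directive, which is what lets the assembler accept it later.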

In order to produce similar effects with objdump, use the --disassembler-options=intel or --disassembler-options=att switch on the compiled binary (a.out here), not on the C source. An example, with code dumps to illustrate the differences in syntax:

 $ objdump -d --disassembler-options=att a.out
 080483c4 <main>:
 80483c4:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80483c8:   83 e4 f0                and    $0xfffffff0,%esp
 80483cb:   ff 71 fc                pushl  -0x4(%ecx)
 80483ce:   55                      push   %ebp
 80483cf:   89 e5                   mov    %esp,%ebp
 80483d1:   51                      push   %ecx
 80483d2:   83 ec 04                sub    $0x4,%esp
 80483d5:   c7 04 24 b0 84 04 08    movl   $0x80484b0,(%esp)
 80483dc:   e8 13 ff ff ff          call   80482f4 <puts@plt>
 80483e1:   b8 00 00 00 00          mov    $0x0,%eax
 80483e6:   83 c4 04                add    $0x4,%esp 
 80483e9:   59                      pop    %ecx
 80483ea:   5d                      pop    %ebp
 80483eb:   8d 61 fc                lea    -0x4(%ecx),%esp
 80483ee:   c3                      ret
 80483ef:   90                      nop

and

 $ objdump -d --disassembler-options=intel a.out
 080483c4 <main>:
 80483c4:   8d 4c 24 04             lea    ecx,[esp+0x4]
 80483c8:   83 e4 f0                and    esp,0xfffffff0
 80483cb:   ff 71 fc                push   DWORD PTR [ecx-0x4]
 80483ce:   55                      push   ebp
 80483cf:   89 e5                   mov    ebp,esp
 80483d1:   51                      push   ecx
 80483d2:   83 ec 04                sub    esp,0x4
 80483d5:   c7 04 24 b0 84 04 08    mov    DWORD PTR [esp],0x80484b0
 80483dc:   e8 13 ff ff ff          call   80482f4 <puts@plt>
 80483e1:   b8 00 00 00 00          mov    eax,0x0
 80483e6:   83 c4 04                add    esp,0x4
 80483e9:   59                      pop    ecx
 80483ea:   5d                      pop    ebp
 80483eb:   8d 61 fc                lea    esp,[ecx-0x4]
 80483ee:   c3                      ret    
 80483ef:   90                      nop
Toby Speight
amaterasu
  • What the... `gcc -S -masm=intel test.c` didn't exactly work for me, I got some crossbreed of Intel and AT&T syntax like this: `mov %rax, QWORD PTR -24[%rbp]`, instead of this: `movq -24(%rbp), %rax`. – L̲̳o̲̳̳n̲̳̳g̲̳̳p̲̳o̲̳̳k̲̳̳e̲̳̳ Nov 22 '09 at 04:03
  • 1
    Nice tip. It should be noted this also works when performing parallel output of `.o` and ASM files, i.e. via `-Wa,-ahls -o yourfile.o yourfile.cpp>yourfile.asm` – underscore_d Dec 20 '15 at 21:49
  • Could use `-M` option, it's the same as `--disassembler-options` but much shorter, e.g `objdump -d -M intel a.out | less -N` – Eric Jul 05 '16 at 04:57
35

godbolt is a very useful tool. Its compiler list only has C++ compilers, but you can use the -x c flag to get it to treat the code as C. It will then generate an assembly listing for your code side by side, and you can use the Colourise option to generate colored bars that visually indicate which source code maps to which generated assembly. For example, the following code:

#include <stdio.h>

void func()
{
  printf( "hello world\n" ) ;
}

using the following command line:

-x c -std=c99 -O3

and Colourise would generate the following:

[Screenshot: godbolt showing the source and the generated assembly side by side, with matching color bars]

Arnie97
Shafik Yaghmour
  • It would be nice to know how godbolt filters work: .LC0, .text, //, and Intel. Intel is easy `-masm=intel` but what about the rest? – Z boson Feb 22 '17 at 08:01
  • I guess it is explained here http://stackoverflow.com/a/38552509/2542702 – Z boson Feb 22 '17 at 08:02
  • godbolt does support C (along with a ton of other languages like Rust, D, Pascal...). It's just that there are far fewer C compilers, so it's still better to use C++ compilers with `-x c` – phuclv Apr 27 '19 at 09:34
  • Why are the strings different between the source and the assembly? The newline has been stripped at the end – Lorraine Mar 19 '21 at 10:38
26

Did you try gcc -S -fverbose-asm -O source.c and then look into the generated source.s assembler file?

The generated assembler code goes into source.s (you could override that with -o assembler-filename ); the -fverbose-asm option asks the compiler to emit some assembler comments "explaining" the generated assembler code. The -O option asks the compiler to optimize a bit (it could optimize more with -O2 or -O3).

If you want to understand what gcc is doing try passing -fdump-tree-all but be cautious: you'll get hundreds of dump files.

BTW, GCC is extensible thru plugins or with MELT (a high level domain specific language to extend GCC; which I abandoned in 2017)

Basile Starynkevitch
  • maybe mention that the output will be in `source.s`, since a lot of people would expect a printout on the console. – RubenLaguna Jul 02 '15 at 08:41
  • 1
    @ecerulm: `-S -o-` dumps to stdout. `-masm=intel` is helpful if you want to use NASM/YASM syntax. (but it uses `qword ptr [mem]`, rather than just `qword`, so it's more like Intel/MASM than NASM/YASM). http://gcc.godbolt.org/ does a nice job of tidying up the dump: optionally stripping comment-only lines, unused labels, and assembler directives. – Peter Cordes Jan 30 '16 at 23:06
  • 3
    Forgot to mention: If you're looking for "similar to the source but without the noise of store/reload after every source line", then `-Og` is even better than `-O1`. It means "optimize for debugging" and makes asm without too many tricky / hard-to-follow optimizations that does everything the source says. It's been available since gcc4.8, but clang 3.7 still doesn't have it. IDK if they decided against it or what. – Peter Cordes Jan 31 '16 at 13:41
  • Update: clang supports `-Og` now. Also, see [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) for more about writing test functions that compile to interesting asm to look at, and GCC/clang options for looking at their asm output directly, not disassembly. IDK why the answers to this question are mostly about disassembly, not GCC asm output. – Peter Cordes Jan 22 '23 at 14:10
17

You can also use gdb for this, much like objdump.

This excerpt is taken from http://sources.redhat.com/gdb/current/onlinedocs/gdb_9.html#SEC64


Here is an example showing mixed source+assembly for Intel x86:

  (gdb) disas /m main
Dump of assembler code for function main:
5       {
0x08048330 <main+0>:    push   %ebp
0x08048331 <main+1>:    mov    %esp,%ebp
0x08048333 <main+3>:    sub    $0x8,%esp
0x08048336 <main+6>:    and    $0xfffffff0,%esp
0x08048339 <main+9>:    sub    $0x10,%esp

6         printf ("Hello.\n");
0x0804833c <main+12>:   movl   $0x8048440,(%esp)
0x08048343 <main+19>:   call   0x8048284

7         return 0;
8       }
0x08048348 <main+24>:   mov    $0x0,%eax
0x0804834d <main+29>:   leave
0x0804834e <main+30>:   ret

End of assembler dump.
agf
Vishal Sagar
  • 1
    archived link: https://web.archive.org/web/20090412112833/http://sourceware.org:80/gdb/current/onlinedocs/gdb_9.html – vlad4378 May 10 '17 at 17:53
  • And to switch GDB's disassembler to Intel syntax, use `set disassembly-flavor intel` command. – Ruslan May 30 '18 at 16:26
13

Use the -S (note: capital S) switch to GCC, and it will emit the assembly code to a file with a .s extension. For example, the following command:

gcc -O2 -S foo.c

codymanix
5

I haven't tried this with gcc, but with g++ the command below works for me.

  • -g is for a debug build
  • -Wa,-adhln is passed to the assembler to produce a listing with source code
g++ -g -Wa,-adhln src.cpp
Alexey Vazhnov
DAG
  • 1
    It works for gcc too! -Wa,... passes command-line options to the assembler stage (which gcc/g++ run after C/C++ compilation). It invokes as internally (as.exe on Windows). Run as --help on the command line to see more help. – Hartmut Schorrig Apr 17 '20 at 15:28
2

For RISC-V disassembly, these flags are nice:

riscv64-unknown-elf-objdump -d -S -l --visualize-jumps --disassembler-color=color --inlines

-d: disassemble, most basic flag

-S: intermix source. Note: must use -g flag while compiling

-l: line numbers

--visualize-jumps: fancy arrows, not too useful but why not. Sometimes it gets too messy and actually makes reading the source harder. Taken from Peter Cordes's comment: --visualize-jumps=color is also an option, to use different colors for different arrows

--disassembler-color=color: give the disassembly some color

--inlines: print out inlines

Maybe useful:

-M numeric: use numeric register names instead of ABI names, useful if you are doing CPU development and don't know the ABI names by heart

-M no-aliases: don't use pseudoinstructions like li and call

Example: main.c:

#include <stdio.h>
#include <stdint.h>

static inline void example_inline(const char* str) {
    for (int i = 0; str[i] != 0; i++)
        putchar(str[i]);
}

int main() {
    printf("Hello world");
    example_inline("Hello! I am inlined");

    return 0;
}

I recommend using -O0 if you want to intermix sources. Intermixed source becomes very messy with -O2.

Command:

riscv64-unknown-elf-gcc main.c -c -O0 -g
riscv64-unknown-elf-objdump -d -S -l --disassembler-color=color --inlines main.o

Disassembly:

main.o:     file format elf64-littleriscv


Disassembly of section .text:

0000000000000000 <example_inline>:
example_inline():
/Users/cyao/test/main.c:4
#include <stdio.h>
#include <stdint.h>

static inline void example_inline(const char* str) {
   0:   7179                    addi    sp,sp,-48
   2:   f406                    sd  ra,40(sp)
   4:   f022                    sd  s0,32(sp)
   6:   1800                    addi    s0,sp,48
   8:   fca43c23                sd  a0,-40(s0)

000000000000000c <.LBB2>:
/Users/cyao/test/main.c:5
    for (int i = 0; str[i] != 0; i++)
   c:   fe042623                sw  zero,-20(s0)
  10:   a01d                    j   36 <.L2>

0000000000000012 <.L3>:
/Users/cyao/test/main.c:6 (discriminator 3)
        putchar(str[i]);
  12:   fec42783                lw  a5,-20(s0)
  16:   fd843703                ld  a4,-40(s0)
  1a:   97ba                    add a5,a5,a4
  1c:   0007c783                lbu a5,0(a5)
  20:   2781                    sext.w  a5,a5
  22:   853e                    mv  a0,a5
  24:   00000097                auipc   ra,0x0
  28:   000080e7                jalr    ra # 24 <.L3+0x12>
/Users/cyao/test/main.c:5 (discriminator 3)
    for (int i = 0; str[i] != 0; i++)
  2c:   fec42783                lw  a5,-20(s0)
  30:   2785                    addiw   a5,a5,1
  32:   fef42623                sw  a5,-20(s0)

0000000000000036 <.L2>:
/Users/cyao/test/main.c:5 (discriminator 1)
  36:   fec42783                lw  a5,-20(s0)
  3a:   fd843703                ld  a4,-40(s0)
  3e:   97ba                    add a5,a5,a4
  40:   0007c783                lbu a5,0(a5)
  44:   f7f9                    bnez    a5,12 <.L3>

0000000000000046 <.LBE2>:
/Users/cyao/test/main.c:7
}
  46:   0001                    nop
  48:   0001                    nop
  4a:   70a2                    ld  ra,40(sp)
  4c:   7402                    ld  s0,32(sp)
  4e:   6145                    addi    sp,sp,48
  50:   8082                    ret

0000000000000052 <main>:
main():
/Users/cyao/test/main.c:9

int main() {
  52:   1141                    addi    sp,sp,-16
  54:   e406                    sd  ra,8(sp)
  56:   e022                    sd  s0,0(sp)
  58:   0800                    addi    s0,sp,16
/Users/cyao/test/main.c:10
    printf("Hello world");
  5a:   000007b7                lui a5,0x0
  5e:   00078513                mv  a0,a5
  62:   00000097                auipc   ra,0x0
  66:   000080e7                jalr    ra # 62 <main+0x10>
/Users/cyao/test/main.c:11
    example_inline("Hello! I am inlined");
  6a:   000007b7                lui a5,0x0
  6e:   00078513                mv  a0,a5
  72:   00000097                auipc   ra,0x0
  76:   000080e7                jalr    ra # 72 <main+0x20>
/Users/cyao/test/main.c:13

    return 0;
  7a:   4781                    li  a5,0
/Users/cyao/test/main.c:14
}
  7c:   853e                    mv  a0,a5
  7e:   60a2                    ld  ra,8(sp)
  80:   6402                    ld  s0,0(sp)
  82:   0141                    addi    sp,sp,16
  84:   8082                    ret

PS. There are colors in the disassembled output.

Cyao
  • 1
    `--visualize-jumps=color` is also an option, to use different colors for different arrows. This can be useful when branching gets denser, not like your trivial example. e.g. my answer on [Better way than a terminal+objdump to read assembly?](https://stackoverflow.com/q/74793599) shows example output from some code with FP compare+branch (it's actually a bit messy to read without color, given the short branches on unordered next to another branch.) – Peter Cordes Jan 22 '23 at 14:05
1

Use -Wa,-adhln as an option to gcc or g++ to produce a listing on stdout.

-Wa,... passes command-line options to the assembler stage (which gcc/g++ run after C/C++ compilation). It invokes as internally (as.exe on Windows). Run

>as --help

on the command line to see more help for the assembler tool inside gcc.