114

I've got a binary installed on my system, and would like to look at the disassembly of a given function. Preferably using objdump, but other solutions would be acceptable as well.

From this question I've learned that I might be able to disassemble part of the code if I only know the boundary addresses. From this answer I've learned how to turn my split debug symbols back into a single file.

But even operating on that single file, and even disassembling all the code (i.e. without start or stop address, but plain -d parameter to objdump), I still don't see that symbol anywhere. Which makes sense insofar as the function in question is static, so it isn't exported. Nevertheless, valgrind will report the function name, so it has to be stored somewhere.

Looking at the details of the debug sections, I find that name mentioned in the .debug_str section, but I don't know a tool which can turn this into an address range.

Ciro Santilli OurBigBook.com
MvG
  • 2
    A minor side note: If a function is marked `static`, it might be inlined by the compiler into its call sites. This may mean there may not actually be any function to disassemble, _per se_. If you can spot symbols for other functions, but not the function you are looking for, this is a strong hint that the function has been inlined. Valgrind may still reference the original pre-inlined function because the ELF file debugging information stores where each individual instruction originated from, even if the instructions are moved elsewhere. – davidg Apr 01 '14 at 03:01
  • @davidg: true, but since the answer by Tom worked in this case, this doesn't seem to be the case. Nevertheless, do you know of a way to e.g. annotate assembly code with that information of where each instruction came from? – MvG Apr 01 '14 at 06:06
  • 1
    Good to hear! `addr2line` will accept PCs/IPs from `stdin` and print out their corresponding source code lines. Similarly, `objdump -l` will mix the objdump with source lines; though for highly optimised code with heavy inlining, the results of either program are not always particularly helpful. – davidg Apr 01 '14 at 08:58

11 Answers

108

I would suggest using gdb as the simplest approach. You can even do it as a one-liner, like:

gdb -batch -ex 'file /bin/ls' -ex 'disassemble main'
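
For repeated use, the one-liner can be wrapped in a small shell function. This is only a sketch: `disas_fn` is a made-up name, not something gdb provides, and the single quotes around the symbol are there so that qualified names such as `ns::func` are parsed as one symbol.

```shell
#!/bin/sh
# disas_fn FILE SYMBOL
# Hypothetical helper (not part of gdb): disassemble one function of FILE.
disas_fn() {
    if [ "$#" -ne 2 ]; then
        echo "usage: disas_fn FILE SYMBOL" >&2
        return 2
    fi
    # The symbol is wrapped in single quotes inside the gdb command so that
    # names containing "::" are treated as a single symbol name.
    gdb -batch -ex "disassemble '$2'" "$1"
}

# Example call (output depends on the binary, so none is shown):
# disas_fn /bin/ls main
```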
Tom Tromey
  • 4
    +1 undocumented feature! `-ex 'command'` isn't in [`man gdb`](http://linux.die.net/man/1/gdb)!? But is in fact listed in [gdb docs](http://sourceware.org/gdb/current/onlinedocs/gdb/gdb-man.html#gdb-man). Also for others, stuff like `/bin/ls` might be stripped, so if that exact command displays nothing, try another object! Can also specify file/object as bareword argument; e.g., `gdb -batch -ex 'disassemble main' /bin/ls` – hoc_age Oct 17 '14 at 15:01
  • 3
    The man page isn't definitive. For a long time it wasn't really maintained, but now I think it's generated from the main docs. Also "gdb --help" is more complete now too. – Tom Tromey Oct 18 '14 at 02:33
  • 12
    `gdb /bin/ls -batch -ex 'disassemble main'` works as well – stefanct Sep 21 '16 at 13:30
  • 2
    If you use `column -ts$'\t'` to filter the GDB output, you'll have the raw bytes and source columns nicely aligned. Also, `-ex 'set disassembly-flavor intel'` before other `-ex`s will result in Intel assembly syntax. – Ruslan Oct 04 '18 at 12:39
  • I called `disassemble fn` using the method above. But it seems that when there are multiple functions with the same name in the binary file, only one is disassembled. Is it possible to disassemble all of them, or should I disassemble them based on raw addresses? – TheAhmad Jan 25 '20 at 14:36
  • Just to add: if you have a symbol in a namespace, you need to put its name in single quotes: `gdb -batch -ex 'file binary' -ex "disassemble 'namespace::function'"`. Otherwise I get the confusing error `No type "function" within class or namespace "namespace".`. If I use double quotes, I get `You can't do that without a process to debug.`, which is likewise confusing. – Simon Aug 26 '20 at 10:33
41

gdb disassemble/rs to show source and raw bytes as well

With this format, it gets really close to objdump -S output:

gdb -batch -ex "disassemble/rs $FUNCTION" "$EXECUTABLE"

main.c

#include <assert.h>

int myfunc(int i) {
    i = i + 2;
    i = i * 2;
    return i;
}

int main(void) {
    assert(myfunc(1) == 6);
    assert(myfunc(2) == 8);
    return 0;
}

Compile and disassemble

gcc -O0 -ggdb3 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
gdb -batch -ex "disassemble/rs myfunc" main.out

Disassembly:

Dump of assembler code for function myfunc:
main.c:
3       int myfunc(int i) {
   0x0000000000001135 <+0>:     55      push   %rbp
   0x0000000000001136 <+1>:     48 89 e5        mov    %rsp,%rbp
   0x0000000000001139 <+4>:     89 7d fc        mov    %edi,-0x4(%rbp)

4           i = i + 2;
   0x000000000000113c <+7>:     83 45 fc 02     addl   $0x2,-0x4(%rbp)

5           i = i * 2;
   0x0000000000001140 <+11>:    d1 65 fc        shll   -0x4(%rbp)

6           return i;
   0x0000000000001143 <+14>:    8b 45 fc        mov    -0x4(%rbp),%eax

7       }
   0x0000000000001146 <+17>:    5d      pop    %rbp
   0x0000000000001147 <+18>:    c3      retq   
End of assembler dump.

Tested on Ubuntu 16.04, GDB 7.11.1.

objdump + awk workarounds

Print the paragraph as mentioned at: https://unix.stackexchange.com/questions/82944/how-to-grep-for-text-in-a-file-and-display-the-paragraph-that-has-the-text

objdump -d main.out | awk -v RS= '/^[[:xdigit:]]+ <FUNCTION>/'

e.g.:

objdump -d main.out | awk -v RS= '/^[[:xdigit:]]+ <myfunc>/'

gives just:

0000000000001135 <myfunc>:
    1135:   55                      push   %rbp
    1136:   48 89 e5                mov    %rsp,%rbp
    1139:   89 7d fc                mov    %edi,-0x4(%rbp)
    113c:   83 45 fc 02             addl   $0x2,-0x4(%rbp)
    1140:   d1 65 fc                shll   -0x4(%rbp)
    1143:   8b 45 fc                mov    -0x4(%rbp),%eax
    1146:   5d                      pop    %rbp
    1147:   c3                      retq   

When using -S, I don't think there is a fail-proof way, as the code comments could contain any possible sequence... But the following works almost all the time:

objdump -S main.out | awk '/^[[:xdigit:]]+ <FUNCTION>:$/{flag=1;next}/^[[:xdigit:]]+ <.*>:$/{flag=0}flag'

adapted from: How to select lines between two marker patterns which may occur multiple times with awk/sed
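
To see how the flag-based filter behaves without a binary at hand, here is a minimal sketch run on a canned listing. The sample text only imitates the layout of `objdump -S` output (it is an assumption, not real tool output), and `[0-9a-f]` stands in for `[[:xdigit:]]` so that it also runs on awk builds without POSIX character classes.

```shell
#!/bin/sh
# Canned stand-in for `objdump -S` output: source lines are interleaved and
# a blank line appears inside myfunc, which would end an awk "paragraph".
listing_S() {
    cat <<'EOF'
0000000000001135 <myfunc>:
int myfunc(int i) {
    1135: 55                    push   %rbp

    return i;
    1143: 8b 45 fc              mov    -0x4(%rbp),%eax
    1147: c3                    retq

0000000000001148 <main>:
int main(void) {
    1148: 55                    push   %rbp
EOF
}

# flag is set at the wanted header and cleared at the next header;
# the bare pattern `flag` prints every line seen while it is set.
listing_S | awk '
    /^[0-9a-f]+ <myfunc>:$/ { flag = 1; next }   # start here, header itself is skipped
    /^[0-9a-f]+ <.*>:$/     { flag = 0 }         # any other function header: stop
    flag'
```

The blank line inside `myfunc` is printed like any other line, which is why this variant copes with `-S` output where the paragraph-mode filter would stop early.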

Mailing list replies

There is a 2010 thread on the mailing list which says it is not possible: https://sourceware.org/ml/binutils/2010-04/msg00445.html

Besides the gdb workaround proposed by Tom, they also comment on another (worse) workaround: compiling with -ffunction-sections, which puts one function per section, and then dumping the section.

Nicolas Clifton gave it a WONTFIX https://sourceware.org/ml/binutils/2015-07/msg00004.html , likely because the GDB workaround covers that use case.

Ciro Santilli OurBigBook.com
  • The gdb approach works fine on shared libraries and object files. – Tom Tromey Aug 04 '15 at 13:14
  • I think nowadays it may be `/rm` instead of `/rs` – cbr Jul 10 '23 at 18:19
  • @cbr `help disas` on a GDB shell explains differences. `/m` help ends in: "This modifier hasn't proved useful in practice and is deprecated in favor of /s." as of GDB 13.1, Ubuntu 23.04. – Ciro Santilli OurBigBook.com Jul 11 '23 at 05:40
  • @CiroSantilliOurBigBook.com Thanks for the correction. I assumed I was using a new-ish GDB but turns out the container I was running inside appeared to have an old GDB that didn't have /s. – cbr Jul 12 '23 at 18:33
41

If you have a very recent binutils (2.32+), this is very simple.

Passing --disassemble=SYMBOL to objdump will disassemble only the specified function. No need to pass the start address and the end address.

LLVM objdump also has a similar option (--disassemble-symbols).
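
Because the option only exists from binutils 2.32 on, a script that has to run on mixed systems may want to check the version first. The sketch below is an assumption-laden helper: `has_symbol_disassemble` is a hypothetical name, and it assumes the version number is the last whitespace-separated field of the first line of `objdump --version`, which may not hold for every distribution's build.

```shell
#!/bin/sh
# has_symbol_disassemble VERSION_LINE
# Hypothetical helper: succeeds if VERSION_LINE (first line of
# `objdump --version`) reports GNU binutils 2.32 or later.
has_symbol_disassemble() {
    ver=${1##* }                  # last field: "2.38", "2.30-93.el8", ...
    case $ver in
        *.*) ;;
        *)   return 1 ;;          # no "major.minor" to parse
    esac
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%[!0-9]*}        # leading digits after "major."
    case $major in ''|*[!0-9]*) return 1 ;; esac
    case $minor in ''|*[!0-9]*) return 1 ;; esac
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 32 ]; }
}

# Example use (left as a comment because it needs objdump and a binary):
# if has_symbol_disassemble "$(objdump --version | head -n 1)"; then
#     objdump --disassemble=main /bin/ls
# fi
```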

Léo Lam
  • Thank you. Changelog for binutils 2.32, 02 Feb 2019: https://lists.gnu.org/archive/html/info-gnu/2019-02/msg00000.html "*Objdump's --disassemble option can now take a parameter, specifying the starting symbol for disassembly. Disassembly will continue from this symbol up to the next symbol or the end of the function.*" – osgx Sep 30 '20 at 07:39
  • 1
    Works on ARM gcc toolchain 9-2020-q2-update – personal_cloud Dec 31 '22 at 19:57
17

Disassemble One Single Function using Objdump

I have two solutions:

1. Commandline Based

This method works well and is also simple. I use objdump with the -d flag and pipe it through awk. The disassembled output looks like this:

000000000000068a <main>:
68a:    55                      push   %rbp
68b:    48 89 e5                mov    %rsp,%rbp
68e:    48 83 ec 20             sub    $0x20,%rsp

First, a description of the objdump output: sections and functions are separated by an empty line. Therefore, setting FS (field separator) to a newline and RS (record separator) to two newlines lets you easily search for the function you want, since its name is simply found in the $1 field.

objdump -d name_of_your_obj_file | awk -F"\n" -v RS="\n\n" '$1 ~ /main/'

Of course you can replace main with any other function you would like to print.
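
One thing to watch for is that `/main/` matches any header line that merely contains the text. A small sketch on canned sample text (an assumption standing in for real `objdump -d` output) shows the difference an anchored pattern makes. It uses awk's paragraph mode (empty RS), since a multi-character RS such as `"\n\n"` is not specified by POSIX.

```shell
#!/bin/sh
# Canned `objdump -d`-style listing with two look-alike symbols.
listing() {
    cat <<'EOF'
0000000000000500 <__libc_start_main@plt>:
 500: ff 25 12 0b 20 00     jmpq   *0x200b12(%rip)

000000000000068a <main>:
 68a: 55                    push   %rbp
EOF
}

# Loose match: every header containing "main" is selected.
listing | awk -v RS= -F'\n' '$1 ~ /main/' | grep -c '>:$'      # prints 2

# Anchored match: only the header ending in "<main>:" is selected.
listing | awk -v RS= -F'\n' '$1 ~ /<main>:$/' | grep -c '>:$'  # prints 1
```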

2. Bash Script

I have written a small bash script for this. Copy and paste it, and save it as a file named, e.g., dasm.

#!/bin/bash
# Author: abu
# filename: dasm
# Description: puts disassembled objectfile to std-out

if [ $# -eq 2 ]; then
        # Match the header line "ADDR <function>:" of the requested function
        sstrg="^[[:xdigit:]]{2,}.*<$2>:$"
        objdump -d "$1" | awk -F"\n" -v RS="\n\n" -v pat="$sstrg" '$1 ~ pat'
elif [ $# -eq 1 ]; then
        # No function given: list all labels with their addresses
        objdump -d "$1" | awk -F"\n" -v RS="\n\n" '{ print $1 }'
else
    echo "You have to add argument(s)"
    echo "Usage:   $0 arg1 arg2"
    echo "Description: print disassembled label to std-out"
    echo "             arg1: name of object file"
    echo "             arg2: name of function to be disassembled"
    echo "         $0 arg1    ... print labels and their rel. addresses"
fi

Make it executable and invoke it with, e.g.:

chmod +x dasm
./dasm test main

This is much faster than invoking gdb with a script. Besides, objdump does not load the libraries into memory, which makes it safer!


Vitaly Fadeev programmed an auto-completion for this script, which is a really nice feature and speeds up typing.

The script can be found here.

abu_bua
  • It seems it depends if `objdump` or `gdb` is faster. For a huge binary (Firefox' libxul.so) `objdump` takes forever, I cancelled it after an hour, while `gdb` takes less than a minute. – Simon Aug 26 '20 at 10:38
5

To simplify parsing objdump's output compared with the awk approaches in other answers:

objdump -d filename | sed '/<functionName>:/,/^$/!d'
Nathan Tuggy
fcr
4

This works just like the gdb solution (in that it shifts the offsets towards zero) except that it's not laggy (it gets the job done in about 5ms on my PC, whereas the gdb solution takes about 150ms):

objdump_func:

#!/bin/sh
# $1 -- function name; rest -- object files
# Note: strtonum() is a gawk extension, so awk must be gawk here.
fn=$1; shift 1
objdump -d "$@" |
awk " /^[[:xdigit:]].*<$fn>/,/^\$/ { print \$0 }" |
awk 'NR==1       {  offset=strtonum("0x"$1); print $0; }
     NR!=1 && NF {  rhs=substr($0, index($0,":")+1); n=strtonum("0x"$1); printf "%4s:%s\n", sprintf("%x", n-offset), rhs }'
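
The script above relies on `strtonum`, which is a gawk extension. As a rough sketch of the same rebasing idea in plain POSIX sh, the function below does the hex arithmetic with shell `$(( ))` instead. `rebase` is a hypothetical name, and it assumes its input is the plain `objdump -d` lines of a single function, starting with the `ADDR <name>:` header.

```shell
#!/bin/sh
# rebase: read one function's `objdump -d` lines on stdin and print them
# with addresses made relative to the function start (hypothetical helper).
rebase() {
    IFS= read -r header || return 1
    printf '%s\n' "$header"
    base=$((0x${header%% *}))            # header address as a number
    while IFS= read -r line; do
        case $line in
            *:*) ;;                                   # instruction line
            *)   printf '%s\n' "$line"; continue ;;   # blank line: pass through
        esac
        addr=${line%%:*}                 # text before the first colon
        addr=${addr##* }                 # drop leading indentation
        printf '%4x:%s\n' "$((0x$addr - base))" "${line#*:}"
    done
}

# Sample input imitating objdump's layout (not real tool output).
rebase <<'EOF'
000000000000002d <foo>:
  2d: 55                    push   %rbp
  2e: 48 89 e5              mov    %rsp,%rbp
  31: c3                    retq
EOF
```

With this sample the header is printed unchanged and the three instruction addresses come out as 0, 1 and 4.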
Petr Skocik
  • I can't test just now, but I'm looking forward to when I get round to this. Can you elaborate a bit on the “shifts offset towards zero” aspect? I didn't see this explicit in the gdb answers here, and I'd like to hear a bit more about what's actually going on there and why. – MvG Aug 07 '16 at 19:59
  • It basically makes it look as if the function you target (which is what the first `awk` does) was the only function in the object file, that is, even if the function starts at, say `0x2d`, the second awk will shift it towards `0x00` (by subtracting `0x2d` from the address of each instruction), which is useful because the assembly code often makes references relative to the start of the function and if the function starts at 0, you don't have to do the subtractions in your head. The awk code could be better but at least it does the job and is fairly efficient. – Petr Skocik Aug 07 '16 at 20:25
  • In retrospect it seems compiling with `-ffunction-sections` is an easier way to make sure each function starts at 0. – Petr Skocik Jan 21 '20 at 09:40
3

Bash completion for ./dasm

Symbol-name completion for this solution (D lang version):

  • By typing dasm test and then pressing Tab Tab, you will get a list of all functions.
  • By typing dasm test m and then pressing Tab Tab, all functions starting with m will be shown or, if only one such function exists, it will be autocompleted.

File /etc/bash_completion.d/dasm:

# bash completion for dasm
_dasm()
{
    local cur=${COMP_WORDS[COMP_CWORD]}

    if [[ $COMP_CWORD -eq 1 ]] ; then
        # files
        COMPREPLY=( $( command ls *.o -F 2>/dev/null | grep "^$cur" ) )

    elif [[ $COMP_CWORD -eq 2 ]] ; then
        # functions
        local OBJFILE=${COMP_WORDS[COMP_CWORD-1]}

        COMPREPLY=( $( command nm --demangle=dlang "$OBJFILE" | grep " W " | cut -d " " -f 3 | tr "()" "  " | grep "$cur" ) )

    else
        COMPREPLY=($(compgen -W "" -- "$cur"));
    fi
}

complete -F _dasm dasm
abu_bua
Vitaly Fadeev
1

Not exactly what you asked, but if you are compiling a C or C++ program from source with GCC, you can add a function attribute to put it in a custom named section of the binary:

extern __attribute__((noinline, section("disasm"))) void foo() {}

Then you can ask objdump to show only functions in that named section with -jdisasm.

Boann
0

Maybe this is an easy way to do it:
objdump -d libxxx.so | grep -A 50 func_name_to_be_searched

galian
0

Just use objdump -d filename | awk '/<funcname>/,/^$/'

kingkong
  • 2
    There are eight existing answers to this question, including an accepted answer with 94 upvotes. Are you sure your answer hasn't already been provided? If not, why might someone prefer your approach over the existing approaches proposed? Are you taking advantage of new capabilities? Are there scenarios where your approach is better suited? – Jeremy Caney Nov 23 '21 at 00:36
  • Well, thanks for your comments. I hadn't seen the other answers and just left mine. At the same time, I don't intend to ask for upvotes. – kingkong Nov 23 '21 at 05:05
  • But again, what advantage does your solution offer over the others? Can you [edit] those details into your answer? This looks a lot like the `sed` solution. Why use `awk` over `sed`? – General Grievance Nov 23 '21 at 14:46
0

With GNU objdump, it can be done with objdump -C --disassemble="funcName" -j.text procName

zintown
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Oct 28 '22 at 07:20