16

Where/how does Apple's GCC store DWARF inside an executable?

I compiled a binary via gcc -gdwarf-2 (Apple's GCC). However, neither objdump -g nor objdump -h shows me any debug information.

Also, libbfd does not find any debug information. (I asked on the binutils mailing list about it here.)

I am able, however, to extract the debugging information via dsymutil (into a dSYM). libbfd is then also able to read that debug info.

Albert

4 Answers

60

On Mac OS X there was a decision to have the linker ld not process all of the debug information when you link your program. The debug information is often 10x the size of the program executable, so having the linker process all of the debug info and include it in the executable binary was a serious detriment to link times. For iterative development - compile, link, compile, link, debug, compile, link - this was a real hit.

Instead:

  • The compiler generates the DWARF debug information in the .s files, and the assembler outputs it in the .o files.
  • The linker includes a "debug map" in the executable binary which tells debug info users where all of the symbols were relocated during the link.

A consumer (doing .o-file debugging) loads the debug map from the executable and processes all of the DWARF in the .o files as-needed, remapping the symbols as per the debug map's instructions.
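
As a rough illustration of that remapping step, here is a minimal Python sketch. It assumes each debug map entry boils down to a symbol name, its address in the .o file, its address in the linked binary, and a size. The `DebugMapEntry` type and `remap` function are made-up names for this example, not part of any Apple tool, and the sample values are borrowed from the dsymutil -dump-debug-map listing in another answer on this page.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DebugMapEntry:
    """One symbol from the debug map (hypothetical type for this sketch)."""
    sym: str
    obj_addr: int  # address of the symbol inside the .o file
    bin_addr: int  # address of the same symbol in the linked executable
    size: int      # size of the symbol in bytes


def remap(entries: Iterable[DebugMapEntry], obj_address: int) -> Optional[int]:
    """Translate an address from a .o file into the linked binary.

    Returns None if no entry covers the address. Zero-sized entries
    never match in this simplified version.
    """
    for e in entries:
        if e.obj_addr <= obj_address < e.obj_addr + e.size:
            return e.bin_addr + (obj_address - e.obj_addr)
    return None


entries = [
    DebugMapEntry("_main", 0x0, 0x100000F40, 0x90),
    DebugMapEntry("__ZNSt3__111char_traitsIcE6lengthEPKc", 0x660, 0x1000015A0, 0x20),
]

# An address 0x10 bytes into _main in the .o file, as seen in the executable.
print(hex(remap(entries, 0x10)))
```

A real consumer has to rewrite every address inside the DWARF (line tables, ranges, location lists), but the per-address arithmetic is presumably no more than this.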

dsymutil can be thought of as a debug info linker. It does this same process -- read the debug map, load the DWARF from the .o files, relocate all the addresses -- and then outputs a single binary of all the DWARF at their final, linked addresses. This is the dSYM bundle.

Once you have a dSYM bundle, you've got plain old standard DWARF that any DWARF-reading tool (which can deal with Mach-O binary files) can process.

There is an additional refinement that makes all of this work: the UUIDs included in Mach-O binaries. Every time the linker creates a binary, it emits a 128-bit UUID in the LC_UUID load command (see otool -hlv or dwarfdump --uuid). This uniquely identifies that binary file. When dsymutil creates the dSYM, it includes that UUID. The debuggers will only associate a dSYM and an executable if they have matching UUIDs -- no dodgy file mod timestamps or anything like that.
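
To make the LC_UUID part concrete, here is a small Python sketch that pulls the UUID out of a Mach-O image by walking the load commands. It is deliberately limited: it assumes a thin (non-fat), little-endian, 64-bit file, and the function name `macho_uuid` is invented for this example. The constants are the ones from `<mach-o/loader.h>`.

```python
import struct
import uuid
from typing import Optional

MH_MAGIC_64 = 0xFEEDFACF  # magic number of a 64-bit Mach-O header
LC_UUID = 0x1B            # load command that carries the 128-bit UUID
HEADER_SIZE_64 = 32       # sizeof(struct mach_header_64): eight uint32 fields


def macho_uuid(data: bytes) -> Optional[uuid.UUID]:
    """Return the LC_UUID of a thin little-endian 64-bit Mach-O image, or None."""
    if len(data) < HEADER_SIZE_64:
        return None
    # magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved
    fields = struct.unpack_from("<8I", data, 0)
    magic, ncmds = fields[0], fields[4]
    if magic != MH_MAGIC_64:
        return None

    offset = HEADER_SIZE_64
    for _ in range(ncmds):
        if offset + 8 > len(data):
            return None
        # Every load command starts with (cmd, cmdsize).
        cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        if cmd == LC_UUID and offset + 24 <= len(data):
            # struct uuid_command: cmd, cmdsize, then 16 raw UUID bytes.
            return uuid.UUID(bytes=data[offset + 8:offset + 24])
        if cmdsize < 8:
            return None  # malformed command; stop instead of looping forever
        offset += cmdsize
    return None
```

Calling `macho_uuid(open(path, "rb").read())` on both the executable and the DWARF file inside the dSYM bundle should give the same value if the two belong together.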

We can also use the UUIDs to locate the dSYMs for binaries. They show up in crash reports, and we include a Spotlight importer that you can use to search for them, e.g. mdfind "com_apple_xcode_dsym_uuids == E21A4165-29D5-35DC-D08D-368476F85EE1" if the dSYM is located in a Spotlight-indexed location. You can even have a repository of dSYMs for your company and a program that can retrieve the correct dSYM given a UUID - maybe a little mysql database or something like that - so you run the debugger on a random executable and you instantly have all the debug info for that executable. There are some pretty neat things you can do with the UUIDs.
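
The "little database" idea could look something like the following Python sketch. It swaps in SQLite (via the standard `sqlite3` module) for mysql purely to keep the example self-contained. The table layout, function names, and the `/srv/dsyms/foo.dSYM` path are all invented for illustration.

```python
import sqlite3
import uuid
from typing import Optional


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a UUID -> dSYM path index. Schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS dsyms (uuid TEXT PRIMARY KEY, path TEXT NOT NULL)"
    )
    return db


def _normalize(binary_uuid: str) -> str:
    # Accept upper/lower case, with or without hyphens; store one canonical form.
    return str(uuid.UUID(binary_uuid)).upper()


def register(db: sqlite3.Connection, binary_uuid: str, dsym_path: str) -> None:
    """Record where the dSYM for a given binary UUID lives."""
    db.execute(
        "INSERT OR REPLACE INTO dsyms (uuid, path) VALUES (?, ?)",
        (_normalize(binary_uuid), dsym_path),
    )
    db.commit()


def lookup(db: sqlite3.Connection, binary_uuid: str) -> Optional[str]:
    """Return the dSYM path registered for this UUID, or None."""
    row = db.execute(
        "SELECT path FROM dsyms WHERE uuid = ?", (_normalize(binary_uuid),)
    ).fetchone()
    return row[0] if row else None


db = open_index()
register(db, "E21A4165-29D5-35DC-D08D-368476F85EE1", "/srv/dsyms/foo.dSYM")
print(lookup(db, "e21a4165-29d5-35dc-d08d-368476f85ee1"))
```

A debugger-side helper would read the UUID from the executable (or from a crash report), call `lookup`, and fetch the bundle from wherever the path points.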

But anyway, to answer your original question: The unstripped binary has the debug map, the .o files have the DWARF, and when dsymutil is run these are combined to create the dSYM bundle.

If you want to see the debug map entries, do nm -pa executable and they're all there. They are in the form of the old stabs nlist records - the linker already knew how to process stabs so it was easiest to use them - but you'll see how it works without much trouble, maybe refer to some stabs documentation if you're uncertain.
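
If you want to pick the .o file references out of that listing programmatically, something like this Python sketch may help. It assumes the stabs lines printed by nm -pa have the shape "value - sect desc TYPE name" and that, for OSO entries, the value field holds the modification time of the .o file; both are assumptions about the output format rather than documented guarantees. The sample text is illustrative, not captured from a real run.

```python
import re
from typing import List, Tuple

# value, '-', section (2 hex digits), desc (4 hex digits), stab type, optional name
STAB_LINE = re.compile(
    r"^([0-9a-fA-F]+) - ([0-9a-fA-F]{2}) ([0-9a-fA-F]{4})\s+(\S+)\s?(.*)$"
)


def object_files(nm_output: str) -> List[Tuple[str, int]]:
    """Return (path, timestamp) for every OSO stab in `nm -pa` output."""
    result = []
    for line in nm_output.splitlines():
        m = STAB_LINE.match(line)
        if m and m.group(4) == "OSO":
            # Assumption: the value field of an OSO entry is the .o file's mtime.
            result.append((m.group(5), int(m.group(1), 16)))
    return result


SAMPLE = """\
0000000000000000 - 00 0000    SO /Users/tim/foo/
0000000000000000 - 00 0000    SO main.cpp
000000005891d0ed - 03 0001   OSO /Users/tim/foo/build/CMakeFiles/main.dir/main.cpp.o
0000000100000f40 - 01 0000 BNSYM
0000000100000f40 - 01 0000   FUN _main
0000000100000f40 T _main
"""

for path, mtime in object_files(SAMPLE):
    print(path, mtime)
```

Ordinary symbol lines (like the last one in the sample) do not have the "-" marker, so the pattern skips them.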

mfaani
Jason Molenda
  • 1
    How does `dsymutil` know where the .o files are? I see no option in the manpage to tell it. Also do I need to compile the binary `-g3`, and if so, can I strip it after I've `dsymutil`'d it? Thanks. – mxcl Jun 25 '13 at 21:03
  • 4
    There are "debug map" entries in the executable before it is stripped with the filenames of the .o files. `nm -pa binary | grep OSO` will list them. They are in the form of the old stabs debug format (because the linker already knew how to handle that format). After you've created your dSYM, you can strip them out of the executable. You shouldn't need to use `-g3` on the Mac platform, `-g` should be sufficient. I think `-g3` outputs preprocessor macro information but lldb doesn't read that on Mac OS X (and I don't know if clang even outputs it.) – Jason Molenda Jun 25 '13 at 21:21
  • 2
    @JasonMolenda What a great answer, thanks! Besides yours and [this](http://wiki.dwarfstd.org/index.php?title=Apple's_%22Lazy%22_DWARF_Scheme) anecdotal write-ups I can’t seem to find any formal ones, do you know if there are any? – alloy Jan 15 '15 at 13:34
  • 1
    Thanks. Heh, I wrote that other anecdotal writeup you linked to, too. :) It wasn't ever officially documented in any form because the audience for it is so small. All of the relevant pieces are open source (compiler, linker, debugger) so of course the source code is all available. The DWARF standards committee is working on various accelerated table and dwarf-in-.o-file proposals. When that all settles & is standardized, I think we'll see if we can transition the tools towards using that. But that's nothing that'll be happening in the near-term. – Jason Molenda Jan 16 '15 at 09:48
  • Well that explains a lot :D And you’re right of course, the audience is very small. – alloy Jan 16 '15 at 14:31
  • Since you know a lot about this, [is there a way to get the "slow" behaviour and have the linker embed all the debug info into the executable?](https://stackoverflow.com/questions/49083800/clang-link-debug-info-in-executable-on-osx). – Timmmm Mar 04 '18 at 12:33
2

It seems it actually doesn't.

I traced dsymutil and it reads all the *.o files. objdump -h also lists all the debug info in them.

So it seems that this info isn't copied over to the binary.


Some related comments about this can also be found here.

Albert
2

It seems there are two ways for OSX to place debug information:

  1. In the .o object files used for compilation. The binary stores a reference to these files (by absolute path).

  2. In a separate bundle (directory) called .dSYM

If I compile with Apple's Clang using g++ -g main.cpp -o foo I get the bundle called foo.dSYM. However if I use CMake I get the debug info in the object files. I guess because it does a separate gcc -c main.cpp -o main.o step?

Anyway I found this command very useful for case 1:

$ dsymutil -dump-debug-map main
---
triple:          'x86_64-apple-darwin'
binary-path:     main
objects:         
  - filename:        /Users/tim/foo/build/CMakeFiles/main.dir/main.cpp.o
    timestamp:       1485951213
    symbols:         
      - { sym: __ZNSt3__111char_traitsIcE11eq_int_typeEii, objAddr: 0x0000000000000D50, binAddr: 0x0000000100001C90, size: 0x00000020 }
      - { sym: __ZNSt3__111char_traitsIcE6lengthEPKc, objAddr: 0x0000000000000660, binAddr: 0x00000001000015A0, size: 0x00000020 }
      - { sym: GCC_except_table3, objAddr: 0x0000000000000DBC, binAddr: 0x0000000100001E2C, size: 0x00000000 }
      - { sym: _main, objAddr: 0x0000000000000000, binAddr: 0x0000000100000F40, size: 0x00000090 }
      - { sym: __ZNSt3__124__put_character_sequenceIcNS_11char_traitsIcEEEERNS_13basic_ostreamIT_T0_EES7_PKS4_m, objAddr: 0x00000000000001F0, binAddr: 0x0000000100001130, size: 0x00000470 }
      - { sym: ___clang_call_terminate, objAddr: 0x0000000000000D40, binAddr: 0x0000000100001C80, size: 0x00000010 }
      - { sym: GCC_except_table5, objAddr: 0x0000000000000E6C, binAddr: 0x0000000100001EDC, size: 0x00000000 }
      - { sym: __ZNSt3__116__pad_and_outputIcNS_11char_traitsIcEEEENS_19ostreambuf_iteratorIT_T0_EES6_PKS4_S8_S8_RNS_8ios_baseES4_, objAddr: 0x0000000000000680, binAddr: 0x00000001000015C0, size: 0x000006C0 }
      - { sym: __ZNSt3__14endlIcNS_11char_traitsIcEEEERNS_13basic_ostreamIT_T0_EES7_, objAddr: 0x00000000000000E0, binAddr: 0x0000000100001020, size: 0x00000110 }
      - { sym: GCC_except_table2, objAddr: 0x0000000000000D7C, binAddr: 0x0000000100001DEC, size: 0x00000000 }
      - { sym: __ZNSt3__1lsINS_11char_traitsIcEEEERNS_13basic_ostreamIcT_EES6_PKc, objAddr: 0x0000000000000090, binAddr: 0x0000000100000FD0, size: 0x00000050 }
      - { sym: __ZNSt3__111char_traitsIcE3eofEv, objAddr: 0x0000000000000D70, binAddr: 0x0000000100001CB0, size: 0x0000000B }
...
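
In case anyone wants to post-process that output, here is a minimal Python sketch that parses the symbol lines with a regular expression instead of a YAML library. The `Symbol` type and `parse_symbols` function are names made up for this example, and the pattern only assumes the line shape visible in the dump above.

```python
import re
from typing import List, NamedTuple


class Symbol(NamedTuple):
    sym: str
    obj_addr: int  # address in the .o file
    bin_addr: int  # address in the linked binary
    size: int


SYMBOL_LINE = re.compile(
    r"-\s*\{\s*sym:\s*([^\s,]+),\s*objAddr:\s*(0x[0-9A-Fa-f]+),"
    r"\s*binAddr:\s*(0x[0-9A-Fa-f]+),\s*size:\s*(0x[0-9A-Fa-f]+)\s*\}"
)


def parse_symbols(dump: str) -> List[Symbol]:
    """Extract the symbol entries from `dsymutil -dump-debug-map` text."""
    symbols = []
    for line in dump.splitlines():
        m = SYMBOL_LINE.search(line)
        if m:
            symbols.append(
                Symbol(m.group(1), int(m.group(2), 16), int(m.group(3), 16), int(m.group(4), 16))
            )
    return symbols


# Two lines copied from the dump above.
DUMP = """\
      - { sym: ___clang_call_terminate, objAddr: 0x0000000000000D40, binAddr: 0x0000000100001C80, size: 0x00000010 }
      - { sym: _main, objAddr: 0x0000000000000000, binAddr: 0x0000000100000F40, size: 0x00000090 }
"""

# List symbols in the order they ended up in the executable.
for s in sorted(parse_symbols(DUMP), key=lambda s: s.bin_addr):
    print(f"{s.sym}: {s.obj_addr:#x} -> {s.bin_addr:#x} ({s.size} bytes)")
```
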
Timmmm
-1

Apple stores debugging information in separate files named *.dSYM. You can run dwarfdump on those files and see the DWARF Debug Info Entries.

Bogatyr
  • 1
    No. You can *create* the dSYM via `dsymutil`. But my question was, where is the debug information? I.e. where does `dsymutil` get it from? But I already have the answer (see my own answer). It isn't actually in the binary; the binary references the `*.o` files and that is also where `dsymutil` is getting the data from. – Albert Apr 14 '12 at 22:45
  • Well your question was vague. I interpreted it to mean where is the debug information stored when used with an executable in the debugger. – Bogatyr Apr 15 '12 at 12:14
  • Yea, that was my question. And the answer is, it is stored in the `*.o` files. And when you create a dSYM, then of course you have another copy in the dSYM. But before creating the dSYM, there is no dSYM. That is what I said in my question. ("I am able however to extract the debugging information via `dsymutil`.") You don't automatically get the dSYM. You have to call `dsymutil`. (I think, when you use Xcode, Xcode does that automatically.) Or am I wrong there? – Albert Apr 15 '12 at 18:06