Actually, gprof can do that. The issue it's encountering is that addresses in the DLL are different from the ones that are recorded in the gmon.out file.
On an executable, the (virtual) address is fixed, but on DLLs it is not. Don't ask me if it's because of ASLR or something else, but it complicates post-mortem debugging a lot.
On top of that, the gmon.out file format isn't documented, or rather there is a documented format but it doesn't match what we were getting.
But we kind of figured it out...
There's a header, then a lot of zeroes, then data at the end. I don't know what all of the data means, but what I figured out is enough to convert the gmon.out file into a usable one.
First, you have to print the address of your DLL entrypoint symbol when starting the program, and compare it to the static value given by nm. Let's say your entrypoint is _entry. In your program (for example in C) just do:
printf("entry: %p\n",&entry);
Then use nm on the DLL (which must have symbols) to get the static value. On Windows:
nm mydll.dll | find "_entry"
Let's say you get 0x1F000000 for the static value and 0x6F000000 for the run-time (printed) value. Then you have a 0x50000000 offset that you must apply (subtract) to your gmon.out binary file.
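The arithmetic above can be sketched in Python. This is only an illustration: the helper names `static_address` and `load_offset` are made up here, and the sample nm line assumes the default `nm` output layout (hex value, symbol type, symbol name).

```python
def static_address(nm_line):
    """Parse one line of default nm output: '<hex value> <type> <name>'."""
    value, _type, _name = nm_line.split()
    return int(value, 16)

def load_offset(printed_address, nm_line):
    """Difference between the run-time (printed) address and the static one."""
    # int(..., 16) accepts the value with or without a leading "0x"
    return int(printed_address, 16) - static_address(nm_line)

# Values taken from the example in the text
offset = load_offset("6F000000", "1f000000 T _entry")
print(hex(offset))  # prints 0x50000000
```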
So basically the format is pretty simple:
- The first two 32-bit words are the little-endian start & stop addresses.
- The following word is the little-endian offset of the profiling data within this very file.
- At that offset, you'll find chunks of 12 bytes: once again two 32-bit words for start & end, then 4 bytes of data.
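The layout described in this list can be read with Python's `struct` module. The sketch below runs on a small synthetic buffer rather than a real gmon.out; the function name `parse_gmon` and the sample values are invented for illustration, and the layout is only the one reverse-engineered above, not an official specification.

```python
import struct

RECORD_SIZE = 12  # two 32-bit addresses followed by 4 bytes of data

def parse_gmon(contents):
    """Decode the header and the 12-byte records as described above."""
    start, end, data_offset = struct.unpack("<III", contents[:12])
    body = contents[data_offset:]
    count = len(body) // RECORD_SIZE  # any trailing partial record is ignored
    records = [struct.unpack_from("<III", body, i * RECORD_SIZE)
               for i in range(count)]
    return start, end, data_offset, records

# Synthetic file: 12-byte header, zero padding up to offset 32, two records
header = struct.pack("<III", 0x1000, 0x2000, 32)
padding = bytes(32 - len(header))
data = (struct.pack("<III", 0x1000, 0x1010, 7)
        + struct.pack("<III", 0x1010, 0x1020, 3))
sample = header + padding + data

print(parse_gmon(sample))
```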
In the following real-life example I have highlighted the first 3 longwords:

- The first longword is the start address (0x10121450)
- The second longword is the end address (0x1357E590)
- The third longword is the profiling data offset (0x1A2E8C0)
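To see what these three longwords look like as raw bytes in a hex editor, the snippet below packs the values quoted above as little-endian 32-bit words. It only illustrates the byte order; it does not reproduce the actual file.

```python
import struct

# The three header values quoted above, packed as little-endian 32-bit words
header = struct.pack("<III", 0x10121450, 0x1357E590, 0x1A2E8C0)
dump = " ".join(f"{b:02x}" for b in header)
print(dump)  # prints 50 14 12 10 90 e5 57 13 c0 e8 a2 01
```

Each group of four bytes is the corresponding address with its bytes reversed, which is what to look for at the very start of the file.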
Now the start of the data. Note that the offset matches (before the data offset there are only zeroes, probably room for some more data).

Now we have to apply the offset computed/printed when running the code so the addresses from gmon.out match the addresses from the DLL.
How to do that? It's pretty easy with a Python script. The concept: the aim is to add the address shift to the header addresses and to all the addresses of the chunks, leaving the rest of the data unchanged.
The script (everything is hardcoded, but you get the idea):
import struct

with open("gmon.out", "rb") as f:  # the file produced by the run
    contents = f.read()

# header: start address, end address, offset of the profiling data
start_address, end_address, data_offset = struct.unpack("<III", contents[:12])
profile_data = contents[data_offset:]
nb_records = len(profile_data) // 12  # integer division, works on Python 2 and 3
records = []
for i in range(0, nb_records):
    offset = i * 12
    extract = profile_data[offset:offset + 12]
    s, e, data = struct.unpack("<III", extract)
    records.append((s, e, data))

# let's say 0x65000000 is the address that the program printed
# and 0x10120000 is the address that "nm" reports
shift = 0x65000000 - 0x10120000

with open("gmon2.out", "wb") as f:  # the file that will work with that run
    f.write(struct.pack("<II", start_address + shift, end_address + shift))
    f.write(contents[8:data_offset])  # data offset word and zero padding, unchanged
    for s, e, d in records:
        f.write(struct.pack("<III", s + shift, e + shift, d))
Now run:
gprof mydll.dll gmon2.out
Doing that allows gprof to decode the gmon file against the DLL, since the addresses are now corrected to match the static addresses contained in the DLL.
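For reuse, the same rewrite can be wrapped in a function that takes the shift as a parameter instead of hardcoding it. This is a sketch under the same assumptions about the file layout. The name `shift_gmon` is invented here, and the 32-bit mask is an addition that is not in the script above. The sign of `shift` is left to the caller, and bytes after the last complete 12-byte record are kept as-is rather than dropped.

```python
import struct

MASK = 0xFFFFFFFF  # keep results within 32 bits (an addition, not in the script above)

def shift_gmon(contents, shift):
    """Return a copy of `contents` with `shift` added to every address."""
    start, end, data_offset = struct.unpack_from("<III", contents, 0)
    out = bytearray(contents)
    # header addresses; the data offset word at bytes 8..11 is left untouched
    struct.pack_into("<II", out, 0, (start + shift) & MASK, (end + shift) & MASK)
    pos = data_offset
    while pos + 12 <= len(out):
        s, e, d = struct.unpack_from("<III", out, pos)
        struct.pack_into("<III", out, pos, (s + shift) & MASK, (e + shift) & MASK, d)
        pos += 12
    return bytes(out)

# Synthetic 40-byte file: header, 4 zero bytes, then two records at offset 16
sample = (struct.pack("<III", 0x1000, 0x2000, 16) + bytes(4)
          + struct.pack("<III", 0x1000, 0x1010, 7)
          + struct.pack("<III", 0x1010, 0x1020, 3))
shifted = shift_gmon(sample, 0x50000000)
print([hex(v) for v in struct.unpack_from("<III", shifted, 0)])
```

Applying the opposite shift to the result should give back the original bytes, which makes for a cheap sanity check before handing the file to gprof.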