I have a collection of ELF binaries. I am trying to retrieve the source code and compare it with the assembly so that I can better understand the compilation process.
So far I can disassemble an ELF binary based on the great solution here, which uses the following command
objdump -dj .text <binary>
and outputs something that looks like
00000000004035ed <version_etc>:
4035ed: 48 81 ec d8 00 00 00 sub $0xd8,%rsp
4035f4: 4c 89 44 24 40 mov %r8,0x40(%rsp)
4035f9: 4c 89 4c 24 48 mov %r9,0x48(%rsp)
4035fe: 84 c0 test %al,%al
403600: 74 37 je 403639 <version_etc+0x4c>
403602: 0f 29 44 24 50 movaps %xmm0,0x50(%rsp)
403607: 0f 29 4c 24 60 movaps %xmm1,0x60(%rsp)
40360c: 0f 29 54 24 70 movaps %xmm2,0x70(%rsp)
403611: 0f 29 9c 24 80 00 00 movaps %xmm3,0x80(%rsp)
...
So I will look up the version_etc.c file in the source tree and compare it with this assembly snippet.
I know that the first two columns (address and raw instruction bytes) correspond to the last two columns (mnemonic and operands). However, I am not interested in the first two columns.
I am wondering whether there is a tool that could extract the last two columns as a string and pair it with the header (version_etc in this example).
I know I could simply write a script that uses regular expressions or similar to do this, but that would be error-prone (corner cases, etc.). It would be great if there were a more principled way to do this extraction.
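
For reference, here is a minimal sketch of the kind of script I mean. It assumes GNU objdump, whose raw output separates the address, the byte dump and the instruction with tab characters (the tabs were lost when I pasted the listing above), so it splits on tabs instead of matching each column with a regular expression. The function name group_instructions is just a placeholder, and I have not tested it against other disassemblers or unusual symbol names.

```python
import re

# Matches a symbol header such as "00000000004035ed <version_etc>:".
HEADER = re.compile(r"^[0-9a-fA-F]+ <(?P<name>[^>]+)>:\s*$")


def group_instructions(objdump_text):
    """Map each symbol name to the list of its instruction strings.

    Assumes GNU objdump's tab-separated layout:
    "<address>:\t<raw bytes>\t<mnemonic and operands>".
    """
    functions = {}
    current = None
    for line in objdump_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group("name")
            functions[current] = []
            continue
        fields = line.split("\t")
        # Lines with fewer than three fields are blank lines, section
        # banners, or continuation lines that only hold overflow bytes.
        if current is None or len(fields) < 3:
            continue
        # Collapse the padding between mnemonic and operands to one space.
        functions[current].append(" ".join(" ".join(fields[2:]).split()))
    return functions
```

The text could come from something like `subprocess.run(["objdump", "-dj", ".text", path], capture_output=True, text=True).stdout`. This still depends on the textual layout of objdump, which is exactly the fragility I would like to avoid.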