I have an .obj and I disassemble it (I do not have the original source file).

I modify the resulting assembly file by inserting my own assembly at certain instructions of interest, taking care to push/pop the registers I use to/from the stack so I do not clobber the original contents.

Why? Maybe I want to toggle a pin whenever a certain assembly instruction is executed (in real time, i.e. without a debugger/JTAG).

Then I want to assemble it back into an .obj, but since I've inserted my own assembly, the relative addresses for branches are now incorrect.

QUESTION

Is there an ARM tool that will auto-correct the relative addresses or do I have to do it manually as I insert my assembly?

This is for an ARM Cortex-M4, but I don't think that should matter.

Bob
  • use labels and let the assembler solve it, even if that means taking `0x1000: add r1,r1,r2`, changing it to `lab0x1000: add r1,r1,r2`, and using `lab0x1000` in whatever was branching there. – old_timer Nov 09 '17 at 22:10
  • @old_timer will that work even if the `.obj` is not compiled in debug mode? – Bob Nov 09 '17 at 22:11
  • if you write your own disassembler (more painful if thumb2 was used in this binary) you can then disassemble it with the labels. Or you can go into binutils and modify binutils so that when it outputs any pc relative address you output a label instead. – old_timer Nov 09 '17 at 22:11
  • @old_timer by `modify binutils`, you mean "modify its source code"? – Bob Nov 09 '17 at 22:12
  • yes of course, how else would you change the output of the disassembler? – old_timer Nov 10 '17 at 00:54
  • the title says decompiled; you mean disassembled, yes? – old_timer Nov 10 '17 at 00:55
  • Use a disassembler that inserts labels for branch targets. For x86, Agner Fog's `objconv` disassembler does this ([sample output](https://stackoverflow.com/a/33978857/224132)). Look for something similar for ARM, so you get position-independent asm output that can be re-assembled and linked. Hmm, it's not necessarily just branch instructions, BTW. Any code with static addresses as immediates will be affected. So will PC-relative loads if you insert code between them and their literal pool. – Peter Cordes Nov 10 '17 at 01:04
  • @PeterCordes put that as the answer; it does the hard part of my task (fixing branches) – Bob Nov 13 '17 at 21:50

1 Answer

Use a disassembler that inserts labels for branch targets. For x86, Agner Fog's `objconv` disassembler does this ([sample output](https://stackoverflow.com/a/33978857/224132)).

Look for something similar for ARM, so you get asm output that can be re-assembled and linked, with addressing re-calculated. (IDK if any exist, but maybe you can hack that into one of the existing disassemblers.)
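As a rough illustration of what "hacking that in" could look like, here is a minimal Python sketch that post-processes a disassembly listing instead of modifying the disassembler. It assumes tab-separated `objdump -d`-style lines (`addr:`, encoding, mnemonic, operands) where a branch target is printed as `8 <foo+0x8>`; the names `relabel`, `label`, and the `lab_` prefix are made up for this example, and only direct branches are handled.

```python
import re

# Assumed operand shape for a direct branch: "8 <foo+0x8>" or "r0, 8 <foo+0x8>" (cbz/cbnz).
TARGET_RE = re.compile(r"(?:^|,\s*)([0-9a-f]+) <[^>]*>$")


def label(addr):
    """Hypothetical label naming scheme: lab_<hex address>."""
    return f"lab_{addr:x}"


def relabel(listing):
    """Replace absolute branch targets with labels so the text can be re-assembled."""
    rows, targets = [], set()
    for line in listing.splitlines():
        parts = line.split("\t")
        # Skip symbol headers, blank lines, and anything without "addr:\tencoding\tmnemonic".
        if len(parts) < 3 or not parts[0].strip().endswith(":"):
            continue
        addr = int(parts[0].strip().rstrip(":"), 16)
        mnemonic = parts[2].strip()
        operands = parts[3].strip() if len(parts) > 3 else ""
        m = TARGET_RE.search(operands)
        if m:
            target = int(m.group(1), 16)
            targets.add(target)
            operands = operands[: m.start(1)] + label(target)
        rows.append((addr, mnemonic, operands))

    out = []
    for addr, mnemonic, operands in rows:
        if addr in targets:
            out.append(f"{label(addr)}:")  # define the label where the branch lands
        out.append(f"    {mnemonic} {operands}".rstrip())
    return "\n".join(out)


# Illustrative listing only, not taken from a real object file.
SAMPLE = (
    "00000000 <foo>:\n"
    "   0:\tb510      \tpush\t{r4, lr}\n"
    "   2:\t2800      \tcmp\tr0, #0\n"
    "   4:\td000      \tbeq.n\t8 <foo+0x8>\n"
    "   6:\t3001      \tadds\tr0, #1\n"
    "   8:\tbd10      \tpop\t{r4, pc}\n"
)

if __name__ == "__main__":
    print(relabel(SAMPLE))
```

With the branch written as `beq.n lab_8` and `lab_8:` placed in front of the `pop`, new instructions can be inserted anywhere and the assembler recomputes the offset. The sketch keeps width suffixes such as `.n` unchanged, so a branch pushed out of its short range by inserted code would probably need to be widened by hand.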

But beware that it's not just branch instructions that can be affected by inserting new code. Any code with static addresses as immediates will be affected by changes to data layout. On ARM, constants are often loaded with PC-relative loads from nearby literal pools, so those need to use labels as well.
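The literal-pool case can be sketched the same way. This Python snippet assumes the listing prints a PC-relative load as `ldr r3, [pc, #4] ; (8 <foo+0x8>)` and the pool entry as a `.word` line; the function name `relabel_literals` and the `pool_` prefix are invented for the example. It rewrites the load to reference a label and emits that label in front of the pool word.

```python
import re

# Assumed rendering of a PC-relative load's operands: "r3, [pc, #4] ; (8 <foo+0x8>)".
PC_LOAD_RE = re.compile(
    r"^(\w+),\s*\[pc(?:,\s*#-?\d+)?\]\s*;\s*\(([0-9a-f]+) <[^>]*>\)$"
)


def pool_label(addr):
    """Hypothetical label naming scheme for literal-pool entries."""
    return f"pool_{addr:x}"


def relabel_literals(listing):
    """Rewrite 'ldr rX, [pc, #imm]' as 'ldr rX, pool_<addr>' and label the pool word."""
    rows, pool_addrs = [], set()
    for line in listing.splitlines():
        parts = line.split("\t")
        if len(parts) < 3 or not parts[0].strip().endswith(":"):
            continue  # not an instruction/data line
        addr = int(parts[0].strip().rstrip(":"), 16)
        mnemonic = parts[2].strip()
        # The "; (8 <foo+0x8>)" annotation arrives as a separate tab field.
        operands = " ".join(p.strip() for p in parts[3:])
        m = PC_LOAD_RE.match(operands)
        if mnemonic.startswith("ldr") and m:
            target = int(m.group(2), 16)
            pool_addrs.add(target)
            operands = f"{m.group(1)}, {pool_label(target)}"
        rows.append((addr, mnemonic, operands))

    out = []
    for addr, mnemonic, operands in rows:
        if addr in pool_addrs:
            out.append("    .balign 4")  # keep the literal word-aligned after edits
            out.append(f"{pool_label(addr)}:")
        out.append(f"    {mnemonic} {operands}".rstrip())
    return "\n".join(out)


# Illustrative listing only, not taken from a real object file.
SAMPLE = (
    "   0:\t4b01      \tldr\tr3, [pc, #4]\t; (8 <foo+0x8>)\n"
    "   2:\t6818      \tldr\tr0, [r3, #0]\n"
    "   4:\t4770      \tbx\tlr\n"
    "   6:\tbf00      \tnop\n"
    "   8:\t20000000 \t.word\t0x20000000\n"
)

if __name__ == "__main__":
    print(relabel_literals(SAMPLE))
```

Note that this only fixes how the load finds its literal. If the `.word` value is itself a static address (as `0x20000000` might be here), it still points at the old layout and would need a symbol of its own.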

Peter Cordes
  • `static addresses as immediates`: which kinds of instructions would those be? – Bob Nov 13 '17 at 22:16
  • @Adrian: `ldi`, so in disassembly possibly `mov reg, #0x1234`. But in ARM, it's common to load constants from nearby literal pools (e.g. using PC-relative loads to get the static addresses) – Peter Cordes Nov 13 '17 at 22:23
  • [I see what you mean](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473f/Bgbccbdi.html); this is getting complicated. This is also very relevant because the functions I need to `instrument` will contain very large 32-bit ints (the ***secret*** magic numbers for the seed-key authentication). – Bob Nov 13 '17 at 22:35
  • In the process of disassembling, is it possible to remove the usage of literal pools i.e. transform the assembly? – Bob Nov 13 '17 at 22:36
  • @Adrian: that would be a separate thing that you could do. No plain disassembler is going to do it, because then it wouldn't be showing you the instructions represented by the machine code anymore. IMO just use labels to address the literal pools so you can add code between them and the code that uses them. – Peter Cordes Nov 13 '17 at 22:52