
I ran into a problem when re-assembling a disassembly.
I want the byte pattern of the re-assembled binary to be identical to that of the original binary.
However, it isn't. Here is an example.

(I know the encoding problem regarding displacements can be handled as described here.
But this problem is about the immediate value, not the displacement.)

1. The disassembly of binary_A was:

83 2f 55                subl   $0x55,(%edi)
81 2f 55 00 00 00       subl   $0x55,(%edi)



2. I parsed the output of step 1 and created the file A.s:

subl   $0x55,(%edi)
subl   $0x55,(%edi)



3. I re-assembled A.s and disassembled the result:

83 2f 55                subl   $0x55,(%edi)
83 2f 55                subl   $0x55,(%edi)


As you can see, the byte patterns of the re-assembled binary and the original binary are different!


Actually, in my opinion, the two disassembled lines in step 1 shouldn't read the same,
because the immediate value (0x55) is encoded differently in each: 55 00 00 00 (32-bit immediate after opcode 81) versus 55 (8-bit immediate after opcode 83).
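
To make the difference concrete, here is a minimal Python sketch of the two encodings. It is not part of my toolchain, and the function name `encode_subl_imm_edi` is made up for illustration. It assumes 32-bit mode and only covers `subl $imm,(%edi)`, using the `83 /5 ib` and `81 /5 id` forms of SUB:

```python
def encode_subl_imm_edi(imm: int, force_imm32: bool = False) -> bytes:
    """Encode `subl $imm,(%edi)` for 32-bit mode (hypothetical helper)."""
    # ModRM: mod=00 (memory, no displacement), reg=/5 (SUB), rm=111 (EDI) -> 0x2f
    modrm = (0b00 << 6) | (5 << 3) | 0b111
    if -128 <= imm <= 127 and not force_imm32:
        # Short form: opcode 83 with a sign-extended 8-bit immediate
        return bytes([0x83, modrm]) + (imm & 0xFF).to_bytes(1, "little")
    # Long form: opcode 81 with a full 32-bit little-endian immediate
    return bytes([0x81, modrm]) + (imm & 0xFFFFFFFF).to_bytes(4, "little")


print(encode_subl_imm_edi(0x55).hex(" "))                    # 83 2f 55
print(encode_subl_imm_edi(0x55, force_imm32=True).hex(" "))  # 81 2f 55 00 00 00
```

Both byte sequences decode to the same instruction text, which is why the information about the immediate width is lost once only the mnemonic is written to A.s.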

How can I force the assembler to emit exactly the machine code that I want?
Specifically, I want subl $0x55,(%edi) to be assembled as 81 2f 55 00 00 00.

Is there any way to do this?

Jiwon
  • 1
    The trivial way would be something like `db 81h, 2fh, 55h, 0, 0, 0`, but probably it's not what you want. – Matteo Italia Jul 29 '18 at 18:10
  • 2
    Besides Matteo's correct response, about the only way to do it is to use a different assembler that does support this feature (NASM will) – Michael Petch Jul 29 '18 at 18:44
  • 1
    @MatteoItalia: In GAS, it would be `.byte 0x81, 0x2f, ...`. `db` is NASM / MASM. But yeah MichaelPetch is right: GAS doesn't have control over the width of immediates. With up-to-date binutils, you could use `{disp8}` / `{disp32}` prefixes ([What methods can be used to efficiently extend instruction length on modern x86?](https://stackoverflow.com/q/48046814)) to make the instruction almost the same length by using a disp32, but that's 4 bytes longer rather than 3 because the original had no displacement. – Peter Cordes Jul 29 '18 at 19:05
  • 2
    AFAIK, none of the mainstream x86 assemblers are designed to give total control over encoding choices. **Normally people don't try to disassemble / reassemble the whole binary, just modify some bytes in an existing binary and maybe add a new section.** Maybe try Agner Fog's `objconv` disassembler (http://agner.org/optimize/) which can put labels on branch targets, so reassembling with different instruction lengths might not break everything. But instruction-lengths aren't the only problem; section/segment ordering, alignment, and attributes (read-only / read-write) could be an issue. – Peter Cordes Jul 29 '18 at 19:08
  • 1
    Forgot to mention, if you don't restrict this to GAS, then it's yet another duplicate of [What methods can be used to efficiently extend instruction length on modern x86?](https://stackoverflow.com/q/48046814). In NASM syntax, `sub [edi], strict dword 0x55`, which is easier than what @MatteoItalia suggested if you're going to use NASM syntax. :P – Peter Cordes Jul 29 '18 at 22:31
  • Oh, `NASM` supports that syntax :D Your answer was very helpful to me. Thanks. @PeterCordes @MichaelPetch – Jiwon Aug 03 '18 at 13:39

0 Answers