
I'm trying to create a dispatch table that relocates some instructions to another address, which is allocated by AllocateMemoryOnRemoteProcess.

One of the problems I encountered is that almost all calls and all kinds of jumps are near and relative, so once I load the code at a new location, these instructions no longer work.

As far as I know, I should convert these instructions to far jumps or far calls. One of the solutions I found while googling was using push and ret, like:

    push 0xdeadbeef
    ret

Others suggest using a register for absolute addressing, like:

    mov $0xdeadbeef,%eax
    jmp *%eax

These solutions won't work in my case because I'm inside a function routine, where changing the stack state or, in the second case, changing a register like %eax causes a failure.

Someone in this question wrote :

  • call far (with opcode 9A) jumps to an absolute segment and offset. ie, it's like setting CS and ?IP at once.

So it seems I should use the 9A opcode for far calls, but this only works for calls, and I have no idea how to convert all kinds of jumps with this method!

I regularly use objdump to disassemble a binary, then use clang as the assembler with the following command:

    clang -c MyAsm.asm -m32

But when I assemble with the above command, the result is relative.

For example, when MyAsm.asm is:

    call   0x402af2

the result of objdump is:

    MyAsm.o:    file format Mach-O 32-bit i386

    Disassembly of section __TEXT,__text:
    __text:
           0:   e8 ed 2a 40 00  calll   4205293 <__text+0x402AF2>

This encoding is relative.
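To make the "relative" part concrete, here is a minimal Python sketch that decodes the five bytes from the listing, taking them at face value and ignoring any relocation entry in the object file. The helper name `decode_call_rel32` and the second load address `0x2B1000` are illustrative assumptions, not something from the toolchain.

```python
import struct

def decode_call_rel32(code: bytes, load_addr: int, bits: int = 32) -> int:
    """Return the absolute target of an E8 rel32 call placed at load_addr."""
    if len(code) != 5 or code[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 rel32 call")
    # The operand is a signed little-endian 32-bit displacement.
    (rel32,) = struct.unpack("<i", code[1:5])
    # The displacement is counted from the address of the next instruction.
    return (load_addr + 5 + rel32) % (1 << bits)

code = bytes.fromhex("e8ed2a4000")             # bytes from the objdump listing
print(hex(decode_call_rel32(code, 0x0)))       # 0x402af2
print(hex(decode_call_rel32(code, 0x2B1000)))  # 0x6b3af2 (hypothetical new location)
```

The same bytes reach `0x402af2` only when the instruction sits at address 0; copied to another address, they point somewhere else, which is exactly the failure described above.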

So my questions are :

  1. How can I assemble far calls or far jumps (j* instructions) with clang or any other tool (which, of course, works for both 80x86 and AMD64 architectures)?
  2. Are there any other instructions, besides calls and jumps, that use relative addressing and that I should therefore reassemble to avoid the problem?
Mohammad Sina Karvandi
  • I'm almost 100% sure you *don't* want a FAR call/jump. IIRC in AT&T syntax far calls/jumps are `lcall`/`ljmp`, in Intel syntax just add the `FAR` qualifier. – Margaret Bloom Jan 29 '18 at 10:07
  • If I understand correctly, your problem has nothing to do with far calls; you don't need to change to another segment or do anything involving the GDT. You just want to use an absolute address instead of an address relative to the current eip. But how can this be a problem? Do you want to modify it after linking? – llllllllll Jan 29 '18 at 10:11
  • @liliscent yes, as I described I wanna change another process memory so I want to modify after linking because J* and Calls are not in the previous locations. – Mohammad Sina Karvandi Jan 29 '18 at 10:13
  • And `push $deadbeef` `ret` won't cause failure, they are almost literally equivalent to a jump. – llllllllll Jan 29 '18 at 10:14
  • @liliscent Yeah you're right but in a call routine I can't change the stack state and another problem with this method is ret is not equivalent to any conditional jumps ! – Mohammad Sina Karvandi Jan 29 '18 at 10:18
  • I think this approach can be very ugly and dangerous, if you want to dynamically load some routine, consider `dlopen`. – llllllllll Jan 29 '18 at 10:18
  • @MargaretBloom What exactly I should change ?! I use lcall but it causes the following error in clang : MyAsm.asm:1:1: error: unknown use of instruction mnemonic without a size suffix lcall 0x402af2 – Mohammad Sina Karvandi Jan 29 '18 at 10:19
  • Apparently, that's the syntax for indirect calls. [Take a look at this page](http://csiflabs.cs.ucdavis.edu/~ssdavis/50/att-syntax.htm), maybe it can help. I'm not well versed in AT&T syntax, need a handful of minutes to set up a test. – Margaret Bloom Jan 29 '18 at 10:24
  • The `lcall` worked for me, but only if the operands were immediate (e.g. `lcall $0x08, $0x12345678`) and `--32` was passed to GAS. – Margaret Bloom Jan 29 '18 at 10:47
  • I advise you to use `movabs $addr,%rax ; jmp *%rax` as all other tricks have a significant speed penalty. `push ... ; ret` trashes the return predictor and `lcall` is a slow micro-coded instruction which is additionally not available in long mode. – fuz Jan 29 '18 at 10:58
  • @MargaretBloom yeah , your code assembles without error but I just have an absolute address, what is $0x08 ?! – Mohammad Sina Karvandi Jan 29 '18 at 11:02
  • @ᔕIᑎᗩKᗩᖇᐯᗩᑎᗪI I guess it's the default `%cs` selector value for long mode processes on some operating system – not something you should rely on staying the same. – fuz Jan 29 '18 at 11:03
  • @MargaretBloom Still error :( . MyAsm.asm:2:1: error: unknown use of instruction mnemonic without a size suffix lcall %cs, $0x2B1004 – Mohammad Sina Karvandi Jan 29 '18 at 11:06
  • You don't need a far call. Not sure what you want to do, but far calls are unusable in protected/long mode within one of the mainstream OS. If you want to code an absolute address, in the worst, emit the opcodes manually (the `db` directive in NASM). – Margaret Bloom Jan 29 '18 at 11:32
  • Why can't you allocate memory within 2GB of the code you want to modify? Then you can just update the relative displacements in the existing instructions instead of changing them to longer absolute indirect jumps. – Peter Cordes Jan 29 '18 at 12:42
  • @PeterCordes actually I can't predict the return address of AllocateMemoryOnRemoteProcess ! – Mohammad Sina Karvandi Jan 29 '18 at 13:02
  • You don't need to predict it; run it and then use the return value to calculate relative offsets. But I think you mean you can't *control* it or hint it to get it near the target code. (Like you can with `mmap(suggested_target, ...)` without MAP_FIXED). That's unfortunate. – Peter Cordes Jan 29 '18 at 13:05

1 Answer


If you can spare a register, I advise you to use

    movabs $addr,%rax
    jmp *%rax

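or, in 64-bit mode, if you can ensure that the address is within the first 4 GB of address space (writing %eax zero-extends into %rax),

    mov $addr,%eax
    jmp *%rax

(In 32-bit mode, write jmp *%eax instead; any address is reachable there.)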
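If you are poking bytes into another process rather than running an assembler, the register-based sequence can be emitted by hand. The following is a minimal sketch for 64-bit mode; the helper name `movabs_jmp_rax` is hypothetical, and the bytes are the standard encodings of `movabs $imm64,%rax` (`48 B8 imm64`) and `jmp *%rax` (`FF E0`).

```python
import struct

def movabs_jmp_rax(addr: int) -> bytes:
    """Machine code for: movabs $addr,%rax ; jmp *%rax (64-bit mode only)."""
    if not 0 <= addr < (1 << 64):
        raise ValueError("address does not fit in 64 bits")
    movabs = b"\x48\xB8" + struct.pack("<Q", addr)  # REX.W + B8, 64-bit immediate
    jmp_rax = b"\xFF\xE0"                           # FF /4 with ModRM selecting rax
    return movabs + jmp_rax

stub = movabs_jmp_rax(0xDEADBEEF)
print(stub.hex(" "))  # 48 b8 ef be ad de 00 00 00 00 ff e0
```

The stub is 12 bytes, position-independent, and clobbers %rax, so it only fits where that register is dead.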

I strongly advise you against using

    push $addr
    ret

as this trashes the return prediction, making the next few function returns slower than necessary. Far jumps and calls (ljmp and lcall) are a red herring. While they could technically be used, they won't help you achieve your goal and are actually meant for a different purpose (changing cs) and are implemented as slow, micro-coded instructions on modern processors.

If you cannot spare a register, you can use this sort of trick instead:

    jmp *0f(%rip)
    ...
0:  .quad addr

The `.quad addr` line can be anywhere in the program within reach of the rip-relative displacement (±2 GiB) and should be in the data segment for ideal performance. However, if needed, it can also be right after the jump instruction.

This should just work and in addition doesn't require you to use an extra register. It is slower than using a register though.
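For the self-contained variant, where the `.quad` sits right after the jump, the bytes can be sketched as below. The helper name `jmp_abs_rip` is an assumption for illustration; the encoding `FF 25 disp32` is `jmp *disp32(%rip)` in 64-bit mode, and the displacement is 0 because rip already points just past the 6-byte instruction, which is where the pointer is stored.

```python
import struct

def jmp_abs_rip(addr: int) -> bytes:
    """Machine code for: jmp *0f(%rip) ; 0: .quad addr (64-bit mode only)."""
    if not 0 <= addr < (1 << 64):
        raise ValueError("address does not fit in 64 bits")
    jmp = b"\xFF\x25" + struct.pack("<i", 0)  # disp32 = 0: pointer follows directly
    pointer = struct.pack("<Q", addr)         # the .quad addr
    return jmp + pointer

stub = jmp_abs_rip(0xDEADBEEF)
print(len(stub), stub.hex(" "))
# 14 ff 25 00 00 00 00 ef be ad de 00 00 00 00 00 00
```

This costs 14 bytes and touches neither the stack nor a register.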

Note that conditional jumps exist only with an immediate (relative) target; there is no indirect form. If you want to do a conditional jump to an absolute address, use an idiom like this:

    # for jz addr
    jnz 1f
    jmp *0f(%rip)
0:  .quad addr
1:  ...
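The jump-around idiom can be generated mechanically for any condition. The sketch below assumes 64-bit mode and a hypothetical helper name `jcc_abs`. It relies on the fact that the short Jcc opcodes 0x70 to 0x7F come in pairs whose conditions are inverted by flipping the lowest bit (for example 0x74 is jz and 0x75 is jnz).

```python
import struct

def jcc_abs(short_jcc_opcode: int, addr: int) -> bytes:
    """Emit an inverted short Jcc that skips over an absolute indirect jmp."""
    if not 0x70 <= short_jcc_opcode <= 0x7F:
        raise ValueError("expected a short Jcc opcode (0x70..0x7F)")
    if not 0 <= addr < (1 << 64):
        raise ValueError("address does not fit in 64 bits")
    # jmp *0(%rip) followed directly by the 8-byte target pointer (14 bytes).
    body = b"\xFF\x25" + struct.pack("<i", 0) + struct.pack("<Q", addr)
    inverted = short_jcc_opcode ^ 1  # jz <-> jnz, jl <-> jge, ...
    # rel8 = len(body) makes the inverted jump land just past the pointer.
    return bytes([inverted, len(body)]) + body

stub = jcc_abs(0x74, 0xDEADBEEF)  # "jz 0xdeadbeef"
print(stub.hex(" "))
# 75 0e ff 25 00 00 00 00 ef be ad de 00 00 00 00
```

The result is 16 bytes, so it does not fit in place of the original 2-byte or 6-byte conditional jump; the surrounding code has to be laid out with that in mind.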

Special considerations for 16 and 32 bit mode

Note that in 16 and 32 bit mode, there is no rip-relative addressing mode. So you'll have to use an absolute address and write

    jmp *0f
0:  .long addr

instead. However, that kind of defeats the purpose: if you could use an absolute addressing mode to reach 0f, you could also just use a relative addressing mode to reach addr. So it seems like you'll have to resort to a push + ret sequence, even if it is slow.

In 16 bit modes, most likely using a far jump is fine. If not, the push + ret sequence is idiomatic (processors of that vintage did not have return prediction).
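For completeness, the 32-bit push + ret fallback is easy to emit by hand. This is a minimal sketch with a hypothetical helper name `push_ret_32`; it uses the encodings `68 imm32` for `push $imm32` and `C3` for `ret`. It is intended for 32-bit code only, since in 64-bit mode the same push sign-extends its immediate.

```python
import struct

def push_ret_32(addr: int) -> bytes:
    """Machine code for: push $addr ; ret (32-bit mode)."""
    if not 0 <= addr < (1 << 32):
        raise ValueError("address does not fit in 32 bits")
    return b"\x68" + struct.pack("<I", addr) + b"\xC3"

stub = push_ret_32(0xDEADBEEF)
print(stub.hex(" "))  # 68 ef be ad de c3
```

The sequence is 6 bytes, position-independent, and leaves the registers and the final stack pointer untouched, at the cost of the return-prediction penalty described above.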

fuz
  • Your 4th assembly snippet (without a spare register) works, but when I replace jmp with a conditional jump (e.g. je), it gives me the following error: MyAsm.asm:1:8: error: invalid operand for instruction je *0xffff(%eip) – Mohammad Sina Karvandi Jan 29 '18 at 11:42
  • Does the 4th assembly snippet support conditional jumps?! – Mohammad Sina Karvandi Jan 29 '18 at 11:42
  • @ᔕIᑎᗩKᗩᖇᐯᗩᑎᗪI No it doesn't. You can work around this using a “jump around” idiom. Let me update my answer to explain this. – fuz Jan 29 '18 at 11:43
  • This Q&A came up again recently, as [How Do I Convert This ATT assembly to Intel Syntax? Jump to a non-relative address without using registers](https://stackoverflow.com/q/72135694) asked about wrong code from a blog written by the OP of this question. If you don't need this to be "self-contained", normally you'd want to put the `.quad addr` with other constant data, somewhere within +-2GiB of the instruction referencing it, usually not right after. (That will take space in the L1d cache for this line, as well as L1i, and iTLB + dTLB.) – Peter Cordes May 08 '22 at 10:13
  • In 32-bit mode you don't need this trick in the first place for the OP's use case; you can reach anywhere from anywhere, so you just need to properly encode `jmp target`, i.e. with a rel32 = `target - (jmp_addr+5)`. They do know the addresses inside the destination process; the problem was maybe being farther than +-2GiB away in 64-bit mode. (BTW, in x86-64 SysV at least, r11 is a pure scratch register, never used for passing or returning anything, so trampolines and wrapper functions can use it without disturbing AL = number of XMM args for a variadic function or anything else.) – Peter Cordes May 08 '22 at 11:59
  • @PeterCordes Not if you have PIC and want to call into an absolute address. But then, the `jmp *0f` trick doesn't work either hm... – fuz May 08 '22 at 12:25
  • Yeah, exactly. They say they're using `AllocateMemoryOnRemoteProcess`. IDK when they're actually assembling these snippets or where they're copying them, but the smart thing would be to manually encode a `jmp` instruction with the correct `rel32` for the known source and target addresses inside the other process. That means you can't just poke(?) the bytes for a fixed recipe you got from an assembler to jump to an absolute address from anywhere, but there is no such recipe for 32-bit mode except using the stack or a register. And certainly most efficient is a `jmp rel32`. – Peter Cordes May 08 '22 at 13:00
  • BTW, we have a Q&A about that already: [Call an absolute pointer in x86 machine code](https://stackoverflow.com/q/19552158) with suggestions for JITs to try to get memory allocated near enough for rel32 encodings to work, and details on how to do it in assembly or in machine code. GAS and NASM do assemble `call 0x12345678` to a call to that absolute address, with a relocation to make it work for wherever it ends up linked. – Peter Cordes May 08 '22 at 13:02
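The `rel32 = target - (jmp_addr+5)` approach from the comments can be sketched as follows, assuming the patching tool knows both addresses inside the destination process. The helper name `encode_jmp_rel32` is hypothetical. In 32-bit mode the displacement wraps around, so every target is reachable; in 64-bit mode the helper returns `None` when the target is out of range and another sequence is needed.

```python
import struct
from typing import Optional

def encode_jmp_rel32(jmp_addr: int, target: int, bits: int = 64) -> Optional[bytes]:
    """Encode E9 rel32 for a jmp located at jmp_addr, or None if out of range."""
    rel = target - (jmp_addr + 5)  # counted from the end of the 5-byte jmp
    if bits == 32:
        # Addresses wrap at 4 GiB, so fold the displacement into signed 32 bits.
        rel = (rel + 2**31) % 2**32 - 2**31
    elif not -2**31 <= rel < 2**31:
        return None  # farther than +-2 GiB away: rel32 cannot express this
    return b"\xE9" + struct.pack("<i", rel)

print(encode_jmp_rel32(0x1000, 0x2000).hex(" "))          # e9 fb 0f 00 00
print(encode_jmp_rel32(0x1000, 0x7FFF00000000))           # None
print(encode_jmp_rel32(0xF0000000, 0x10000000, 32).hex(" "))  # e9 fb ff ff 1f
```

The same arithmetic applies to `call` with opcode `E8` in place of `E9`.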