Lately, I've been writing some x86 assembly injections for the purpose of a game mod, but since most of my workflow has involved writing and assembling custom routines by hand, I've been looking to move towards a more robust solution.

Microsoft's inline assembler seems like a nice choice, but I've run into what looks like a limitation.

Whenever I write an instruction that involves an immediate memory address (the game uses a fixed address space layout), the assembler silently converts it to an immediate value instead.

For example, in MASM:

mov ecx, 0xCCCCCCCC    => B9 CC CC CC CC
mov ecx, [0xCCCCCCCC]  => B9 CC CC CC CC*

* Should be: 8B 0D CC CC CC CC

In this case, both assembled instructions load ecx with the immediate value 0xCCCCCCCC, although the second one should fetch the value stored at the absolute address 0xCCCCCCCC.

Note that it's possible to use a named variable in this manner:

mov ecx, [myInt]

Which will assemble to an 8B 0D memory fetch instruction, but it also adds the operand to the module's relocation table and doesn't allow the specification of arbitrary addresses.

Trying to trick the assembler with something like

mov ecx, [myInt-myInt+0xCCCCCCCC]

Also results in the address being treated as an immediate value.

It is possible to go with:

mov ecx, 0xCCCCCCCC
mov ecx, [ecx]

Which will assemble properly and exhibit the correct behavior, but the pair takes 7 bytes (5 for the immediate load plus 2 for mov ecx, [ecx]) where the single memory-operand form takes 6. Because I'm working under some rather tight spatial constraints, this isn't acceptable and I'd rather not use a code cave where I don't have to.

The funny thing is that something in C like:

register int x;
x = *(int*)(0xCCCCCCCC);

Happily compiles to

mov ecx, [0xCCCCCCCC]  => 8B 0D CC CC CC CC

It's a little odd to see a lower level language have more limitations placed on it than a higher level language. What I'm trying to do seems pretty reasonable to me, so does anyone know if MASM has some hidden way of using fixed immediate memory addresses when reading from memory?

Michael Petch
Jason Lim
  • MASM doesn't do arithmetic between labels? Is that only inside an effective address, or does it also apply in an expression used as an immediate? I could imagine it not working if the labels were `extern`, so the difference was only available as link time, not assemble time. You should be able to use local labels, or any existing label (e.g. the name of the current function). – Peter Cordes Mar 10 '16 at 02:18
  • Sorry, that was my mistake. I'd been doing a bit of testing back and forth between MASM and VS's inline assembler. The inline assembler's more limited so it doesn't support label arithmetic, but MASM itself is fine. However, the effective address still ends up being treated as an immediate value. – Jason Lim Mar 10 '16 at 06:37
  • I think Ross Ridge gives a pretty good answer to a similar MASM question in this SO answer: http://stackoverflow.com/a/25130189/3857942 – Michael Petch Mar 10 '16 at 08:13
  • Possible duplicate of [Confusing brackets in MASM32](http://stackoverflow.com/questions/25129743/confusing-brackets-in-masm32) – Michael Petch Mar 10 '16 at 08:17
  • @MichaelPetch: Yes, that is very helpful. Thanks for the redirection. – Jason Lim Mar 10 '16 at 18:35

2 Answers

I'm not sure if this works outside of the flat memory model, but I found that MASM distinguishes immediate addresses and immediate values as follows:

mov ecx, 0xCCCCCCCC      => B9 CC CC CC CC
mov ecx, [0xCCCCCCCC]    => B9 CC CC CC CC
mov ecx, ds:[0xCCCCCCCC] => 8B 0D CC CC CC CC

The first 2 instructions load ecx with the immediate value 0xCCCCCCCC. The last instruction loads ecx with the value at address 0xCCCCCCCC.

Jason Lim

In NASM syntax,

    mov     ecx, 0xCCCCCCCC                  ; B9 CC CC CC CC     mov ecx, imm32
    mov     ecx, [0xCCCCCCCC]                ; 8b 0d cc cc cc cc  mov r32, [disp32]
    mov     ecx, [_start-_start + 0xCCCCCCCC]; 8b 0d cc cc cc cc  same

Tested with nasm -felf and yasm -felf, on my GNU/Linux desktop.

I wonder if it's a bug that MASM assembles [0xCCCCCCCC] to an immediate instead of an effective address. Does it do the same when it's an operand to other instructions? e.g. is it an error with LEA?

Peter Cordes
  • Thanks for the response. The same behavior is exhibited by any instruction that can read/write to an immediate memory location (cmp, inc, push, etc). For LEA, trying to assemble something like lea ecx, [0xCCCCCCCC] is an error (improper operand type). But including the segment prefix will result in the proper behavior. It's hard to tell if this was a bug or intended, but I'm leaning towards it being a bug. – Jason Lim Mar 10 '16 at 06:17