I need to write some (safe to use) random junk assembly code for some obfuscation exercise on a C++ project. So far I could only find this:

#define JUNKCODE        \
__asm{push eax}            \
__asm{xor eax, eax}        \
__asm{setpo al}            \
__asm{push edx}            \
__asm{xor edx, eax}        \
__asm{sal edx, 2}        \
__asm{xchg eax, edx}    \
__asm{pop edx}            \
__asm{or eax, ecx}        \
__asm{pop eax}

This doesn't compile in Xcode for a 64-bit project.

How can I create multiple versions of junk assembly code that are safe to add to my project, work in both 32-bit and 64-bit builds, and are cross-platform (Windows VS2013 and Mac Xcode 8)?

  • 64-bit VC++ doesn't support inline assembly at all. For those that do, the syntax is not portable. – Bo Persson Dec 18 '17 at 02:16
  • oh, I didn't know that. So what would be the best approach even using checks for each platform? – Santorini Dec 18 '17 at 02:36
  • Obfuscating C++ ... by adding junk code... sounds quite ridiculous in itself. Optimized machine code is already cryptic enough to be quite a challenge for RE. The junk code parts (especially if there are only a few variants of them and they are not randomly generated) may in the end actually give a seasoned reverse engineer a hint about where the points of importance are, since they are marked by the junk blocks. It would make *some* sense to obfuscate the real code, but that's very tricky; usually 2-3 layers of interpretation or a custom language work nicely without sacrificing source maintainability completely. – Ped7g Dec 18 '17 at 04:47

1 Answer

To write asm that will assemble in either mode, use 32-bit operand size and avoid push/pop: they're not available with 32-bit operand size in x86-64. Only the full register width for the mode, and 16-bit, are available for push/pop. (I.e. the 0x66 operand-size prefix works, but push defaults to 64-bit operand size in 64-bit mode and REX.W=0 doesn't change that, so push eax is not encodable in 64-bit mode.)

Also avoid inc/dec if you want the same machine code to work in both modes: the short one-byte encodings used in 32-bit mode (0x40 to 0x4F) are REX prefixes in 64-bit mode. inc/dec are fine if you only need asm source that assembles in both modes.
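To make the two encoding pitfalls above concrete, here is a small sketch that describes how a single opcode byte in the range 0x40 to 0x57 is interpreted in each mode. The helper name `describe_byte` is hypothetical and only covers this narrow range; it is an illustration, not a disassembler.

```cpp
#include <string>

// Hypothetical helper: describe one opcode byte in the range 0x40..0x57,
// depending on whether the CPU is running in 64-bit mode.
std::string describe_byte(unsigned char b, bool mode64) {
    static const char* const reg32[8] = {"eax", "ecx", "edx", "ebx",
                                         "esp", "ebp", "esi", "edi"};
    static const char* const reg64[8] = {"rax", "rcx", "rdx", "rbx",
                                         "rsp", "rbp", "rsi", "rdi"};
    const unsigned r = b & 7u;  // low 3 bits select the register

    if (b >= 0x40 && b <= 0x4F) {
        // One-byte inc/dec in 32-bit mode; reused as REX prefixes in 64-bit mode.
        if (mode64) return "REX prefix";
        return std::string(b < 0x48 ? "inc " : "dec ") + reg32[r];
    }
    if (b >= 0x50 && b <= 0x57) {
        // Same byte, but the pushed register is always full width for the mode.
        return std::string("push ") + (mode64 ? reg64[r] : reg32[r]);
    }
    return "outside the range this sketch covers";
}
```

For example, `describe_byte(0x40, false)` gives "inc eax" while `describe_byte(0x40, true)` gives "REX prefix", and byte 0x50 is "push eax" or "push rax" depending on the mode.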


For inline asm in C++:

Avoid modifying any memory or making any function calls. It's hard to do this safely, so if you just want junk code you can just play with registers. You can't even safely push in x86-64 code for the System V ABI, because the compiler might be using the red zone below the stack pointer, and there's no way to tell it you want to clobber it. I don't know if this applies to Clang with MSVC inline-asm syntax, as opposed to GNU syntax, where clobbers have to be explicitly declared.

Prefer GNU C inline asm syntax (with proper constraints to tell the compiler what registers you clobber).
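As a rough sketch of what that could look like, the macro below ports the register-only part of the question's snippet to GNU extended asm (AT&T syntax). The push/pop pairs are dropped and replaced by a clobber list, so the compiler itself saves eax and edx if it needs them. The names `JUNKCODE_GNU` and `add_with_junk` are made up for illustration, and the macro assumes the default AT&T dialect (no `-masm=intel`); on non-x86 targets or other compilers it expands to nothing.

```cpp
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
// Register-only junk: 32-bit operand size, no push/pop, no memory access.
// The clobber list tells the compiler that eax, edx and the flags change.
#define JUNKCODE_GNU()                      \
    __asm__ __volatile__(                   \
        "xorl  %%eax, %%eax\n\t"            \
        "setpo %%al\n\t"                    \
        "xorl  %%eax, %%edx\n\t"            \
        "shll  $2, %%edx\n\t"               \
        "xchgl %%eax, %%edx\n\t"            \
        "orl   %%ecx, %%eax\n\t"            \
        : /* no outputs */                  \
        : /* no inputs */                   \
        : "eax", "edx", "cc")
#else
// Fallback for other compilers or architectures: expand to a no-op.
#define JUNKCODE_GNU() ((void)0)
#endif

// Hypothetical function showing the macro in use; the result is unaffected.
int add_with_junk(int a, int b) {
    JUNKCODE_GNU();
    int sum = a + b;
    JUNKCODE_GNU();
    return sum;
}
```

Because the same instructions assemble in both 32-bit and 64-bit mode, this one macro should cover both Xcode targets; `add_with_junk(2, 3)` still returns 5.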

MSVC itself doesn't support any inline asm syntax in 64-bit mode.

Xcode (clang) does support MSVC's syntax as well as GNU syntax, and it might support MSVC syntax even in 64-bit mode.

MSVC-style inline asm syntax basically sucks for wrapping short snippets (inputs have to bounce through memory because there is no syntax to ask for inputs in registers), and reportedly has been unreliable from version to version in MS's compiler. MS were probably looking for a reason or a way to drop it, and not supporting it for x86-64 makes sense. It would suck even more for accessing function args from inline asm if they left it unchanged, because they don't start in memory. Anyway, MSVC inline asm syntax is half dead, good riddance.

If you really want to have some inline asm in 32-bit MSVC, then sure, use that syntax. __asm _emit might do something in 64-bit MSVC (I think the IACA headers use that), so you could maybe encode some NOPs manually if you just want your code to look weird.
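One way to handle the per-platform checks is a macro that picks a mechanism for each compiler. The sketch below emits a single-byte NOP (0x90) and makes several assumptions: it uses MSVC's `_emit` pseudo-instruction only for 32-bit MSVC, substitutes the `__nop()` intrinsic from `<intrin.h>` on 64-bit MSVC (where `__asm` blocks are not accepted), and uses a `.byte` directive for GCC and clang. `JUNK_NOP` and `triple_with_nops` are hypothetical names.

```cpp
#if defined(_MSC_VER) && !defined(__clang__) && defined(_M_IX86)
    // 32-bit MSVC: _emit places one raw byte in the instruction stream.
    #define JUNK_NOP() __asm { _emit 0x90 }
#elif defined(_MSC_VER) && !defined(__clang__) && defined(_M_X64)
    // 64-bit MSVC: no inline asm, so fall back to the NOP intrinsic.
    #include <intrin.h>
    #define JUNK_NOP() __nop()
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__i386__) || defined(__x86_64__))
    // GCC / clang on x86: emit the byte directly with an assembler directive.
    #define JUNK_NOP() __asm__ __volatile__(".byte 0x90")
#else
    // Anything else: expand to a no-op so the code still builds.
    #define JUNK_NOP() ((void)0)
#endif

// Hypothetical function showing the macro in use; NOPs don't change the result.
int triple_with_nops(int x) {
    JUNK_NOP();
    int r = x * 3;
    JUNK_NOP();
    JUNK_NOP();
    return r;
}
```

A NOP touches no registers, flags, or memory, so no clobbers are needed and the red-zone concern doesn't arise.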

Peter Cordes
  • Inline asm (and `naked` functions) are basically also incompatible with the way Windows handles unwinding 64-bit functions. They have this specific function [prologue](https://learn.microsoft.com/en-us/cpp/build/prolog-and-epilog) and epilog format (the latter of which includes the concept of [emulating](https://stackoverflow.com/q/45336343/149138) the epilog as part of unwinding!) that they use to allow unwinding: more restrictions on the assembly in exchange for a simpler unwinding strategy (compared to, e.g., `eh_frame` or DWARF). So many types of inline assembly are problematic. – BeeOnRope Dec 19 '17 at 03:01
  • @BeeOnRope: Why does that constrain inline asm, though? `naked` sure, but inline asm doesn't stop the compiler from emitting whatever prologue/epilogue. – Peter Cordes Dec 19 '17 at 03:15
  • Partly because some of the things you might want to do in inline asm wouldn't be allowed anymore (e.g., manipulating `rsp`, manipulating the base pointer, which is now arbitrary rather than fixed as `ebp`) - at least without a bunch of work to interpret the inline asm and make the unwind data compatible. Partly because one of the main uses of inline asm was inside `naked` functions to do special handling of the prolog or epilog and that is no longer possible (i.e., no `naked` functions make inline asm less useful). Perhaps basically incompatible is too strong: could be "tougher to make work". – BeeOnRope Dec 19 '17 at 03:24
  • To clarify the base pointer idea: manipulating the base pointer isn't a common thing to want to do in asm, but what I mean is that in 64-bit Windows binaries _any_ pointer can be the base pointer and it _must not_ be modified (even if saved around the inline asm) since code couldn't be unwound reliably if the frame pointer was changed even temporarily. – BeeOnRope Dec 19 '17 at 03:29
  • @BeeOnRope: Oh, good point that one of the major uses for inline asm is exactly the kind of hacky stuff like "context switching" out from under the compiler which is unsafe in that ABI. I was mainly thinking of using it for performance reasons (which MSVC inline asm is bad for unless you write a whole loop). Interesting stuff about base/frame pointers. But if it's arbitrary, a compiler that didn't suck could choose a reg that you don't use as the frame pointer, or error. This might limit inlining if inlining two asm blocks that use different registers would leave no possible base pointer. – Peter Cordes Dec 19 '17 at 03:53
  • Exactly, but that means adding extra code to examine the asm (including understanding all the instructions that modify registers implicitly[1]) and making sure their code generation works with it, i.e., the register can't be chosen until it is clear the entire function doesn't contain any inline asm (which means after all inlining, etc.). Of course it's possible to support, but it's nowhere close to "plug and play" (unlike, say, gcc asm, which treats it as a black box), so they chose to drop it. [1] Although I guess they have that for 32-bit code if they already selectively save regs around asm. – BeeOnRope Dec 19 '17 at 03:56