
I have MSVC .asm code

_get_addr:
    call    _lb
_lb:
    pop     rax
    ret

with its header

PVOID _get_addr();

I would like to port it to GCC.
Its replacement in GCC should be something like this, right?

static __inline__ __attribute__((always_inline)) PVOID _get_addr() {
  PVOID pointer;
  __asm__(
        "call _lb"
        "_lb: pop rax; ret %[pointer]"
        : [pointer] "=r"(pointer)
  );

  return pointer;
}

EDIT:
FYI, it is not my code, so I'm not exactly sure what it's meant to do. I only have the asm implementation and its header, so my inlining was probably not appropriate.

I just assumed that _get_addr() should return the address of its caller. In that case inlining makes sense, right? But you are probably right and I shouldn't change that.

EDIT 2:
It is used in a custom EDK2 component:

PVOID GetImageAddress(void)
{
    PVOID Addr = _get_addr();
    PVOID Base = (PVOID)((UINT64)Addr & ~(DEFAULT_EDK_ALIGN - 1));

    // get current module base by address inside of it
    while (*(PUSHORT)Base != EFI_IMAGE_DOS_SIGNATURE)
    {
        Base = (PVOID)((PUCHAR)Base - DEFAULT_EDK_ALIGN);
    }

    return Base;
}
l00k
  • In *x64* you can even use `lea rax,@@0 ; @@0: ret`. Also, I think *gcc* has [`__builtin_return_address`](https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html), the equivalent of [`_ReturnAddress`](https://learn.microsoft.com/en-us/cpp/intrinsics/returnaddress?view=msvc-160). – RbMm Sep 05 '21 at 21:24
  • I'm not quite clear on what this code is intended to do. It appears that you're trying to get the actual address of _lb? I suppose this approach would deal with relocations at load time. If that's the case, it's not clear what `always_inline` is doing there. Calls to _get_addr would return different values depending on where you call it from. I'm also not clear what you think `ret %[pointer]` does? Does that even assemble? How about something like `asm("call _lb%=\n" "_lb%=:\n\tpop %%rax" : "=a"(pointer));`? Using __builtin_return_address would be better if you can. – David Wohlferd Sep 05 '21 at 21:30
  • @DavidWohlferd: `__builtin_return_address` would give you the caller's address, so it only works if this is a non-inline function. But the OP wants something that can inline. Taking the address of a goto label seems better, or if you use asm, then RIP-relative LEA. – Peter Cordes Sep 05 '21 at 21:38
  • @PeterCordes - "OP wants something that can inline" - Does he 'want' something he can inline? Or did he use inline because he thought he needed it? Doesn't seem like the original msvc code would inline. IAC, it's hard to come up with an optimal solution without a better understanding of OP's purpose. Thus my statement re __builtin_return_address as being better "if you can." It's something OP could look at in the context of his requirements. – David Wohlferd Sep 05 '21 at 22:00
  • @DavidWohlferd: ok, fair point, without knowing how this is used, hard to say what the best option is. But `__builtin_return_address` in a function that isn't explicitly `__attribute__((noinline))` could give you an address in some parent function up the call chain, one step *past* as far as the compiler can see and choose to inline. So definitely something to be aware of if you go that route. https://godbolt.org/z/dxf7vT9bs shows how it inlines: if `foo` was called from a library callback function pointer, you'd be getting an address in the library, not in this executable / library. – Peter Cordes Sep 06 '21 at 00:02
  • @PeterCordes Fair point. I just worry that if it's inlined, that allows for the possibility that calls to get_addr from disparate points in the code could return a different answer (depending on the optimizer). I don't know if that's a good thing for OP or not, but I expect it's different than the previous behavior. Since I can't imagine a practical use for this code, I can't offer any solutions better than what you already posted. – David Wohlferd Sep 06 '21 at 01:46
  • @DavidWohlferd: Interesting point, yes, the original stand-alone asm function would return the same address to every caller. As would my answer if you made the function wrapped around the asm statement `__attribute__((noinline))` instead of always_inline. With `__builtin_return_address(0)`, you'd need two levels of functions, both of them noinline, to achieve that, because it's always going to be loading some actual return address (and then return, not call/pop), and not using a RIP-relative LEA. – Peter Cordes Sep 06 '21 at 04:59
  • "should return the address of its caller" - But that's not what the masm code does. It's returning the address of `_lb`, which just seems odd. How is the return value used? – David Wohlferd Sep 06 '21 at 05:56
  • If `_get_addr` is defined in the same library or executable as `GetImageAddress`, then it should be fine for it to inline and return an address inside `GetImageAddress`, assuming `DEFAULT_EDK_ALIGN` is large-ish so rounding down to an alignment boundary gives you the same address either way. You're definitely throwing away all the low bits of the address, so position of the address within any given function is irrelevant. – Peter Cordes Sep 06 '21 at 06:05

1 Answer


That MASM code is dumb: 64-bit mode doesn't need call to read its own address.
You should replace it with a RIP-relative LEA.
It's also not safe to push/pop in inline asm for x86-64 (unless you offset RSP by 128 first to skip over the red-zone); that would step on the red-zone below RSP in the x86-64 System V ABI, and there's no way to declare a clobber on that. Avoiding call avoids pushing anything.

Also since you don't want to make this a function, you can let the compiler choose the register. You're already using "=r", so use %0 in the template string to expand to the compiler's choice of register. Hard-coding RAX only happens to work if the compiler picks RAX for the "=r" output, otherwise disaster.

  void *current_RIP;
  // assuming the default -masm=att, not -masm=intel
  asm( "lea 0(%%rip), %0" : "=r"(current_RIP) );  // no inputs, no clobbers

(Your GNU C inline asm attempt was broken in other ways, too, e.g. ret %rax isn't a valid instruction. It makes no sense to put an operand on ret. And since this is inline asm, and you already popped the return address, you shouldn't use a ret instruction at all. The MASM code needed it because it's a function that returns to its caller, instead of falling out the end of an inline asm statement.)
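Wrapped up as a function, the LEA version could look something like the sketch below. It assumes an x86-64 target, GCC or clang, and the default AT&T syntax; the name `current_rip` is made up here for illustration and is not part of the original code.

```c
/* Sketch only: assumes x86-64 and GNU C extended asm (AT&T syntax).
   current_rip is a hypothetical name, not from the original code. */
static inline __attribute__((always_inline)) void *current_rip(void)
{
    void *rip;
    /* lea 0(%rip) produces the address of the instruction after the LEA.
       Nothing is pushed or popped, so the red zone is left alone, and
       "=r" lets the compiler pick the destination register. */
    __asm__ ("lea 0(%%rip), %0" : "=r"(rip));
    return rip;
}
```

A call such as `void *here = current_rip();` then yields an address somewhere inside the function the call was inlined into.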


Avoid inline asm

Or avoid inline asm entirely and just take a code address. Either of the function itself, like (void*)_get_addr, or of a goto label. (https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html)

static inline PVOID get_addr(void) {
  PVOID pointer = &&label;   // GNU C labels-as-values extension
  label:
  return pointer;
}

Note that _names are reserved at global scope for the implementation's own use. Unless this code is part of a GCC or clang header, that's a poor choice of name.

Godbolt shows how they compile with gcc -O2 with / without -fPIE. The label-value one will use mov eax, imm32 absolute addressing in a non-PIE Linux executable, because addresses are link-time constants with no ASLR. But if ASLR is possible (PIE), it will use RIP-relative LEA like the asm would.

The one weird thing is that since the label pointer is never dereferenced, the compiler just puts it somewhere in the function it's inlined into. In this case at the top, even if there's other code in the function before taking the current address.

It's obviously not safe to jump to that address, so IDK what exactly you plan to do with the pointer, and how exact it needs to be.
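For the usage added in the question's second edit, exactness probably doesn't matter: the address is rounded down to an alignment boundary before the search starts, so any address within the same aligned block gives the same starting point. A minimal sketch of that arithmetic, using a made-up alignment of 0x1000 because the real value of `DEFAULT_EDK_ALIGN` isn't shown (`EXAMPLE_ALIGN` and `round_down` are hypothetical names):

```c
#include <stdint.h>

/* Hypothetical value: the real DEFAULT_EDK_ALIGN is not shown in the question. */
#define EXAMPLE_ALIGN ((uintptr_t)0x1000)

/* Round addr down to a multiple of align.
   Only valid when align is a power of two. */
static uintptr_t round_down(uintptr_t addr, uintptr_t align)
{
    return addr & ~(align - 1);
}
```

With this alignment, both `round_down(0x12345678, EXAMPLE_ALIGN)` and `round_down(0x12345FFF, EXAMPLE_ALIGN)` give `0x12345000`, which is why the position of the address inside a function is irrelevant here.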


As David Wohlferd pointed out in comments, the original stand-alone asm function always returns the same address (inside itself) to every caller. A function that can inline (or even is forced to inline) will instead return an address inside the caller it inlines into.
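If matching the original behaviour (one fixed address for every caller) matters, a non-inline variant is one option. This is only a sketch: `get_addr_fixed` and `NO_COPIES` are made-up names, and the `noclone` attribute is applied on GCC only, since the GCC manual recommends `noinline` plus `noclone` when a program relies on `&&label` always having the same value.

```c
/* Sketch with hypothetical names. Keeping a single copy of the function
   means the label address is the same no matter who calls it. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_COPIES __attribute__((noinline, noclone))
#else
#define NO_COPIES __attribute__((noinline))
#endif

NO_COPIES static void *get_addr_fixed(void)
{
    void *pointer = &&here;   /* GNU C labels-as-values extension */
here:
    return pointer;
}
```

Two calls from different places, e.g. `get_addr_fixed() == get_addr_fixed()`, should then compare equal, like the stand-alone MASM function.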

Or even inside a higher parent function if an optimizing compiler chooses to inline further. This requires compile-time inlining, unlike if you'd use __builtin_return_address(0), so at least it will always be an address in the same executable or library as the actual call site in the C source.
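For comparison, the `__builtin_return_address(0)` route from the comments could be sketched as below. The helper name is hypothetical. It has to be `noinline`, otherwise the address loaded is the return address of whatever function it got inlined into. Unlike the original asm, it returns a different address for each call site.

```c
/* Sketch with a hypothetical name. noinline makes sure the return address
   read here is the one pushed by the call in the immediate caller. */
static __attribute__((noinline)) void *address_in_caller(void)
{
    /* Level 0 = return address of the current function (GCC/clang builtin). */
    return __builtin_return_address(0);
}
```

Calling `address_in_caller()` from `GetImageAddress` would give an address just after that call instruction, which is enough for an "address somewhere in this image" use case.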

Peter Cordes
  • I have updated the question with a little more info. Meanwhile I will verify how your solution works. – l00k Sep 06 '21 at 05:51
  • @l00k: still no info about how this is used, only info about your misunderstanding of the original >.<. It doesn't return the *caller's* address; that would be `mov rax, [rsp]` / `ret`. It returns its own address, of its own call target, probably naively ported from 32-bit code where call/pop was one of the only good ways to read EIP. (See the link in my answer.) – Peter Cordes Sep 06 '21 at 05:59
  • @l00k: Yeah, I can see that, but I'm saying that x86-64 MASM code probably came from a naive port of some older 32-bit code. Or else whoever wrote it didn't know x86-64 very well, and used a 64-bit version of a 32-bit idiom, instead of the normal 64-bit way. (I'm not talking about your attempt to port to GNU C inline asm. That's broken and wouldn't even assemble so I'd call it a "very naive *attempt* to port".) – Peter Cordes Sep 06 '21 at 06:09
  • Doing a search for DEFAULT_EDK_ALIGN led me to [this](http://blog.cr4.sh/2015/07/), where the variable names and comments are identical to OP, although the function name is much more revealing (BackdoorImageAddress). I think I'm done helping here. – David Wohlferd Sep 06 '21 at 06:25
  • @DavidWohlferd yep, that is why I didn't post the original name. You don't know what I'm trying to do, so please don't judge me. FYI I'm trying to keep my system up to date, as Intel releases BIOS updates (including ucode updates) with a 2-month lag. Some proof: https://www.win-raid.com/t9194f47-Business-request-Intel-NUC-microcode-update.html – l00k Sep 06 '21 at 06:50
  • @l00k: you don't need the BIOS / firmware to load microcode updates for most problems, especially ones that are just vulnerabilities to malicious code or stuff like MDS side channels; everything running in early boot is already basically trusted. The OS can load microcode once it's up, e.g. https://wiki.archlinux.org/title/microcode – Peter Cordes Sep 06 '21 at 07:54
  • @peter-cordes: Sadly SGX (and its RA) requires updated ucode in BIOS - not that one loaded by system. That is why I'm trying to implement autoupdating ucode (via filesystem) based on that Backdoor code. – l00k Sep 06 '21 at 09:13