
While trying to make some old code work again (https://github.com/chaos4ever/chaos/blob/master/libraries/system/system_calls.h#L387, FWIW) I discovered that some of the semantics of gcc seem to have changed in a quite subtle but still dangerous way over the last 10-15 years... :P

The code used to work well with older versions of gcc, like 2.95. Anyway, here is the code:

static inline return_type system_call_service_get(const char *protocol_name, service_parameter_type *service_parameter,
    tag_type *identification)
{
    return_type return_value;

    asm volatile("pushl %2\n"
                 "pushl %3\n"
                 "pushl %4\n"
                 "lcall %5, $0"
                 : "=a" (return_value),
                   "=g" (*service_parameter)
                 : "g" (identification),
                   "g" (service_parameter),
                   "g" (protocol_name),
                   "n" (SYSTEM_CALL_SERVICE_GET << 3));

    return return_value;
}

The problem with the code above is that gcc (4.7 in my case) will compile this to the following asm code (AT&T syntax):

# 392 "../system/system_calls.h" 1
pushl 68(%esp)  # This pointer (%esp + 0x68) is valid when the inline asm is entered.
pushl %eax
pushl 48(%esp)  # ...but this one is not (%esp + 0x48), since two dwords have now been pushed onto the stack, so %esp is not what the compiler expects it to be
lcall $456, $0

# Restoration of %esp at this point is done in the called method (i.e. lret $12)

The problem: the variables (identification and protocol_name) are on the stack in the calling context. So gcc (with optimizations turned on, unsure if it matters) just takes the values from there and hands them over to the inline asm section. But since I'm pushing stuff on the stack, the offsets that gcc calculates will be off by 8 in the third push (pushl 48(%esp)). :)
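To make the off-by-8 concrete, here is a minimal sketch in portable C that only simulates the stack. It is not real inline asm, and every name in it (toy_stack, ENTRY_SP, IDENT_VALUE, PROTO_VALUE, replay_broken_sequence) is made up for illustration. The offsets 68 and 48 are taken from the generated code above, and the model assumes that a pushl with a memory operand computes its address before %esp is decremented.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 64          /* toy stack size in dwords (assumption) */
#define ENTRY_SP    32          /* slot index standing in for %esp at asm entry */
#define IDENT_VALUE 0x1111u     /* pretend value of `identification` */
#define PROTO_VALUE 0x3333u     /* pretend value of `protocol_name` */

typedef struct {
    uint32_t slot[STACK_SLOTS];
    int sp;                     /* index of the top-of-stack slot (the "%esp") */
} toy_stack;

/* Fill the stack with recognisable junk, then place the two arguments where
   the compiler-generated offsets 68(%esp) and 48(%esp) expect them. */
void toy_stack_init(toy_stack *s)
{
    for (int i = 0; i < STACK_SLOTS; i++)
        s->slot[i] = 0xdead0000u + (uint32_t)i;
    s->sp = ENTRY_SP;
    s->slot[ENTRY_SP + 68 / 4] = IDENT_VALUE;
    s->slot[ENTRY_SP + 48 / 4] = PROTO_VALUE;
}

/* Models a load from off(%esp) using the *current* stack pointer. */
uint32_t load_sp_relative(const toy_stack *s, int off)
{
    int idx = s->sp + off / 4;
    assert(idx >= 0 && idx < STACK_SLOTS);
    return s->slot[idx];
}

/* Models pushl: move the stack pointer down one dword, then store. */
void pushl(toy_stack *s, uint32_t value)
{
    assert(s->sp > 0);
    s->sp -= 1;
    s->slot[s->sp] = value;
}

/* Replays the three pushes with offsets that were fixed at asm entry.
   Returns the value that the third push actually reads. */
uint32_t replay_broken_sequence(toy_stack *s, uint32_t eax)
{
    pushl(s, load_sp_relative(s, 68));         /* offset is still valid here */
    pushl(s, eax);                             /* register operand, unaffected */
    uint32_t third = load_sp_relative(s, 48);  /* stale: sp has moved by 8 bytes */
    pushl(s, third);
    return third;
}
```

In this model the third push reads the slot two dwords below the one holding protocol_name, which is the same 8-byte error as in the real code. Adding 8 to the offset would compensate, but only as long as the operand really is addressed relative to %esp.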

This took me a long time to figure out; it wasn't at all obvious to me at first.

The easiest way around this is of course to use the r input constraint, to ensure that the value is in a register instead. But is there another, better way? One obvious way would of course be to rewrite the whole system call interface to not push stuff on the stack in the first place (and use registers instead, like e.g. Linux), but that's not a refactoring I feel like doing tonight...

Is there any way to tell gcc inline asm that "the stack is volatile"? How have you guys been handling stuff like this in the past?


Update later the same evening: I did find a relevant gcc ML thread (https://gcc.gnu.org/ml/gcc-help/2011-06/msg00206.html) but it didn't seem to help. It seems like specifying %esp in the clobber list should make it do offsets from %ebp instead, but it doesn't work, and I suspect that -O2 -fomit-frame-pointer has an effect here. I have both of these flags enabled.

Celelibi
Per Lundberg
  • Iirc adding "cc" and/or "memory" to the clobber list does this. Sometimes adding volatile to the asm() will prevent the compiler from over-optimizing. – technosaurus Mar 16 '15 at 21:31
  • How about doing `push 8+%4`? – David Wohlferd Mar 17 '15 at 02:00
  • By the way: [%rsp in clobber list is silently ignored](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52813). – David Wohlferd Mar 17 '15 at 04:54
  • gcc doesn't inspect the asm statement. Consider: if the stack pointer is clobbered - how can it be restored in general? Where would it save the value to? You have to correct the stack yourself by adjusting `%esp` explicitly or unwinding with corresponding `pop` instructions. – Brett Hale Mar 17 '15 at 09:07
  • Thanks @BrettHale, but you missed my point. The issue here was *not* with %esp being restored, but rather it being modified in the midst of the inline assembly, which gcc is of course unaware of. The restoration of the stack pointer is taking place elsewhere. Since this was obviously not clear enough (since you misunderstood it), I've updated the post so that this is clearly stated. – Per Lundberg Mar 17 '15 at 22:09
  • @DavidWohlferd - thanks. `push 8+%4` may work, but it feels extremely ugly and intrinsically dangerous. What if the value coming in is *not* on the stack? :) Then it would break entirely. So the right way must be to somehow make `gcc` aware of the fact that it can't rely on `%esp` being constant within the asm block, or use another constraint like `ri`. Or disable `-fomit-frame-pointer`. – Per Lundberg Mar 17 '15 at 22:10
  • I have not read your question carefully, but have you tried using an [early clobber modifier](https://stackoverflow.com/questions/26567746/unexpected-gcc-inline-asm-behaviour-clobbered-variable-overwritten)? Try changing `"=a"` to `"=&a"` and `"=g"` to `"=&g"`. – Z boson Dec 03 '15 at 09:36

1 Answer


What works and what doesn't:

  1. I tried omitting -fomit-frame-pointer. No effect whatsoever. I included %esp, esp and sp in the list of clobbers.

  2. I tried omitting -fomit-frame-pointer and -O3. This actually produces code that works, since it relies on %ebp rather than %esp.

    pushl 16(%ebp)
    pushl 12(%ebp)
    pushl 8(%ebp)
    lcall $456, $0
    
  3. I tried with just having -O3 and not -fomit-frame-pointer specified in my command line. Creates bad, broken code (relies on %esp being constant within the whole assembly block, i.e. no stack frame).

  4. I tried with skipping -fomit-frame-pointer and just using -O2. Broken code, no stack frame.

  5. I tried with just using -O1. Broken code, no stack frame.

  6. I tried adding cc as a clobber. No can do, doesn't make any difference whatsoever.

  7. I tried changing the input constraints to ri, giving the input & output code below. This of course works but is slightly less elegant than I had hoped. Then again, perfect is the enemy of good so maybe I will have to live with this for now.

Input C code:

static inline return_type system_call_service_get(const char *protocol_name, service_parameter_type *service_parameter,
    tag_type *identification)
{
    return_type return_value;

    asm volatile("pushl %2\n"
                 "pushl %3\n"
                 "pushl %4\n"
                 "lcall %5, $0"
                 : "=a" (return_value),
                   "=g" (*service_parameter)
                 : "ri" (identification),
                   "ri" (service_parameter),
                   "ri" (protocol_name),
                   "n" (SYSTEM_CALL_SERVICE_GET << 3));

    return return_value;
}

Output asm code. As can be seen, it now uses registers instead, which should always be safe (but maybe somewhat less performant, since the compiler has to move stuff around):

#APP
# 392 "../system/system_calls.h" 1
pushl %esi
pushl %eax
pushl %ebx
lcall $456, $0
Per Lundberg
  • gcc has enabled `-fomit-frame-pointer` as part of `-O1` for several years now, even for 32-bit code, so you need to explicitly specify the negative form `-fno-omit-frame-pointer` to turn it off with optimization enabled. (But yeah, asking for all the inputs in registers or immediates is probably a better idea.) There might also be a function attribute that lets you force a frame pointer on a per-function basis. – Peter Cordes Sep 14 '18 at 19:15