The ud2 instruction (2 bytes) is a rather compact way to raise SIGILL on POSIX platforms. Are there any other similarly tight ways of raising a hardware exception on x86_64?
-
`ud2` is an x86_64 instruction. Note that it is unrelated to POSIX and POSIX does not actually provide a `__builtin_trap` function. – fuz May 23 '22 at 12:29
-
You can use an instruction that is invalid in 64-bit mode, e.g. `AAA` (opcode `0x37`) will produce #UD. – Jester May 23 '22 at 12:36
-
@fuz What I meant is that a POSIX kernel turns it into a SIGILL signal, but that's not super important. I'm simply curious whether another exception (SIGFPE? SIGSEGV? or another SIGILL that is distinguishable from the ud2-caused SIGILL) can be raised with similarly compact code (well, other than SIGTRAP, which can be raised with a one-byte INT3, but isn't suitable for my purpose). – Petr Skocik May 23 '22 at 12:37
-
@Jester Thanks! That looks promising. Is there any way to force it onto the GNU assembler, which rejects it with `'aaa' is not supported in 64-bit mode`? – Petr Skocik May 23 '22 at 12:39
-
Use `.byte 0x37`. You can also use privileged instructions for #GP. – Jester May 23 '22 at 12:44
-
@PSkocik You can reliably raise a `SIGFPE` using `DIV AH`. – fuz May 23 '22 at 13:27
-
@PSkocik Also note that there is no such thing as a POSIX kernel. It's an implementation detail of various UNIX-like operating systems to generate a SIGILL for undefined instructions. No standard exists to say what x86 hardware exceptions should be mapped to. There's only historical practice. – fuz May 23 '22 at 13:29
-
As discussed in [Looking for a \_one byte\_ invalid opcode with x86](https://stackoverflow.com/q/72334461), one-byte opcodes that currently #UD are *not* guaranteed to do so on future CPUs; only `ud2` is future-proof on paper. Hopefully some future extension will use that 64-bit-only coding space for something, instead of only longer encodings that can also be valid in 32-bit mode like VEX and EVEX prefixes. But on all current x86-64 CPUs, yes, things like `0x37` (32-bit mode AAA) will `#UD`. – Peter Cordes May 23 '22 at 14:52
-
@fuz: The wording of the POSIX standard does fairly strongly imply that if a kernel is going to deliver a signal as a result of a hardware exception on an arithmetic instruction, the signal should be SIGFPE. https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html lists `si_code` values like integer divide, integer overflow, and various unmasked FP exception types. (Related: [On which platforms does integer divide by zero trigger a floating point exception?](https://stackoverflow.com/a/37266507)) – Peter Cordes May 23 '22 at 14:57
-
I've seen people say that SIGBUS is more appropriate for misaligned `movdqa`, too, unlike Linux's SIGSEGV. IIRC MacOS raises SIGBUS on something like `movaps xmm0, [-1]`. (More compact if you had a known register value that's either odd or even, so you could offset it by an odd disp8 if needed, or a RIP-relative to reach an odd address.) – Peter Cordes May 23 '22 at 15:01
-
Can you clarify what kind of "hardware exception" you mean? Are you specifically looking for an instruction that will raise a different signal than `SIGILL`, or are there other criteria? `hlt` is a popular way to get SIGSEGV in one byte. – Nate Eldredge May 24 '22 at 00:28
-
@NateEldredge Thanks. It's basically already well-answered in the comments and the linked question. Was looking for a compact way of raising *some* signal (perhaps even another SIGILL, as long as it's distinguishable in a signal action from the ud2 SIGILL, which it will be if it's caused by a different instruction). – Petr Skocik May 24 '22 at 05:53
-
You'd like ARMv8. Their permanently undefined instruction `udf` consists of any 32-bit word of which the high 16 bits are zero, and the low 16 bits can be anything. So you effectively get 65536 easy-to-remember undefined instructions. – Nate Eldredge May 24 '22 at 06:15
-
@fuz Ran into another one: `lock std`, 2 bytes (felixcloutier.com/x86/std; `.word 0x3c0f`), generates #UD regardless of mode. There are probably more of those. But so far, I've really liked the one you suggested. – Petr Skocik Jun 04 '22 at 11:56
-
@PSkocik The classic undefined instruction is `ff ff` but today people prefer `ud1` and `ud2` (the difference is that `ud1` takes modr/m operands, `ud2` does not). I would not build on `lock std` remaining undefined in the foreseeable future. – fuz Jun 04 '22 at 12:19
-
@fuz What do you think about using single-byte instructions that are disallowed in 64-bit mode (into/pusha/popa/sahf/lahf)? Those should remain undefined (in 64-bit mode) in the future, correct? – Petr Skocik Apr 13 '23 at 09:51
-
@PSkocik Not a good idea. They may become defined in the future. sahf and lahf already were with 2nd gen amd64. arpl, lds, and les were reused for the VEX and EVEX encoding planes. – fuz Apr 13 '23 at 10:13
-
@fuz Thank you very much for being most helpful. I'm using this in a systems-language runtime (the same codegen for kernels and userland is welcome, though not essential). The condition is checked frequently (so code smaller than a function call is welcome), arises very rarely (so I don't mind the trap overhead), and must be catchable (so I can't use `ud2`, which compilers already generate for hard panics). I've reduced it to just one such condition, so `div %ah;` suffices. (Alternatively, switching to `int3` might also do, I think, as long as debuggers don't get confused by it.) – Petr Skocik Apr 13 '23 at 11:21
-
@fuz In any case, I'm glad to have learned interesting stuff about x86-64 and I will accept your answer if you feel like posting one. Thanks! ;-) – Petr Skocik Apr 13 '23 at 11:21
-
musl libc uses a one-byte `hlt`, which results in a general protection fault if attempted from ring 3, for which Linux delivers SIGSEGV. – amonakov Apr 24 '23 at 16:57