A new instruction is either designed to be "legacy compatible" or it is not.
The former class includes instructions like tzcnt and xacquire, whose encodings decode as valid instructions on older architectures: tzcnt is encoded as rep bsf, and xacquire is just the repne prefix.
The semantics are different, of course.
The second class contains the majority of new instructions, AVX being one popular example.
When the CPU encounters an invalid or reserved encoding it generates the #UD (for UnDefined) exception - that's interrupt number 6.
The Linux kernel sets the IDT entry for #UD early, in entry_64.S:
idtentry invalid_op do_invalid_op has_error_code=0
The entry points to do_invalid_op, which is generated by a macro in traps.c:
DO_ERROR(X86_TRAP_UD, SIGILL, "invalid opcode", invalid_op)
The macro DO_ERROR generates a function that calls do_error_trap, in the same file.
do_error_trap uses fill_trap_info (also in the same file) to create a siginfo_t structure containing the Linux signal information:
case X86_TRAP_UD:
sicode = ILL_ILLOPN;
siaddr = uprobe_get_trap_addr(regs);
break;
From there the following calls happen, in order:
do_trap in traps.c
force_sig_info in signal.c
specific_send_sig_info in signal.c
This ultimately culminates in calling the SIGILL signal handler of the offending process or, if none is installed, in the default action, which kills the process.
The following program is a very simple example that generates a #UD:
BITS 64
GLOBAL _start
SECTION .text
_start:
ud2
We can use strace to check the signal received when running that program:
--- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x400080} ---
+++ killed by SIGILL +++
This is as expected: the si_code is the ILL_ILLOPN set by fill_trap_info.
As Cody Gray commented, libraries don't usually rely on SIGILL; instead they use a CPU dispatcher or explicitly check for the presence of an instruction.