SYSRET vs SYSRETQ distinction and compatibility mode

Question

I'm going with the Intel implementation of the SYSCALL/SYSRET instructions. If I'm reading their documentation correctly, unlike AMD's implementation of SYSCALL, Intel's version can be called only from 64-bit long mode. Is that correct?

But then if I read Intel's documentation for the accompanying instruction SYSRET, it comes in two flavors:

SYSRET 0F 07 = "Return to compatibility mode from fast system call"
SYSRETQ 48 0F 07 = "Return to 64-bit mode from fast system call"

So I'm trying to understand: when would SYSRET (0F 07) be used to return to compatibility mode if SYSCALL cannot be called from it?

Compatibility mode is a subset of 64-bit long mode. – Ross Ridge Aug 02 '18 at 20:21

Ross Ridge · Answer 1 · 2018-08-02T21:37:42.430

4

While Intel's version of SYSCALL can't be used in compatibility mode, the SYSRET instruction can be used from 64-bit mode to "return" to compatibility mode. The SYSRET instruction doesn't require a previous SYSCALL instruction to work, just like the RET instruction doesn't require a previous CALL instruction.

The Intel 64 and IA-32 Architectures Software Developer's Manual documents the operation of the SYSRET instruction as follows:

IF (CS.L ≠ 1 ) or (IA32_EFER.LMA ≠ 1) or (IA32_EFER.SCE ≠ 1) (* Not in 64-Bit Mode or SYSCALL/SYSRET not enabled in IA32_EFER *)
    THEN #UD; FI;
IF (CPL ≠ 0) OR (RCX is not canonical) THEN #GP(0); FI;
IF (operand size is 64-bit)
    THEN (* Return to 64-Bit Mode *)
        RIP ← RCX;
    ELSE (* Return to Compatibility Mode *)
        RIP ← ECX;
FI;
RFLAGS ← (R11 & 3C7FD7H) | 2; (* Clear RF, VM, reserved bits; set bit 1 *)

IF (operand size is 64-bit)
    THEN CS.Selector ← IA32_STAR[63:48]+16;
    ELSE CS.Selector ← IA32_STAR[63:48];
FI;
CS.Selector ← CS.Selector OR 3; (* RPL forced to 3 *)
(* Set rest of CS to a fixed value *)
CS.Base ← 0; (* Flat segment *)
CS.Limit ← FFFFFH; (* With 4-KByte granularity, implies a 4-GByte limit *)
CS.Type ← 11; (* Execute/read code, accessed *)
CS.S ← 1;
CS.DPL ← 3;
CS.P ← 1;
IF (operand size is 64-bit)
    THEN (* Return to 64-Bit Mode *)
        CS.L ← 1; (* 64-bit code segment *)
        CS.D ← 0; (* Required if CS.L = 1 *)
    ELSE (* Return to Compatibility Mode *)
        CS.L ← 0; (* Compatibility mode *)
        CS.D ← 1; (* 32-bit code segment *)
FI;
CS.G ← 1; (* 4-KByte granularity *)
CPL ← 3;
[...]

As you can see, the operation differs depending on the operand size. Notably, with a 32-bit operand size the CS.L and CS.D flags are set to 0 and 1 respectively, meaning the CPU begins executing instructions at the address given by ECX in 32-bit compatibility mode. It does this regardless of how the kernel (privilege level 0) was entered.
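To make the operand-size dependence concrete, here is a rough C model of the part of the pseudocode quoted above. It is only a sketch: `struct sysret_result` and `model_sysret` are hypothetical names made up for illustration, the `rex_w` flag stands in for "operand size is 64-bit" (the 48 prefix), and the #UD/#GP checks, the remaining fixed CS fields, and everything in the elided part are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state SYSRET loads; not a real API. */
struct sysret_result {
    uint64_t rip;          /* where user code resumes */
    uint64_t rflags;       /* flags restored from R11 */
    uint16_t cs_selector;  /* code segment selector, RPL forced to 3 */
    bool     cs_l;         /* true = 64-bit code segment */
    bool     cs_d;         /* true = 32-bit default operand size */
};

/*
 * star  : value of the IA32_STAR MSR
 * rcx   : return address supplied by the kernel
 * r11   : flags image supplied by the kernel
 * rex_w : true for SYSRETQ (48 0F 07), false for SYSRET (0F 07)
 *
 * The fault checks (#UD, #GP) from the manual are omitted here.
 */
static struct sysret_result model_sysret(uint64_t star, uint64_t rcx,
                                         uint64_t r11, bool rex_w)
{
    struct sysret_result r;
    uint16_t base = (uint16_t)(star >> 48);   /* IA32_STAR[63:48] */

    /* 64-bit operand size uses all of RCX, otherwise only ECX. */
    r.rip = rex_w ? rcx : (uint64_t)(uint32_t)rcx;

    /* Clear RF, VM and reserved bits; bit 1 is always set. */
    r.rflags = (r11 & 0x3C7FD7u) | 2u;

    /* The 64-bit code selector sits 16 bytes above the compat one. */
    r.cs_selector = (uint16_t)((rex_w ? base + 16 : base) | 3);

    /* CS.L/CS.D decide the mode, not how the kernel was entered. */
    r.cs_l = rex_w;
    r.cs_d = !rex_w;
    return r;
}
```

For example, assuming IA32_STAR[63:48] holds 0x23 (an illustrative value), the model gives a CS selector of 0x23 with CS.L = 0 for the 0F 07 form and 0x33 with CS.L = 1 for the 48 0F 07 form. With RCX = 0x00007fff12345678, the 0F 07 form resumes at 0x12345678.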

On Intel CPUs the 32-bit operand size version of SYSRET can't be used in the most obvious way, to resume a 32-bit compatibility mode task that used SYSCALL to enter the kernel, but it could still have other uses, such as starting the execution of a new 32-bit task or maybe even resuming one that entered the kernel by some other means.

edited Aug 02 '18 at 21:37

answered Aug 02 '18 at 21:17

Ross Ridge


Btw, how does 32-bit user-mode code enter 64-bit kernel mode? (Let's take Windows for this example.) I'm assuming via a far call to an inter-privilege call gate, right? I guess in that case, the kernel routine can adjust the stack and use `sysret`. Although this whole "adjusting" code may totally negate the efficiency gained by using `sysret` vs just using the plain old `retf`. Anyway, I need to dig into WoW64 on Windows to see what they're doing there. Thanks. – MikeF Aug 02 '18 at 21:28
@MikeF Either through an interrupt (software or hardware), exception, far call/jmp through a gate, or SYSENTER instruction. Windows uses SYSENTER for system calls, and doesn't use far call/jmp through gates for anything as far as I know. – Ross Ridge Aug 02 '18 at 21:36
@MikeF 64-bit Windows processes use `syscall` to enter the kernel and 32-bit processes on 64-bit Windows (WoW64) first perform a far call in user mode to a 64-bit code segment and then `syscall` is used. So the return to compat mode capability of `sysret` is not used on Windows; it always returns to 64-bit mode. – Hadi Brais Aug 02 '18 at 22:00
I think Linux doesn't use `sysret` to return to compat mode either. This could be because it's slower than other methods (like you suggested). But then why would Intel support `sysret` to return to compat mode if `syscall` cannot be used in compat mode? This only makes sense to me if it is faster than other methods, which I doubt. – Hadi Brais Aug 02 '18 at 22:24
@HadiBrais: Yeah, I thought too that Windows uses `syscall` for the kernel mode calls. So going back to my original question, it sounds weird to support `sysret` into compatibility mode when `syscall` from it is not supported. (I know that AMD implementation of `syscall` supports it both ways, so there it will be more logical to have that implementation of `sysret`.) – MikeF Aug 02 '18 at 22:39
@HadiBrais Linux has the `__kernel_vsyscall` function in vdso.so which can use INT $0x80, SYSCALL or SYSENTER depending on the capabilities of the CPU. (Though in practice 32-bit software may just use INT $0x80 directly.) – Ross Ridge Aug 02 '18 at 22:41
@HadiBrais: 32-bit Linux system calls are normally made with `sysenter`. The kernel side of it is in [`entry_64_compat.S`](https://github.com/torvalds/linux/blob/e7d0c41ecc2e372a81741a30894f556afec24315/arch/x86/entry/entry_64_compat.S), along with other entry points into a 64-bit kernel from 32-bit userspace (see also [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/q/46087730) for more about `int 0x80`). The user-space side of the `sysenter` save/restore dance is done by code in the VDSO page the kernel maps into every process, called by glibc. – Peter Cordes Aug 03 '18 at 06:37
Congratulations for your new badge! – fuz Sep 12 '18 at 17:18
