14

A 64-bit CPU (amd64) can execute 32-bit x86 instructions in compatibility mode. A 64-bit Linux installation also runs ELF files containing 32-bit code, provided the ELF header marks them as 32-bit executables.

Is it possible to put assembly instructions inside the ELF that switch the CPU to 32-bit compatibility mode in the middle of a program (and later back again)? If the kernel does not permit these instructions, is there some way to get the kernel to switch an already running process to 32-bit?

This is mainly a question out of curiosity since I cannot really see any use-cases for it.

Ross Ridge
rubund
  • you can actually use all 32bit instructions inside 64bit assembly without any thing to do, because all 64bit-instructions need to have a rex-prefix. The only tricky things you have to keep in mind are addresses and the REX prefixes themselves, because they are valid 32-bit instructions. There should not be any problem as long as you control the 32-bit code. Switching between ›real‹ 32-bit and 64-bit is AFAIK ring 0 stuff, so this depends on the OS. But transforming 32-bit code to 64-bit code is not that difficult and could be an option? – sivizius Feb 18 '18 at 17:41
  • @sivizius: You can FAR JMP using existing descriptors in the GDT to jump between compatibility mode and long mode. – Michael Petch Feb 18 '18 at 18:51
  • Although not an exact duplicate, these SO answers are related: https://stackoverflow.com/questions/34467092/inline-64bit-assembly-in-32bit-gcc-c-program#comment56687360_34467109 – Michael Petch Feb 18 '18 at 19:05
  • Not every 64-bit Linux includes the 32-bit compatibility fallback; for example, the Ubuntu in Win10 is pure 64-bit. That means only the 64-bit `syscall` is supported. Instructions encoded to operate on 32-bit registers in 64-bit mode are valid, but you can't run an elf32 binary, i.e. run some code in 32-bit mode only. – Ped7g Feb 18 '18 at 20:20
  • @sivizius That's completely wrong! As a single example, take `push eax`, which exists in protected mode but not in long mode. – fuz Feb 18 '18 at 20:41
  • I have found an article related to my question here btw: http://blog.dolezel.info/2017/02/running-32-bit-code-in-64-bit-linux.html – rubund Feb 19 '18 at 21:58
  • @fuz because `push eax` is encoded the same as `push rax` in long mode, and when you execute it, it will push the A register. As I said, the only thing that differs is addresses (rsp is decremented by 8 and not 4). If you just want to save the full A register, 0x50 will do it. – sivizius Jun 17 '18 at 15:34
  • @sivizius Your statement was “you can actually use all 32bit instructions inside 64bit assembly without any thing to do, because all 64bit-instructions need to have a rex-prefix.” All of this is wrong. Some 32 bit instructions are plain unavailable in 64 bit mode (such as arpl or aaa). Some 64 bit instructions do not need a REX prefix (such as movsx r,r/m32 or the aforementioned push r64). push r64 is by far not the same thing as push r32. That's why I said that your statement is completely wrong. – fuz Jun 17 '18 at 18:26

1 Answer

13

Switching between 64-bit mode and compatibility mode (the two sub-modes of long mode) is done by changing CS. User mode code cannot modify the descriptor table, but it can perform a far jump or far call to a code segment that is already present in the descriptor table. On x86-64 Linux, for example, the required compatibility mode descriptor is normally present: selector 0x23 is the 32-bit user code segment and 0x33 is the 64-bit one.
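To make the far-call mechanics more concrete, here is a minimal C sketch of the two pieces of data involved: the fields packed into a selector value, and the 6-byte `m16:32` far pointer (32-bit offset followed by a 16-bit selector) that a far call through memory reads. The names `decode_selector` and `make_far_pointer` are hypothetical helpers for illustration and are not part of the sample program.

```c
#include <stdint.h>

/* Fields packed into a 16-bit x86 segment selector. */
struct selector_fields {
    unsigned index; /* index into the descriptor table (bits 15..3) */
    unsigned ti;    /* table indicator: 0 = GDT, 1 = LDT (bit 2)    */
    unsigned rpl;   /* requested privilege level (bits 1..0)        */
};

/* Split a selector such as 0x23 or 0x33 into its fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (unsigned)sel >> 3;
    f.ti    = ((unsigned)sel >> 2) & 1u;
    f.rpl   = (unsigned)sel & 3u;
    return f;
}

/*
 * Build the 6-byte m16:32 operand used by an indirect far call or
 * far jump with a 32-bit operand size: the 32-bit offset comes first,
 * the 16-bit selector follows, both little-endian.
 */
void make_far_pointer(uint8_t out[6], uint32_t offset, uint16_t sel)
{
    out[0] = (uint8_t)(offset & 0xffu);
    out[1] = (uint8_t)((offset >> 8) & 0xffu);
    out[2] = (uint8_t)((offset >> 16) & 0xffu);
    out[3] = (uint8_t)((offset >> 24) & 0xffu);
    out[4] = (uint8_t)(sel & 0xffu);
    out[5] = (uint8_t)((sel >> 8) & 0xffu);
}
```

With this, `decode_selector(0x23)` gives index 4 in the GDT at privilege level 3, and `decode_selector(0x33)` gives index 6 in the GDT at privilege level 3. Because the offset field is only 32 bits wide, the target code has to live below 4 GB.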

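Here is sample code for Linux (Ubuntu). Build it as a non-PIE executable, so that the code and data are linked below 4 GB and their addresses fit in 32 bits: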

$ gcc -no-pie switch_mode.c switch_cs.s

switch_mode.c:

#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>

extern bool switch_cs(int cs, bool (*f)());
extern bool check_mode();

int main(int argc, char **argv)
{
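    // 0x23 selects the 32-bit user code segment on x86-64 Linux;
    // pass 33 on the command line to stay in 64-bit mode (0x33).
    int cs = 0x23;
    if (argc > 1)
        cs = (int)strtoul(argv[1], 0, 16);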
    printf("switch to CS=%02x\n", cs);

    bool r = switch_cs(cs, check_mode);

    if (r)
        printf("cs=%02x: 64-bit mode\n", cs);
    else
        printf("cs=%02x: 32-bit mode\n", cs);

    return 0;
}

switch_cs.s:

        .intel_syntax noprefix
        .code64
        .text
        .globl switch_cs
switch_cs:
        push    rbx
        push    rbp
        mov     rbp, rsp
        sub     rsp, 0x18

        mov     rbx, rsp
        movq    [rbx], offset .L1
        mov     [rbx+4], edi

        // Before the lcall, switch to a stack below 4GB.
        // This assumes that the data segment is below 4GB.
        mov     rsp, offset stack+0xf0
        lcall   [rbx]

        // restore rsp to the original stack
        leave
        pop     rbx
        ret

        .code32
.L1:
        call    esi
        lret


        .code64
        .globl check_mode
// returns false for 32-bit mode; true for 64-bit mode
check_mode:
        xor     eax, eax
        // In 32-bit mode, the REX.W prefix byte (0x48) decodes as
        // dec eax, so this instruction is executed as
        // dec eax; test eax, eax
        test    rax, rax
        setz    al
        ret

        .data
        .align  16
stack:  .space 0x100
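
The detection trick in check_mode depends on the byte 0x48: in 64-bit mode it is the REX.W prefix of `test rax, rax` (bytes 48 85 C0), while in 32-bit mode the same byte is the one-byte instruction `dec eax`, followed by `test eax, eax`. The following C function is only a model of that behavior, written to show why the result differs; `check_mode_model` is a hypothetical name and the function does not execute the machine code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of check_mode. The argument says how the CPU decodes the
 * bytes 48 85 C0; the return value is what setz leaves in AL.
 */
bool check_mode_model(bool decoding_as_64bit)
{
    uint32_t eax = 0;              /* xor eax, eax */
    bool zf;

    if (decoding_as_64bit) {
        /* 48 85 C0 -> test rax, rax: rax is 0, so ZF is set. */
        zf = (eax == 0);
    } else {
        /* 48 -> dec eax: eax wraps to 0xFFFFFFFF. */
        eax = eax - 1u;
        /* 85 C0 -> test eax, eax: eax is nonzero, so ZF is clear. */
        zf = (eax == 0);
    }

    return zf;                     /* setz al */
}
```

The model returns true for the 64-bit decoding and false for the 32-bit decoding, matching the "returns false for 32-bit mode; true for 64-bit mode" contract of the assembly routine.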
rubund
prl
  • You can't modify the GDT, but you can modify the process's LDT (with restrictions) using [modify_ldt](http://man7.org/linux/man-pages/man2/modify_ldt.2.html). You can, however (as you suggest), use existing descriptors to [far jump back and forth](https://stackoverflow.com/questions/34467092/inline-64bit-assembly-in-32bit-gcc-c-program#comment56687360_34467109) between long mode and compatibility mode. – Michael Petch Feb 18 '18 at 18:49
  • Will interrupts / context switches always save/restore `cs`, or could the kernel return to user-space with the CS value it thinks is right for a process which started in 64-bit mode? Related: I think using `sysenter` in 64-bit mode will return to user-space in 32-bit mode on Linux: https://stackoverflow.com/questions/46087730/what-happens-if-you-use-the-32-bit-int-0x80-linux-abi-in-64-bit-code/46087731#46087731. And 64-bit `syscall` will set `CS` to `__user_cs` on return to user-space, even if you'd created a custom LDT entry and had CS set to that like @MichaelPetch suggested. – Peter Cordes Feb 19 '18 at 04:48
  • @MichaelPetch: I don't think `modify_ldt` can set the `L` bit in an LDT entry, so you can't create 64-bit code segments. The man page says "Even on 64-bit kernels, `modify_ldt()` cannot be used to create a long mode (i.e., 64-bit) code segment". But a 64-bit process could create a 32-bit code segment and jump back and forth between that and its initial code segment. – Peter Cordes Feb 19 '18 at 04:52
  • @PeterCordes : I never said it could. Note how I mentioned restrictions. My comment was related to the comment in the answer that says user mode can't change the descriptor table. It can change the LDT with restrictions but can't change the GDT. As I also said in my answer you can use existing descriptors to do the far jump between the two modes. That doesn't require changing the descriptor table. – Michael Petch Feb 19 '18 at 04:55
  • @MichaelPetch: When I started to write that comment, I was *only* thinking of a 32-bit process wanting to create a 64-bit code segment, in which case `modify_ldt` doesn't let you make the necessary modification. That's why the wording of the first part sounds like I'm disagreeing with something you said. Anyway, a significant limitation, IMO. – Peter Cordes Feb 19 '18 at 04:58
  • @prl: Thank you! I tried compiling the C code with gcc, and the assembly part with nasm, but had problems linking it. I am not sure exactly how I can use what you have added. Could you maybe add some more details in addition to the "built with gcc" part ? :) – rubund Feb 19 '18 at 21:43
  • @all: Is there really anything going on in https://elixir.bootlin.com/linux/v3.5/source/fs/binfmt_elf.c which is specific when loading a 32 bit ELF on a machine with a 64 bit kernel? Or how is this handled ? – rubund Feb 19 '18 at 21:44
  • @all: I have posted a separate question for "what the kernel does differently when starting a 32 bit process vs a 64 bit process": https://stackoverflow.com/questions/48874756/how-does-the-64-bit-linux-kernel-kick-of-a-32-bit-process-from-an-elf – rubund Feb 19 '18 at 21:55
  • @prl: ok, perfectly fine then. thanks! I actually meant "gas", not "nasm". But yes, it assembled :) – rubund Feb 20 '18 at 06:53
  • @prl: Had to add the "-no-pie" switch when building. Without, it bails out with an error. – rubund Feb 20 '18 at 06:56
  • What OS version did you try it with? – prl Feb 20 '18 at 07:35
  • @prl: Debian Stretch (9) – rubund Feb 20 '18 at 11:53