Don't try to put 64-bit machine-code inside a compiler-generated function. It might work since the encoding for function prologue/epilogue is the same in 32 and 64-bit, but it would be cleaner to just have a separate block of 64-bit code.
The easiest thing is probably to assemble that block in a separate file, using GAS .code64 or NASM BITS 64 to get 64-bit code in an object file you can link into a 32-bit executable.
You said in a comment you're thinking of using this for a kernel exploit against a 64-bit kernel from a 32-bit user-space process, so you just need some code bytes in an executable part of your process's memory and a way to get a pointer to that block. This is certainly plausible; if you can gain control of the kernel's RIP from a 32-bit process, this is what you want, because kernel code will always be running in long mode.
If you were doing something with 64-bit userspace code in a process that started in 32-bit mode, you could maybe far jmp to the block of 64-bit code (as @RossRidge suggests), using a known value for the kernel's __USER_CS 64-bit code segment descriptor. syscall from 64-bit code should return in 64-bit mode, but if not, try the int 0x80 ABI. It always returns to the mode you were in, saving/restoring cs and ss along with rip and rflags. (What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?)
.rodata is part of the text segment of your executable, so just get the compiler to put bytes in a const array. Fun fact: const int main = 195; compiles to a program that exits without segfaulting, because 195 = 0xc3 = the x86 encoding for ret (and x86 is little-endian). For an arbitrary-length machine-code sequence, const char funcname[] = { 0x90, 0x90, ..., 0xc3 } will work. The const is necessary, otherwise it will go in .data (read/write/noexec) instead of .rodata.
You could use const char funcname[] __attribute__((section(".text"))) = { ... }; to control what section it goes in (e.g. .text along with compiler-generated functions), or even a linker script to get more control.
If you really want to do it all in one .c file, instead of using the easier solution of a separately-assembled pure asm source:
To assemble some 64-bit code along with compiler-generated 32-bit code, use the .code64 GAS directive in an asm statement outside of any functions. IDK if there's any guarantee on what section will be active when gcc emits your asm, or on how gcc will mix that asm with its own asm output, but it won't put it in the middle of a function.
asm(".pushsection .text \n\t" // AFAIK, there's no guarantee how this will mix with compiler asm output
".code64 \n\t"
".p2align 4 \n\t"
".globl my_codebytes \n\t" // optional
"my_codebytes: \n\t"
"inc %r10d \n\t"
"my_codebytes_end: \n\t"
//"my_codebytes_len: .long . - my_codebytes\n\t" // store the length in memory. Optional
".popsection \n\t"
#ifdef __i386
".code32" // back to 32-bit interpretation for gcc's code
// "\n\t inc %r10" // uncomment to check that it *doesn't* assemble
#endif
);
#ifdef __cplusplus
extern "C" {
#endif
// put C names on the labels.
// They are *not* pointers, their addresses are link-time constants
extern char my_codebytes[], my_codebytes_end[];
//extern const unsigned my_codebytes_len;
#ifdef __cplusplus
}
#endif
// This expression for the length isn't a compile-time constant, so this isn't legal C
//static const unsigned len = &my_codebytes_end - &my_codebytes;
#include <stddef.h>
#include <unistd.h>
int main(void) {
    size_t len = my_codebytes_end - my_codebytes;
    const char* bytes = my_codebytes;
    // do whatever you want. Writing it to stdout is one option!
    write(1, bytes, len);
}
This compiles and assembles with gcc and clang (compiler explorer).
I tried it on my desktop to double check:
peter@volta$ gcc -m32 -Wall -O3 /tmp/foo.c
peter@volta$ ./a.out | hd
00000000 41 ff c2 |A..|
00000003
This is the correct encoding for inc %r10d :)
The program also works when compiled without -m32, because I used #ifdef to decide whether to use .code32 at the end or not. (There's no push/pop mode directive like there is for sections.)
Of course, disassembling the binary will show you:
00000580 <my_codebytes>:
580: 41 inc ecx
581: ff c2 inc edx
because the disassembler doesn't know to switch to 64-bit disassembly for that block. (I wonder if ELF has attributes for that... I didn't use any assembler directives or linker scripts to generate such attributes, if such a thing exists.)