
I'm experimenting with ptrace using the code below. I found that the system call number reported for execve was 59 even when I compiled with the -m32 option. Since I'm running Ubuntu on a 64-bit machine, that seemed understandable.

Soon a question arose: does the 32-bit libc behave differently on a 32-bit machine and on a 64-bit machine? Are they different? So I checked what the 32-bit libc does on my 64-bit system. However, the execve system call number it uses is 11, which is identical to the execve system call number on 32-bit systems. So where does the magic happen? Thank you in advance.

Here's the code. It's adapted from https://www.linuxjournal.com/article/6100

#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/user.h>
#include <stdio.h>

int main()
{
        pid_t child;
        long  orig_eax;
        child = fork();
        if (child == 0) {
                ptrace(PTRACE_TRACEME, 0, NULL, NULL);
                execl("/bin/ls", "ls", NULL);
        } else {
                wait(NULL);
                orig_eax = ptrace(PTRACE_PEEKUSER,
#ifdef __x86_64__
                                child, &((struct user_regs_struct *)0)->orig_rax,
#else
                                child, &((struct user_regs_struct *)0)->orig_eax,
#endif
                                NULL);
                printf("The child made a "
                        "system call %ld\n", orig_eax);
                ptrace (PTRACE_CONT, child, NULL, NULL);
        }
        return 0;
}

Here's the result of running it:

~/my-sandbox/ptrace$ file s1 && ./s1
s1: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=f84894c2f5373051682858937bf54a66f21cbeb4, for GNU/Linux 3.2.0, not stripped
The child made a system call 59

~/my-sandbox/ptrace$ file s2 && ./s2
s2: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=cac6a2bbeee164e27c11764c1b68f4ddd06405cf, for GNU/Linux 3.2.0, with debug_info, not stripped
The child made a system call 59

This is what I got from the 32-bit executable using gdb. As you can see, it's using /lib/i386-linux-gnu/libc.so.6, and the system call number for execve is 11.

>>> bt
#0  0xf7e875a0 in execve () from /lib/i386-linux-gnu/libc.so.6
#1  0xf7e8799f in execl () from /lib/i386-linux-gnu/libc.so.6
#2  0x565562a4 in main () at simple1.c:15
>>> disassemble
Dump of assembler code for function execve:
=> 0xf7e875a0 <+0>: endbr32 
   0xf7e875a4 <+4>: push   %ebx
   0xf7e875a5 <+5>: mov    0x10(%esp),%edx
   0xf7e875a9 <+9>: mov    0xc(%esp),%ecx
   0xf7e875ad <+13>:    mov    0x8(%esp),%ebx
   0xf7e875b1 <+17>:    mov    $0xb,%eax
   0xf7e875b6 <+22>:    call   *%gs:0x10
   0xf7e875bd <+29>:    pop    %ebx
   0xf7e875be <+30>:    cmp    $0xfffff001,%eax
   0xf7e875c3 <+35>:    jae    0xf7dd9000
   0xf7e875c9 <+41>:    ret    
End of assembler dump.
Peter Cordes
dyjin
  • libc uses the VDSO wrapper exported by the kernel, so it can benefit from `sysenter` on machines that support it, instead of the slower `int $0x80`. See https://blog.packagecloud.io/eng/2016/04/05/the-definitive-guide-to-linux-system-calls/ for an overview of that. – Peter Cordes Dec 25 '21 at 14:25
  • (My previous comment was based on just the title.) As you can see, the 32-bit libc execve wrapper is passing EAX=11 with `mov $0xb,%eax`. IDK why your program is finding that EAX=59 in that case. I would definitely also expect 11 there, `__NR_execve` from `unistd_32.h`, especially since we can see the libc wrapper passing that. Unlike Windows WOW64, 32-bit user-space on Linux calls into the kernel directly, not far-jumping to 64-bit mode for `syscall`. So yes, 32-bit user-space libraries aren't special on a 64-bit system; a 64-bit kernel simply provides a 32-bit ABI for 32-bit processes. – Peter Cordes Dec 25 '21 at 16:35
  • Yes. I checked that the 32-bit libc on a 32-bit machine is no different from the 32-bit libc on a 64-bit machine. My question is: although libc hands over 11 for execve (on a 64-bit machine with a 32-bit ELF), why does the kernel (which is 64-bit) treat it as 59, and how? – dyjin Dec 26 '21 at 05:36
  • Yes, I can see that the kernel could/should reserve only 59 for execve. Then where does 11, execve in a 32-bit ELF, get transformed into 59? – dyjin Dec 26 '21 at 05:40
  • Perhaps inside the kernel, as an implementation detail of the 32-bit syscall wrapper, on the way to the execve code that checks for `PTRACE_TRACEME` and actually raises SIGTRAP on execve? – Peter Cordes Dec 26 '21 at 06:17

1 Answer


execve is special; it's the only system call that has a special interaction with PTRACE_TRACEME. The way strace works, other system calls do show the 32-bit call number. (And modern strace needs special help to know whether that's a 32-bit call number for int 0x80 / sysenter, or a 64-bit call number, since 64-bit processes can still invoke int 0x80, although they normally shouldn't. This support was only added in 2019, with PTRACE_GET_SYSCALL_INFO.)


You're right: when the kernel is actually invoked, EAX holds 11, __NR_execve from unistd_32.h. It's set by mov $0xb,%eax before glibc's execve wrapper jumps to the VDSO page to enter the kernel via whatever efficient method is supported on this hardware (normally sysenter).

But execution doesn't actually stop until it reaches some code in the main execve implementation that checks for PTRACE_TRACEME and raises SIGTRAP.

Apparently, sometime before that happens, it calls void set_personality_64bit(void) in arch/x86/kernel/process_64.c, which includes:

    /* Pretend that this comes from a 64bit execve */
    task_pt_regs(current)->orig_ax = __NR_execve;

I found that by searching for __NR_execve in a kernel source browser, and looking at the most likely file in arch/x86. I didn't keep cross-referencing to find where that's called from; the fact that it exists (and the assumption of a sane non-obfuscated design) points very strongly to this being the answer to your mystery.

Peter Cordes