I am looking to write a program that will need to do low-level work with processes (i.e. using the fork system call, among others). This program is to be written in C++ and is to run only on Linux. Ideally, it will be portable across CPU architectures (i.e. x86, x86_64, and arm) with nothing more than a recompile, but I only really need x86_64 support.

Each Linux system call takes a number of arguments and returns its results in CPU registers (often only one return value), so a C wrapper function for each system call should be easy to write. Also, AFAIK, because system calls are implemented in the kernel, they have identical arguments and return values across architectures even if the assembly-level implementations differ, so the same C interface can be exposed everywhere.

Does such a thing exist? If so, how can I access it?

Where is its documentation (list of available functions, their arguments with an explanation, and an explanation of exactly what the function does)?

πάντα ῥεῖ
john01dav
  • I see that someone has downvoted and likely the same person has voted to close. I suspect this is because the question is phrased as if asking for a recommendation for a tool. While this is technically true, I think the question is on-topic because I strongly suspect that the Linux kernel itself (or maybe C standard library implementations on Linux, or some other standard and almost-always-used tools) contains such a feature, thus making this question "how do I use that feature," which is definitely on-topic. – john01dav Jul 01 '18 at 06:10
  • There is a `syscall` function in libc, maybe you need this? – geza Jul 01 '18 at 06:14
  • @πάνταῥεῖ Which man pages should I be looking at? – john01dav Jul 01 '18 at 06:16
  • @geza The man page for `syscall` says "Employing syscall() is useful, for example, when invoking a system call that has no wrapper function in the C library." Where is the documentation on these wrapper functions? Are they a standard part of C? – john01dav Jul 01 '18 at 06:17
  • "Wrapper functions" means that a lot of system calls have wrapper functions in libc, like `open`, `read`, etc. `syscall` is usually used when such a wrapper function doesn't exist. But you can call any system call with `syscall`. – geza Jul 01 '18 at 06:19
  • @john01dav We could probably help you better if you could tell us which _low level_ functionality you need to access in particular. Maybe there are already standard alternatives. – πάντα ῥεῖ Jul 01 '18 at 06:22
  • @πάνταῥεῖ I am looking to launch processes without the use of a shell using `fork` and `exec`. I can't use a shell because I want to try writing a shell. – john01dav Jul 01 '18 at 06:23
  • @john01dav `fork` and `exec` commands do not involve an intermediary shell process. Also take a look at busybox, which is particularly useful for small systems. – πάντα ῥεῖ Jul 01 '18 at 06:24
  • @πάνταῥεῖ I realize that, but those are system calls that I need to know how to access in C, hence the question. I said that I can't use a shell because `std::system` relies on a shell, and that is what I have used in the past to launch another process. – john01dav Jul 01 '18 at 06:26
  • @john01dav Again, `fork` and `exec` don't involve a shell process unlike the `std::system` function. And `exec` is well documented with `man`. – πάντα ῥεῖ Jul 01 '18 at 06:28
  • @πάνταῥεῖ .... I know, as I said. I need to know how to access them in C. Just because there is no other shell does not make me magically aware of how to access them. Do they have wrapper functions somewhere? If so, where is their documentation? – john01dav Jul 01 '18 at 06:29
  • You don't need `syscall` then. Just use the usual libc functions, `exec`, `fork`. They don't use the shell. I see that you "know" this, but then I don't understand this: "I am looking to launch processes without the use of a shell using fork and exec". These functions have nothing to do with the shell. – geza Jul 01 '18 at 06:32
  • If you try using `syscall()`, your code will be inherently system dependent. If you use the (cover functions for the) named system calls — such as `fork()` or `execve()` — you will be able to write platform-independent code (at least, the code only depends on the system supporting POSIX interfaces). – Jonathan Leffler Jul 01 '18 at 06:38
  • @geza I'll rephrase the second quote: "I am looking to launch processes using fork and exec without the use of a shell". Hopefully that makes it clearer. Sorry for the ambiguous phrasing. Regarding the libc functions, I looked at cppreference, which includes information on the C API, and I don't see anything there, meaning it isn't a standard part of the C API. Where can I find information on this? – john01dav Jul 01 '18 at 06:42
  • @john01dav `glibc` and `newlib` are part of the operating system documentation, not of the c and c++ standard libraries, which come with the toolchain (the latter are documented at cppreference.com). – πάντα ῥεῖ Jul 01 '18 at 06:47
  • Please read carefully what we say. `fork` and `exec` **don't** use the shell. cppreference documents C++, not the C API. `fork` and `exec` are part of the POSIX API, available on all OSes which implement POSIX (Linux, etc.) – geza Jul 01 '18 at 06:48
  • @john01dav An extra tip: If you want to implement a lightweight shell, have a look at [busybox](https://busybox.net/about.html) as mentioned earlier. – πάντα ῥεῖ Jul 01 '18 at 07:28

2 Answers

libc already includes the wrapper functions you're looking for. The prototypes for many of them are in #include <unistd.h>, as specified by POSIX.

C is the language of low-level systems programming on Unix (and Linux), so this has been a thing since Unix existed. (Providing wrapper functions in libc is easier than teaching compilers the difference between function calls and system calls, and allows for setting errno on errors. It also allows for tricks like LD_PRELOAD to intercept system calls in user-space.)


The man pages for system calls are in section 2, vs. section 3 for library functions (which might or might not use system calls as part of their implementation: math.h cos(3), ISO C stdio printf(3) and fwrite(3), vs. POSIX write(2)).

execve(2) is the system call.

execl(3) and friends are also part of libc, and eventually call execve(2). They are convenience wrappers on top of it for constructing the argv array, doing $PATH lookup, and passing along the current process's environment. Thus they're classed as functions, not system calls.
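To make that relationship concrete, here is a minimal sketch. The helper name `spawn_sh` and the `exit 7` command are made up for illustration, and it assumes `/bin/sh` exists. The child either calls execve(2) with a full path and an explicit environment, or lets execvp(3) do the $PATH lookup and pass `environ` along:

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;  /* the current process's environment */

/* Hypothetical helper: run `sh -c "exit 7"` in a child process and return
 * its exit status, or -1 on error. */
static int spawn_sh(int use_path_lookup)
{
    char *const argv[] = { "sh", "-c", "exit 7", NULL };
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (use_path_lookup)
            execvp("sh", argv);               /* libc searches $PATH and passes environ */
        else
            execve("/bin/sh", argv, environ); /* system call: full path, explicit envp */
        _exit(127);                           /* reached only if the exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both `spawn_sh(0)` and `spawn_sh(1)` should end up running the same program; the only difference is how much work libc does before the execve(2) call.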

See syscalls(2) for an overview, and a complete list of Linux system calls with links to the man pages for their wrappers. (I've linked the Linux man pages, but there are also POSIX man pages for all of the standard system calls.)


In the unlikely case that you're not linking libc, you can use macros like MUSL's syscall2 / syscall3 / etc. (the number is the arg count) to inline the right asm on whatever platform. You use __NR_write from asm/unistd.h to get system call numbers.

But note that the raw Linux system calls might have small differences from the interface provided by the libc wrappers. For example, they won't check for pthreads cancellation points, and brk / sbrk requires bookkeeping in user-space by libc.

See SYSCALL_INLINE in Android for a portable raw sys_write() inline wrapper using MUSL macros.

But if you are using libc like a normal person for functions like malloc and printf, you should just use its system call wrapper functions.

Peter Cordes

The syscalls(2) man page lists every system call available on Linux (and gives a link to the documentation of each of them). Most of them have their C wrapper in libc (for example, write(2), fork(2), etc.). A typical system call wrapper manages the calling conventions (see the x86 ABI specifications) and sets errno(3) on failure. ALP is a good but old introduction to Linux system programming, but you might find something newer (and ALP doesn't mention recent system calls like signalfd(2) because when ALP was written, these system calls did not exist).
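As a small illustration of the errno convention, here is a sketch using the open(2) wrapper. The helper name `try_open` is hypothetical; the point is only that the wrapper returns -1 and leaves the error code in errno:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to open `path` read-only.
 * Returns 0 on success, or the errno value reported by the wrapper. */
static int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int saved = errno;  /* save it: later calls may overwrite errno */
        fprintf(stderr, "open(%s): %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

Calling `try_open` on a path that does not exist should give back ENOENT, as documented in open(2).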

Most C standard library implementations (e.g. your libc.so) on Linux provide the POSIX interface to system calls. And they usually are free software (e.g. GNU glibc or musl-libc and others). So if you care about gory implementation details (you usually shouldn't), study (or improve) their source code.

Very few system calls are not interfaced by the libc, because they are unusual and don't make much sense in C code. For example, sigreturn(2), socketcall(2), gettid(2) (or renameat2(2); you'll use renameat instead). If you really need to use these directly (which is improbable and likely to be a design bug in your program) you need to write some assembly code (specific to your system and instruction set architecture) or perhaps use syscall(2).
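If you do end up needing syscall(2), a minimal sketch could look like the following. The helper name `raw_gettid` is made up, chosen so it does not clash with a `gettid` wrapper that your libc may or may not provide; `SYS_gettid` comes from sys/syscall.h:

```c
#include <sys/syscall.h>  /* SYS_gettid and the other SYS_* numbers */
#include <sys/types.h>
#include <unistd.h>       /* syscall(), getpid() */

/* Hypothetical helper: ask the kernel for the calling thread's ID through
 * the generic syscall(2) entry point instead of a dedicated wrapper. */
static pid_t raw_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

In the main thread of a process, the value returned should match getpid(). Like any libc wrapper, syscall(2) returns -1 and sets errno on failure, although gettid(2) itself is documented as always succeeding.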

Some system calls evolved with time or appeared in later kernels but did not exist ten years ago. The system call numbers (as understood by the kernel) might be listed in some asm/unistd_64.h file (which you probably don't want to include; prefer sys/syscall.h instead). For example, the preadv(2) syscall is redirected to either __NR_preadv or __NR_preadv2 but your libc should be clever enough to do the best it can.

Some new system calls did not exist in old kernels. A recent libc might in that case "emulate" them some other way. But you should trust your libc implementation (and your kernel) most of the time. In practice, libc.so is the cornerstone of your Linux system and distribution (and you had better use it as a shared library and avoid statically linking it, because of nsswitch.conf(5)). If you need to understand in detail how shared libraries work, read Drepper's How to Write Shared Libraries. If you want some gory details about the system call mechanism in userland, see perhaps Assembler HowTo.

In almost all cases, you write somewhat portable C code and use only the functions documented in syscalls(2) (as having a C wrapper) and intro(2).

In practice your shell-like program would use fork(2), execve(2), waitpid(2) etc. All these are specified by POSIX and available (and wrapped) in libc. You could study the source code of some free software shell for inspiration.
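A minimal sketch of that fork / exec / wait cycle, using only the libc wrappers, might look like this. The helper name `run_command` and the choice of 127 as the "exec failed" status are my own assumptions (the latter borrowed from common shell convention), not something mandated by the calls themselves:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given NULL-terminated argument
 * vector and wait for it. Returns the child's exit status, or -1 if the
 * fork or wait failed or the child was killed by a signal. */
static int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image; no shell is involved. */
        execvp(argv[0], argv);
        perror("execvp");   /* only reached if the exec failed */
        _exit(127);
    }
    /* Parent: wait for that child, retrying if a signal interrupts us. */
    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A real shell would add argument parsing, redirections, pipes and job control on top, but the core loop is essentially these three calls.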

For the purpose of C programming on Linux, consider as a system call any function listed in syscalls(2) that has a C wrapper (i.e. almost all of them). So socket(2) or bind(2) is also in practice a system call (even if both internally use socketcall(2), which you won't call directly). Notice that system(3) (a very poorly named function, for historical reasons) is not a system call. It is implemented on top of fork(2), execve(2), signal(2), waitpid(2) etc. and requires /bin/sh.

Basile Starynkevitch
  • Or if the system call is very new, e.g. if your kernel is newer than your libc, there can be new system calls that don't have wrappers *yet*. New kernel / old libc is probably not uncommon on "old stable release" distros on newer hardware, using an updated kernel but old user-space. (Of course, using new Linux-specific system calls is rare in the first place.) – Peter Cordes Jul 01 '18 at 07:28
  • If you're aiming for portability, `#include <asm/unistd.h>`. It will include `asm/unistd_64.h` or `asm/unistd_32.h`, or `asm/unistd_x32.h`, as appropriate. (Or on non-x86 targets, some other header). Linux's fairly strict adherence to "don't break userspace" means that using the old version of a system call should still work, e.g. `__NR_mmap` still works on 32-bit, but the max file offset is only 2^32-1 bytes (not 2^32-1 pages with `mmap2`). Good point that libc will choose the best system call for the target. – Peter Cordes Jul 01 '18 at 08:38
  • If you `#include <asm/unistd_64.h>`, your code will invoke the wrong system calls when compiled with `-m32` (assuming you're using a syscall wrapper macro to take care of the calling convention differences so it doesn't just segfault or SIGILL on a 32-bit `syscall` instruction). – Peter Cordes Jul 01 '18 at 08:39