
I'm investigating the implementation details of seccomp-bpf, the syscall filtering mechanism introduced in Linux 3.5. I read the source of kernel/seccomp.c from Linux 3.10 and have some questions about it.

From seccomp.c, it seems that seccomp_run_filters() is called from __secure_computing() to check the syscall made by the current process. However, inside seccomp_run_filters(), the syscall number passed as an argument is never used.

It seems that sk_run_filter() is the implementation of the BPF filter machine, but seccomp_run_filters() calls sk_run_filter() with its first argument (the buffer to run the filter on) set to NULL.

My question is: how can seccomp_run_filters() filter syscalls without using the argument?

The following is the source code of seccomp_run_filters():

/**
 * seccomp_run_filters - evaluates all seccomp filters against @syscall
 * @syscall: number of the current system call
 *
 * Returns valid seccomp BPF response codes.
 */
static u32 seccomp_run_filters(int syscall)
{
        struct seccomp_filter *f;
        u32 ret = SECCOMP_RET_ALLOW;

        /* Ensure unexpected behavior doesn't result in failing open. */
        if (WARN_ON(current->seccomp.filter == NULL))
                return SECCOMP_RET_KILL;

        /*
         * All filters in the list are evaluated and the lowest BPF return
         * value always takes priority (ignoring the DATA).
         */
        for (f = current->seccomp.filter; f; f = f->prev) {
                u32 cur_ret = sk_run_filter(NULL, f->insns);
                if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION))
                        ret = cur_ret;
        }
        return ret;
}
jopasserat

1 Answer


When a user process enters the kernel, its register set is saved in a kernel data structure. The function sk_run_filter() implements the interpreter for the filter language. The relevant instruction for seccomp filters is BPF_S_ANC_SECCOMP_LD_W. Each instruction carries a constant k, and for this instruction k is the offset of the 32-bit word to be read from the syscall information.

#ifdef CONFIG_SECCOMP_FILTER
            case BPF_S_ANC_SECCOMP_LD_W:
                    A = seccomp_bpf_load(fentry->k);
                    continue;
#endif
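Seen from user space, the k passed to seccomp_bpf_load(fentry->k) is a byte offset into struct seccomp_data. As far as I can tell from kernel/seccomp.c of that era, a filter is written with an ordinary BPF_LD | BPF_W | BPF_ABS load and the kernel rewrites it to BPF_S_ANC_SECCOMP_LD_W when the filter is installed. A minimal sketch of such a filter is below; build_deny_filter and DENY_FILTER_LEN are hypothetical names made up for illustration, not part of any kernel API.

```c
#include <stddef.h>
#include <linux/filter.h>   /* struct sock_filter, BPF_STMT, BPF_JUMP */
#include <linux/seccomp.h>  /* struct seccomp_data, SECCOMP_RET_* */

/* Hypothetical: number of instructions written by build_deny_filter(). */
#define DENY_FILTER_LEN 4

/*
 * Hypothetical helper: fills prog[0..3] with a filter that kills the
 * process on one syscall number and allows everything else.
 * A real filter should also check seccomp_data.arch first.
 */
static void build_deny_filter(struct sock_filter *prog, unsigned int denied_nr)
{
    /* A = seccomp_data.nr; k is the byte offset of the field to load. */
    prog[0] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                           offsetof(struct seccomp_data, nr));
    /* If A == denied_nr, fall through to the KILL; otherwise skip it. */
    prog[1] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                                           denied_nr, 0, 1);
    prog[2] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL);
    prog[3] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW);
}
```

To install it, one would wrap the array in a struct sock_fprog and call prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) followed by prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &fprog). That step is left out here so the sketch has no side effects.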

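The function seccomp_bpf_load() takes no buffer. It reads the saved register set of the current thread (task_pt_regs(current)) and returns the requested piece of system call information, such as the syscall number or one of the arguments. That is why sk_run_filter() can be called with NULL as its first argument, and why the syscall parameter of seccomp_run_filters() goes unused.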
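The idea can be modelled outside the kernel. The following is a toy sketch, not kernel code: every name in it (model_data, model_current, model_load, model_run and the MODEL_* opcodes) is hypothetical, and only two fields of the syscall data are modelled. It shows an interpreter that receives a buffer pointer it never touches, because its load instruction pulls data from per-thread state instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct seccomp_data (two fields only). */
struct model_data {
    int32_t  nr;    /* syscall number */
    uint32_t arch;  /* architecture token */
};

/* Stand-in for the saved user registers of the current thread. */
static struct model_data model_current;

/* Models seccomp_bpf_load(): looks up a field by byte offset. */
static uint32_t model_load(uint32_t off)
{
    if (off == offsetof(struct model_data, nr))
        return (uint32_t)model_current.nr;
    if (off == offsetof(struct model_data, arch))
        return model_current.arch;
    return 0; /* unknown offset; this toy model simply yields 0 */
}

enum model_op { MODEL_LD_CTX, MODEL_JEQ, MODEL_RET };

struct model_insn {
    enum model_op op;
    uint32_t k;      /* offset, comparison value, or return value */
    uint8_t jt, jf;  /* jump distances, relative to the next instruction */
};

/* Models sk_run_filter(): buf is accepted but never dereferenced. */
static uint32_t model_run(const void *buf, const struct model_insn *pc)
{
    uint32_t A = 0; /* accumulator */

    (void)buf;
    for (;; pc++) {
        switch (pc->op) {
        case MODEL_LD_CTX:
            A = model_load(pc->k); /* data comes from thread state */
            break;
        case MODEL_JEQ:
            pc += (A == pc->k) ? pc->jt : pc->jf;
            break;
        case MODEL_RET:
            return pc->k;
        }
    }
}
```

Calling model_run(NULL, prog) after setting model_current.nr behaves like seccomp_run_filters() calling sk_run_filter(NULL, f->insns): the verdict depends on the thread state, not on the buffer argument.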

Juho Östman
Thanks, Juho. I now understand how the instruction BPF_S_ANC_SECCOMP_LD_W tells the interpreter to load syscall information from struct seccomp_data. –  Jan 23 '14 at 10:31