
I'm reading some of the Linux kernel source code. I know little about inline assembly, and I find some of the inline assembly hard to understand.

Take the following for example:

// SPDX-License-Identifier: GPL-2.0-only
/*
 * check_initial_reg_state.c - check that execve sets the correct state
 * Copyright (c) 2014-2016 Andrew Lutomirski
 */

#define _GNU_SOURCE

#include <stdio.h>

unsigned long ax, bx, cx, dx, si, di, bp, sp, flags;
unsigned long r8, r9, r10, r11, r12, r13, r14, r15;

asm (
    ".pushsection .text\n\t"
    ".type real_start, @function\n\t"
    ".global real_start\n\t"
    "real_start:\n\t"
#ifdef __x86_64__
    "mov %rax, ax\n\t"
    "mov %rbx, bx\n\t"
    "mov %rcx, cx\n\t"
    "mov %rdx, dx\n\t"
    "mov %rsi, si\n\t"
    "mov %rdi, di\n\t"
    "mov %rbp, bp\n\t"
    "mov %rsp, sp\n\t"
    "mov %r8, r8\n\t"
    "mov %r9, r9\n\t"
    "mov %r10, r10\n\t"
    "mov %r11, r11\n\t"
    "mov %r12, r12\n\t"
    "mov %r13, r13\n\t"
    "mov %r14, r14\n\t"
    "mov %r15, r15\n\t"
    "pushfq\n\t"
    "popq flags\n\t"
#else
    "mov %eax, ax\n\t"
    "mov %ebx, bx\n\t"
    "mov %ecx, cx\n\t"
    "mov %edx, dx\n\t"
    "mov %esi, si\n\t"
    "mov %edi, di\n\t"
    "mov %ebp, bp\n\t"
    "mov %esp, sp\n\t"
    "pushfl\n\t"
    "popl flags\n\t"
#endif
    "jmp _start\n\t"
    ".size real_start, . - real_start\n\t"
    ".popsection");

int main()
{
    int nerrs = 0;

    if (sp == 0) {
        printf("[FAIL]\tTest was built incorrectly\n");
        return 1;
    }

    if (ax || bx || cx || dx || si || di || bp
#ifdef __x86_64__
        || r8 || r9 || r10 || r11 || r12 || r13 || r14 || r15
#endif
        ) {
        printf("[FAIL]\tAll GPRs except SP should be 0\n");
#define SHOW(x) printf("\t" #x " = 0x%lx\n", x);
        SHOW(ax);
        SHOW(bx);
        SHOW(cx);
        SHOW(dx);
        SHOW(si);
        SHOW(di);
        SHOW(bp);
        SHOW(sp);
#ifdef __x86_64__
        SHOW(r8);
        SHOW(r9);
        SHOW(r10);
        SHOW(r11);
        SHOW(r12);
        SHOW(r13);
        SHOW(r14);
        SHOW(r15);
#endif
        nerrs++;
    } else {
        printf("[OK]\tAll GPRs except SP are 0\n");
    }

    if (flags != 0x202) {
        printf("[FAIL]\tFLAGS is 0x%lx, but it should be 0x202\n", flags);
        nerrs++;
    } else {
        printf("[OK]\tFLAGS is 0x202\n");
    }

    return nerrs ? 1 : 0;
}

This code is pasted from lib/tools/testing/selftests/x86/check_initial_reg_state.c. I cannot understand the `asm (` statement on line 14 of that file, which declares a lot of inline assembly that is not contained in any function. How does this work?

Nicholas
  • The compiler emits the assembly instructions inline. The C code is translated. What is confusing you? – h0r53 Apr 26 '22 at 15:30
  • It appears to be defining a function named `real_start`, entirely in inline assembly. – John Bollinger Apr 26 '22 at 15:30
  • I suppose that the compiler just puts the assembly code into the file, and it's not the compiler's problem how it actually gets called. – user253751 Apr 26 '22 at 15:30
  • Conceptually, the compiler emits assembly. Functions are translated to assembly constructs like labels and instructions. The inline assembly just adds more to this, bypassing the compiler. After that, the assembler produces the binary (machine code plus some structure around that like symbol tables). – dyp Apr 26 '22 at 15:31
  • GNU C inline asm just outputs those lines as text into the `.s` output created by the compiler. It defines a function, `real_start:`. Or actually not a function, the process entry point for this user-space program that runs under the kernel as a test. Same idea as [How Get arguments value using inline assembly in C without Glibc?](https://stackoverflow.com/q/50260855) (where a couple answers point out that it's not *safely* possible without writing asm, since `_start` isn't a function that's called normally). – Peter Cordes Apr 26 '22 at 15:31
  • Thank you! I see. It declares a function. – Nicholas Apr 27 '22 at 02:22
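
As the comments describe, a top-level `asm` statement is copied as text into the compiler's assembly output, so it can define a symbol that C code then refers to by name. A minimal sketch of that idea, assuming GCC or Clang targeting x86 or x86-64 Linux (ELF); the name `asm_answer` and the value 42 are made up for illustration and are not part of the kernel test:

```c
/*
 * Sketch only: assumes GCC or Clang on x86 / x86-64 Linux (ELF).
 * asm_answer is a hypothetical name chosen for this example.
 */

/* Top-level (file-scope) asm: emitted verbatim, outside any C function. */
__asm__(
    ".pushsection .text\n\t"            /* switch to the code section */
    ".type asm_answer, @function\n\t"   /* mark the symbol as a function */
    ".global asm_answer\n\t"            /* make it visible to the linker */
    "asm_answer:\n\t"                   /* the label is the function's address */
    "mov $42, %eax\n\t"                 /* integer return value goes in eax */
    "ret\n\t"
    ".size asm_answer, . - asm_answer\n\t"
    ".popsection");                     /* restore the previous section */

/* The C compiler only needs a declaration; the body lives in the asm above. */
int asm_answer(void);
```

Here `asm_answer` is an ordinary function, so calling `asm_answer()` from C yields 42. `real_start` in the test is different: per the comments it serves as the process entry point, so nothing calls it; it records the registers and then jumps to `_start`.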
