
From my understanding the function kallsyms_lookup_name is not exported by the kernel, but a lot of modules use tricks to access it anyway. I was initially trying to use a module which uses a kprobe to get a pointer to kallsyms_lookup_name, but when the pointer is called I get a segmentation fault. I've boiled that code down to the bare minimum needed to reproduce the error:

typedef unsigned long (*kallsymsFn)(const char *);

// Not part of the problem, just a utility function to log the symbol name of any found functions. The seg fault still happens without this.
static void log_symbol(void *symbol)
{
    char *fname_lookup = kzalloc(NAME_MAX, GFP_KERNEL);
    if (!fname_lookup)
        return;
    sprint_symbol(fname_lookup, symbol);
    printk(KERN_INFO "Got '%s'.\n", fname_lookup);
    kfree(fname_lookup);
}

// Function which initializes the module
static int memflow_init(void)
{
    struct kprobe kp = {0};
    int ret = 0;
    kp.symbol_name = "kallsyms_lookup_name";

    printk(KERN_INFO "Regsitered kprobe.\n");

    ret = register_kprobe(&kp);

    if (ret < 0)
        return ret;

    kallsymsFn kallsyms = (kallsymsFn)kp.addr;
    log_symbol(kallsyms);

    unregister_kprobe(&kp);
    printk(KERN_INFO "Unregsietered kprobe.\n");

    // Seg fault happens when trying to call the found pointer to kallsyms_lookup_name
    log_symbol(kallsyms("kvm_lock"));

    return 0;
}

The full dmesg output can be found here, but these are what I believe to be the important bits:

[  +3.821286] Regsitered kprobe.
[  +0.000815] Got 'kallsyms_lookup_name+0x4/0xd0'.
[  +0.048238] Unregsietered kprobe.
[  +0.000003] traps: Missing ENDBR: kallsyms_lookup_name+0x4/0xd0
...
[  +0.000004]  ? kallsyms_lookup_name+0x4/0xd0
[  +0.000012]  memflow_init+0x82/0xc0 [memflow 21eb138fb4e295ab5c057ad4052116921bf19468]
[  +0.000018]  ? kallsyms_lookup_name+0x4/0xd0
[  +0.000008]  ? __pfx_init_module+0x10/0x10 [memflow 21eb138fb4e295ab5c057ad4052116921bf19468]

As you can see, the function is found fine, but calling it crashes and I don't know why. This clearly works for other people, so there must be something about my setup which causes it to break. My kernel is 6.2.13-arch1-1, and both CONFIG_KALLSYMS=y and CONFIG_KALLSYMS_ALL=y are present in /proc/config.gz.

Things I have tried:

I have spent many hours trying to fix this. Here's what doesn't work:

  • Subtracting 4 (the offset) from the pointer. It seems like kallsyms_lookup_name+0x4/0xd0 has an offset of 4 so I tried removing that offset but to no avail.
  • Trying a completely different method of capturing kallsyms_lookup_name. I tried using this technique instead but it too yields the same results.
  • Different target symbols. There doesn't seem to be anything special about vm_list.

To my untrained eyes it really seems like there's something wrong with the function pointer, but the other technique gave me the same thing so I'm at a loss for what could be causing this.

wxz
Tacodiva
  • "From my understanding the function `kallsyms_lookup_name` is not exported by the kernel, but a lot of modules use tricks to access it anyways." - Yes, a lot of modules, which are written "just for study", completely **ignore** the guidelines instead of following them and try to access a function which is not designed for modules. And you want to write one more such module... – Tsyvarev May 04 '23 at 12:56
  • If you read the question you would see that I'm actually just trying to get an existing module to work on my computer, and I do not want to rewrite an entire project because one of their dirty tricks happens to work on my computer. I'd rather just try to fix the problem. @Tsyvarev – Tacodiva May 04 '23 at 15:21
  • The question does not make it clear that you are trying to get somebody else's module to work. – Ian Abbott May 05 '23 at 15:07
  • Maybe using [`kallsyms_on_each_symbol`](https://elixir.bootlin.com/linux/latest/ident/kallsyms_on_each_symbol) would help locate `kallsyms_lookup_name` function? – VonC May 06 '23 at 07:28
  • Note that, [as mentioned here](https://stackoverflow.com/a/40513836/6309), from the 5.7 kernel onwards `kallsyms_lookup_name` and `kallsyms_on_each_symbol` are no longer exported to loadable kernel modules. So I suppose your kernel is older than that? – VonC May 06 '23 at 07:32
  • @VonC: The question is about the 6.2 kernel. On that kernel `kallsyms_lookup_name` is not accessible to modules, but the code doesn't attempt to call it **directly** (by the function's name). To find the function, the code uses a kprobe, which is available to any module with a GPL-compatible license. – Tsyvarev May 06 '23 at 08:39
  • @Tsyvarev Right, I missed the kernel version in my initial reading. Thank you for the feedback. – VonC May 06 '23 at 08:46
  • Have you looked into [IBT](https://lwn.net/Articles/889475/) which is what your kernel error is about? ([lines 216-255](https://elixir.bootlin.com/linux/v6.2.13/source/arch/x86/kernel/traps.c#L216)) – wxz May 07 '23 at 03:16
  • @wxz I'm not sure how to look into it beyond confirming that the ENDBR64 instruction (`F3 0F 1E FA`) is not present anywhere around `kallsyms_lookup_name`. One line sticks out to me in the link you sent: "It is a common technique for proprietary modules to look up the non-exported functions they need in the kernel's symbol table [...] But, with IBT enabled, any function lacking an endbr instruction will no longer be callable in this way". I'm wondering if ENDBR is being omitted because the function is not exported. Not sure why I would seem to be the only one having this problem, though. – Tacodiva May 07 '23 at 12:04
  • Since IBT is a security feature, the following is purely for testing: you could try disabling IBT and see if your code works. If it does, then you'll need to study the IBT code and figure out if there's a way around it when it is turned on. – wxz May 07 '23 at 16:14
  • @wxz Yes, thank you! Disabling IBT with the kernel parameter `ibt=off` solved the problem! I'd be happy to award the bounty to this answer. According to [this](https://lpc.events/event/2/contributions/147/attachments/72/83/CET-LPC-2018.pdf) it's possible to prefix indirect branch instructions with `notrack` to avoid the check, but I'm not sure how to do that in the module. – Tacodiva May 08 '23 at 01:04

1 Answer


The dmesg you posted showed this error, which gives you the exact line that broke the kernel:

[  +0.000006] ------------[ cut here ]------------
[  +0.000001] kernel BUG at arch/x86/kernel/traps.c:255!
[  +0.000004] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  +0.000002] CPU: 13 PID: 12449 Comm: modprobe Tainted: G     U  W  OE      6.2.13-arch1-1 #1 fa8d27bf98b6325495c4c46c21163862ead484d0
[  +0.000002] Hardware name: Dell Inc. XPS 15 9510/01V4T3, BIOS 1.9.0 03/17/2022
[  +0.000001] RIP: 0010:exc_control_protection+0xc2/0xd0

If you look up that line in the source for your kernel version, you'll see that the BUG() comes from exc_control_protection(), as expected. Further up, on line 216, you'll see that this function sits inside an #ifdef for the IBT security feature:

#ifdef CONFIG_X86_KERNEL_IBT

Purely for testing purposes, you can try to disable IBT in your kernel to get your code working. But ideally, you should figure out how to get your code to work when IBT is on, since it's meant to protect your computer from certain attacks.
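
To make the "Missing ENDBR" trap a bit more concrete: with IBT enabled, an indirect call has to land on an ENDBR64 instruction, whose encoding is the `F3 0F 1E FA` byte sequence mentioned in the comments. The following is only a minimal user-space sketch of that byte comparison, not kernel code; the helper name `starts_with_endbr64` is made up for illustration, and it works on a plain byte buffer rather than on real kernel text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Encoding of ENDBR64, the landing pad IBT expects at indirect-branch targets. */
static const unsigned char ENDBR64_BYTES[4] = { 0xF3, 0x0F, 0x1E, 0xFA };

/*
 * Hypothetical helper (not a kernel API): returns true if the buffer
 * `code` of `len` bytes begins with the ENDBR64 encoding.
 */
static bool starts_with_endbr64(const unsigned char *code, size_t len)
{
    assert(code != NULL);

    /* Too short to contain the 4-byte instruction at all. */
    if (len < sizeof(ENDBR64_BYTES))
        return false;

    return memcmp(code, ENDBR64_BYTES, sizeof(ENDBR64_BYTES)) == 0;
}
```

Given a dump of the first bytes of a function, a check like this shows whether the entry point would be accepted as an indirect-call target. In the question, the bytes at kallsyms_lookup_name apparently do not match, which is consistent with the trap being raised no matter which offset the pointer uses.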

wxz
  • It looks like the IBT feature can be disabled with the `ibt=off` command-line option, or turned into a warning with the `ibt=warn` command-line option. – Ian Abbott May 09 '23 at 12:30
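
When testing with these options, it can help to confirm which one the running kernel was actually booted with. Below is a rough user-space sketch that scans a kernel command-line string (for example, the contents of /proc/cmdline) for an `ibt=` parameter. The enum and function names are invented for this example, and the parsing is a simplification: it splits on whitespace only, ignores quoting, and reports the first `ibt=` token it finds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch; not a kernel definition. */
enum ibt_param {
    IBT_PARAM_ABSENT, /* no ibt= token on the command line */
    IBT_PARAM_OFF,    /* ibt=off */
    IBT_PARAM_WARN,   /* ibt=warn */
    IBT_PARAM_OTHER   /* ibt=<some other value> */
};

/* Scan a whitespace-separated command line for the first "ibt=" token. */
static enum ibt_param ibt_param_from_cmdline(const char *cmdline)
{
    const char *p = cmdline;

    assert(cmdline != NULL);

    while (*p) {
        size_t n;

        p += strspn(p, " \t\n");   /* skip separators */
        n = strcspn(p, " \t\n");   /* length of the current token */

        if (n >= 4 && strncmp(p, "ibt=", 4) == 0) {
            const char *val = p + 4;
            size_t vlen = n - 4;

            if (vlen == 3 && strncmp(val, "off", 3) == 0)
                return IBT_PARAM_OFF;
            if (vlen == 4 && strncmp(val, "warn", 4) == 0)
                return IBT_PARAM_WARN;
            return IBT_PARAM_OTHER;
        }
        p += n;                    /* move on to the next token */
    }
    return IBT_PARAM_ABSENT;
}
```

To use it, read /proc/cmdline into a buffer (for instance with fopen and fgets) and pass the buffer to `ibt_param_from_cmdline`. A result of `IBT_PARAM_ABSENT` only means the option was not given on the command line; it says nothing about whether the CPU or the kernel build supports IBT.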