
Some time ago, I developed a kernel module for ARMv7 (Cortex-A5). The module worked fine, but I needed to add a feature to it. Unfortunately, the machine that had the cross-compiler toolchain installed was repurposed in the meantime, so I had to set the toolchain up again. Of course, I found that I hadn't documented all the details, and I struggled to build a module that would actually load. After a day or so, I managed to compile something that insmod accepted on the target.

But the module does nothing.

insmod does not report any errors. dmesg doesn't show anything either, although the module is supposed to print some info from its probe routine. The character device it's supposed to create does not exist, so the module's probe routine evidently didn't run.

If I run modinfo on both the old .ko (which I still have) and the new .ko, there are no differences. file also shows the same output apart from the SHA-1 BuildID, which is to be expected:

module_old.ko: ELF 32-bit LSB relocatable, ARM, EABI5 version 1 (SYSV), BuildID[sha1]=1f55850e8da4b3b5931536060d62193d94730cf6, not stripped
module_new.ko: ELF 32-bit LSB relocatable, ARM, EABI5 version 1 (SYSV), BuildID[sha1]=c3b1f6fdcb72381beb7b8a766c70af7a1252a78f, not stripped

The only difference I see is that the new .ko is roughly 8.6 times larger than the old one:

-rw-r--r-- 1 root root 154504 Apr 21 23:38 module_new.ko
-rw-rw-r-- 1 root root  17956 Oct  4  2017 module_old.ko
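
As a quick check of that size difference, the ratio can be computed directly from the two files. This is only a sketch: `size_ratio` is a hypothetical helper name, and it relies on nothing beyond `wc` and `awk`.

```shell
#!/bin/sh
# size_ratio NEW OLD: print how many times larger NEW is than OLD,
# rounded to one decimal place. (Hypothetical helper, not part of any tool.)
size_ratio() {
    new_bytes=$(wc -c < "$1") || return 1
    old_bytes=$(wc -c < "$2") || return 1
    awk -v n="$new_bytes" -v o="$old_bytes" \
        'BEGIN { if (o == 0) exit 1; printf "%.1f\n", n / o }'
}

# The same arithmetic with the sizes from the listing above.
awk 'BEGIN { printf "%.1f\n", 154504 / 17956 }'   # prints 8.6
```

On the build host this would be called as `size_ratio module_new.ko module_old.ko`. A jump of this size is often due to debug information rather than code, which is presumably what the `strip` hint in the comments below refers to, but that is an assumption and was not verified for these two files.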

My Makefile is as follows:

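# Out-of-tree module: kbuild builds mymodule.ko from mymodule.c
obj-m += mymodule.o

KERNEL_SOURCE_DIR = /home/ludo/linux-at91-linux4sam_5.3

.PHONY: all clean

# Recipe lines must start with a tab; $(MAKE) passes make's flags to the sub-make.
all:
	$(MAKE) -C $(KERNEL_SOURCE_DIR) CROSS_COMPILE=arm-linux-gnueabihf- ARCH=arm M=$(PWD) modules

clean:
	$(MAKE) -C $(KERNEL_SOURCE_DIR) CROSS_COMPILE=arm-linux-gnueabihf- ARCH=arm M=$(PWD) clean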

I tried adding

CFLAGS_mymodule.o := -march=armv7-a -mtune=cortex-a5

but this made no difference.

I'm building on Debian Jessie (8.10) with gcc 4.9. I do not recall exactly with what GCC version I built the old version, but it was 4.x, not newer.

Any ideas how I can debug this problem?
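
One way to narrow this down on the target is to ask sysfs whether the module is loaded, whether it registered a driver, and whether that driver is bound to a device. The sketch below assumes the usual sysfs layout (`/sys/module/<module>/drivers/<bus>:<driver>` links and device symlinks under `/sys/bus/<bus>/drivers/<driver>/`). `probe_status` is a hypothetical helper name, and its second argument exists only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# probe_status MODULE [SYSFS_ROOT]
# Reports whether MODULE is loaded, which drivers it registered, and which
# devices those drivers are bound to. SYSFS_ROOT defaults to /sys.
probe_status() {
    mod=$1
    sysfs=${2:-/sys}
    moddir="$sysfs/module/$mod"

    if [ ! -d "$moddir" ]; then
        echo "$mod: not loaded"
        return 1
    fi
    echo "$mod: loaded"

    found=0
    for link in "$moddir"/drivers/*; do
        # Skip the unexpanded glob when the directory is missing or empty.
        [ -e "$link" ] || [ -L "$link" ] || continue
        found=1
        name=$(basename "$link")   # e.g. platform:mydriver
        bus=${name%%:*}
        drv=${name#*:}
        echo "  registered driver: $drv (bus $bus)"

        bound=0
        for dev in "$sysfs/bus/$bus/drivers/$drv"/*; do
            # Bound devices show up as symlinks; "module" is a link back
            # to the owning module, not a device.
            [ -L "$dev" ] || continue
            [ "$(basename "$dev")" = "module" ] && continue
            echo "    bound device: $(basename "$dev")"
            bound=1
        done
        [ "$bound" -eq 1 ] || echo "    no device bound: probe was not called or failed"
    done
    [ "$found" -eq 1 ] || echo "  no driver registered"
    return 0
}
```

Running `probe_status mymodule` after insmod would separate "the init function never registered a driver" from "the driver registered but never matched a device", which point to different causes.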

Ludo
  • First verify that the Device Tree that you're booting with (i.e. **/proc/device-tree/**) actually references this driver. See [Driver code in kernel module doesn't execute?](https://stackoverflow.com/questions/26840267/driver-code-in-kernel-module-doesnt-execute) – sawdust Dec 16 '17 at 02:24
  • I actually haven't changed anything in the driver yet. So `module_new.ko` and `module_old.ko` are built from the exact same source file, but `module_new.ko` is built with the new cross-compiler toolchain (and doesn't work), while `module_old.ko` was built with the old toolchain (which I no longer have). I'm testing both on the same board with the same DT. There must be something wrong in the toolchain, since that is the only difference... – Ludo Dec 16 '17 at 21:55
  • Variables (built-in defines) will change when updating the compiler. The code is **NOT** the same. The kernel has many macros that are intimately familiar with GCC versions and features. Many compilers are known to break the kernel; it can be argued ad nauseam whose issue this is. However, the fact is that changing the compiler will change the module code/binary. The 'toolchain' may be the issue, or you might have an issue with kernel/toolchain compatibility. – artless noise Dec 18 '17 at 15:10
  • @artlessnoise If that's true, then does that imply that distributing my source code is pointless, unless I test it against all possible compiler versions? I had always assumed, apparently naively, that if I had the correct kernel sources, C-files and Makefile, I should be able to compile the module (after setting some obvious CONFIG defines, like SMP, mod_versions, etc.). How can I distribute driver sources? – Ludo Dec 20 '17 at 13:26
  • You can distribute the source, but it is typical to test the compiler version. The Kbuild/kconfig should be testing your compiler so a known bad compiler will be flagged. However, they generally don't test every bad compiler in the universe, just known bad ones or really old ones. Since the module is not 'in-tree', it is possible you broke some rules and some change due to compiler defines affected things. I am just saying that is **possible** and **not definitively** your problem. Just keep it in mind... – artless noise Dec 20 '17 at 16:08
  • Oh, you may need to learn how to use the debugging and tracing tools and other facilities of the Linux kernel. For example, *initcall_debug* on the kernel command line shows which modules were initialized and how initialization is going. *ignore_loglevel* lets you see all messages on the console regardless of their level. And so on... As for the size, you need to learn what `strip` means and how it's used. – 0andriy Dec 29 '17 at 13:06
  • @artlessnoise : your remark pushed me in the right direction. I didn't document that one needs to run defconfig for the device (see my answer below). Everything appears to be good now. – Ludo Jan 08 '18 at 10:05

1 Answer


I forgot to document a very important step in the build process. Before building the module, one needs to ensure that the configuration is consistent. This can be done by executing, from the root of the kernel source tree:

make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- sama5_defconfig
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- modules

Then the module can be compiled and all is good.
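
To make this step harder to forget, a small pre-build check can confirm that the kernel tree has been configured and built before the module's own Makefile is run. This is a sketch, not an official kbuild feature: `check_kernel_tree` is a hypothetical helper, and the file list assumes the generated-file layout of 4.x-era kernel trees.

```shell
#!/bin/sh
# check_kernel_tree DIR
# Reports whether DIR contains the files that configuring and building the
# kernel normally leaves behind. The list is an assumption based on 4.x-era
# trees. Returns non-zero if any of them is missing.
check_kernel_tree() {
    tree=$1
    status=0
    for f in \
        .config \
        include/config/auto.conf \
        include/generated/autoconf.h \
        include/generated/utsrelease.h \
        Module.symvers
    do
        if [ -e "$tree/$f" ]; then
            echo "ok       $f"
        else
            echo "missing  $f"
            status=1
        fi
    done
    return $status
}
```

Calling `check_kernel_tree /home/ludo/linux-at91-linux4sam_5.3` before building the module would flag a freshly unpacked tree in which `sama5_defconfig` and `make modules` have not been run yet.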

Ludo