
This is more of an overview question than a technical one. I can see Linux kernel developer positions around, and I wonder: what would you want to be a kernel module? What kind of tasks are best done as a kernel module compared to using syscalls and doing stuff?

`less /proc/modules` on my system shows `dm_log`, a logger for device-mapper. Why would logging be done from the kernel, rather than doing it in userspace?

sawdust
  • Just so you're aware, the device-mapper is a kernel component. So if we're going to export log messages from it, we would have to do it in the kernel (because the kernel is the thing producing the logged messages) – Bill Lynch Oct 08 '14 at 13:58

3 Answers


what would you want to be a kernel module?

Although most people associate kernel modules (only) with device drivers, other kernel services such as filesystems and network protocol handlers can also be built as modules.
The primary rationale for a kernel module versus static linkage (i.e. built into the kernel) is runtime configurability (which in turn improves memory efficiency). Optional features, services and drivers can be left out of the kernel that is booted, but can still be loaded later when needed.
The overhead of loading the module, the storage space required by the module (the kernel is usually stored in compressed form whereas the module is not compressed) and the fragment of a memory page wasted by each module are usually considered acceptable tradeoffs.
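
As a concrete illustration, here is a minimal sketch of a loadable module; the file name and log messages are made up, but the skeleton is the standard one:

```c
/* hello.c - minimal loadable kernel module sketch.
 *
 * Build out of tree with a one-line Kbuild makefile ("obj-m += hello.o"):
 *   make -C /lib/modules/$(uname -r)/build M=$PWD modules
 * Load/unload with insmod hello.ko and rmmod hello; while loaded it
 * shows up in /proc/modules just as dm_log does.
 */
#include <linux/module.h>
#include <linux/init.h>

static int __init hello_init(void)
{
	pr_info("hello: loaded at runtime, not linked into the kernel\n");
	return 0;		/* a non-zero return would abort the load */
}

static void __exit hello_exit(void)
{
	pr_info("hello: unloaded, its memory returned to the kernel\n");
}

module_init(hello_init);
module_exit(hello_exit);

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Minimal loadable module example");
```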

What kind of tasks are best done as a kernel module compared to using syscalls and doing stuff?

This is a separate question, not really related to the previous one. You should actually be comparing user mode versus kernel mode. Whether the kernel-mode code is in a module (that has to be loaded) or statically linked (always available) is not a significant aspect of the actual question. (The other answer's mention of "a small performance penalty when running module code due to the virtual memory overhead" is incorrect.)

A user-mode service or driver has the advantages of:

  • Usually easier and quicker to implement (no need to build and install the kernel). Most C programmers learn (only) with the C runtime library, so the kernel environment, as opposed to user mode, could be a learning experience.

  • Easier to control proprietary source code. May be exempt from the GNU GPL.

  • The restricted privileges are less likely to inadvertently take down the system or create a security hole.
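
For contrast, the question's "using syscalls and doing stuff" approach is just ordinary user-mode programming against whatever interface the kernel already exposes. A minimal sketch (the choice of /dev/urandom is arbitrary; any readable device node would do):

```c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	char buf[64];

	/* Plain syscalls from user mode: open a device node that the
	 * kernel exposes and read from it; no kernel code of our own. */
	int fd = open("/dev/urandom", O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	ssize_t n = read(fd, buf, sizeof(buf));
	if (n < 0)
		perror("read");
	else
		printf("read %zd bytes from /dev/urandom\n", n);

	close(fd);
	return 0;
}
```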

A kernel-mode service or driver has the advantages of:

  • Availability to more than one program in the system without messy exclusion locks.

  • Device accessibility can be controlled by file permissions (sketched in the character-device example after this list).

  • Status of the device or service is continuous and available as long as the system is running. A user-mode driver would have to reset the device into a known quiescent state every time the program started.

  • More consistent/accurate timers and reduced event latency.
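
To make the file-permissions point concrete, here is a hedged sketch of a kernel-mode character device that publishes its access policy in the device node it creates. The device name and message text are made up; `miscdevice` and its `.mode` field are real kernel interfaces (the `.mode` field needs a reasonably recent kernel):

```c
#include <linux/module.h>
#include <linux/miscdevice.h>
#include <linux/fs.h>

static ssize_t demo_read(struct file *f, char __user *buf,
			 size_t len, loff_t *off)
{
	static const char msg[] = "hello from kernel mode\n";

	/* Copy a canned message to the caller, honoring offset and length. */
	return simple_read_from_buffer(buf, len, off, msg, sizeof(msg) - 1);
}

static const struct file_operations demo_fops = {
	.owner = THIS_MODULE,
	.read  = demo_read,
};

static struct miscdevice demo_dev = {
	.minor = MISC_DYNAMIC_MINOR,
	.name  = "demo_dev",	/* created as /dev/demo_dev */
	.fops  = &demo_fops,
	.mode  = 0444,		/* readable by everyone, writable by no one */
};

static int __init demo_init(void)
{
	return misc_register(&demo_dev);
}

static void __exit demo_exit(void)
{
	misc_deregister(&demo_dev);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
```

Any number of programs can open /dev/demo_dev concurrently, and the kernel enforces the 0444 policy itself; no user-space locking protocol has to be agreed on.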

An example of user mode versus kernel mode is Reliable Datagram Sockets (RDS), which was added to the Linux kernel in version 2.6.30. Previously RDS was a user-mode protocol based on UDP but enhanced with "acking / windowing / fragmenting / re-ordering, etc". Under heavy loads the timers of the user-mode protocol were not accurate, so additional retransmissions and dropped messages caused stability and performance issues. A switch to a kernel-mode network protocol was intended to improve or solve such issues.
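
From user space the kernel-mode protocol still looks like an ordinary datagram socket. This sketch only runs where the rds module is loaded, and the loopback address and port are placeholders:

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

#ifndef AF_RDS
#define AF_RDS 21	/* from <linux/socket.h>; older libcs may lack it */
#endif

int main(void)
{
	int fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
	if (fd < 0) {
		perror("socket(AF_RDS)");	/* fails unless rds is loaded */
		return 1;
	}

	/* RDS sockets are bound to an IP address and port, like UDP. */
	struct sockaddr_in laddr = {
		.sin_family = AF_INET,
		.sin_addr.s_addr = inet_addr("127.0.0.1"),
		.sin_port = htons(4000),
	};
	if (bind(fd, (struct sockaddr *)&laddr, sizeof(laddr)) < 0) {
		perror("bind");
		close(fd);
		return 1;
	}

	/* Send one datagram to ourselves; acking, windowing and
	 * re-ordering now happen in kernel mode behind this call. */
	struct sockaddr_in daddr = laddr;
	if (sendto(fd, "ping", 4, 0,
		   (struct sockaddr *)&daddr, sizeof(daddr)) < 0)
		perror("sendto");

	close(fd);
	return 0;
}
```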

However there are pitfalls (or a responsibility) to kernel mode. Kernel code is responsible for ensuring the integrity and security of the system. RDS was found to introduce a security hole in the kernel.

Why would logging be done from the kernel, rather than doing it in userspace?

There could be several reasons, but in this example it is likely that the log requestors are themselves in kernel mode rather than user mode, so logging in the kernel avoids awkward mode switching.

sawdust

The system call, device driver, /sys, /proc and mmap user APIs give userspace programs only a limited, highly controlled interface to the symbols and data in the kernel. These interfaces do not expose much of the data and events that you would want to log using a driver such as `dm_log`. To export that data to userspace, you need a kernel-mode driver, either compiled in or built as a loadable kernel module.

In terms of interrupt handling, system calls, device files, and the /proc and /sys filesystems (that is, the userspace API), there is no difference between kernel code and module code.
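
For example, a module can export internal kernel state through /proc exactly as compiled-in code would. A hedged sketch follows; the entry name demo_stats is made up, and the `proc_ops` structure is the interface used since kernel 5.6 (older kernels pass a `file_operations` instead):

```c
#include <linux/module.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/jiffies.h>

/* Report one piece of kernel-internal state that userspace cannot
 * read directly: the current jiffies tick counter. */
static int demo_show(struct seq_file *m, void *v)
{
	seq_printf(m, "jiffies: %lu\n", jiffies);
	return 0;
}

static int demo_open(struct inode *inode, struct file *file)
{
	return single_open(file, demo_show, NULL);
}

static const struct proc_ops demo_ops = {
	.proc_open    = demo_open,
	.proc_read    = seq_read,
	.proc_lseek   = seq_lseek,
	.proc_release = single_release,
};

static int __init demo_init(void)
{
	/* Creates /proc/demo_stats, world-readable. */
	return proc_create("demo_stats", 0444, NULL, &demo_ops) ? 0 : -ENOMEM;
}

static void __exit demo_exit(void)
{
	remove_proc_entry("demo_stats", NULL);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
```

`cat /proc/demo_stats` then reads the value on demand, whether this code was loaded as a module or compiled into the kernel.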

Here are some reasons to use Linux kernel module technology:

  1. Moving code from the kernel to a module reduces the size of the kernel
  2. Reducing the size of the kernel reduces the kernel boot time - the module is loaded during the userspace boot time, or afterwards, if and when required
  3. Module code is loaded into kernel-space virtual memory using vmalloc, so it makes more efficient use of kernel memory: the allocated memory is freed if the module is unloaded, whereas the memory for unused drivers compiled into the kernel can never be freed
  4. Modules allow adding kernel functionality without recompiling, re-installing or even rebooting the kernel. This usually shortens development time.
  5. Modules allow addition of functionality at an arbitrary later time
  6. Moving functionality from the kernel to modules allows flexible provisioning - the same kernel can be used in more types of system, or in the same system as its peripheral devices change over time, with less unnecessary code per system (module parameters, sketched after this list, extend this load-time flexibility)
  7. New functionality can be distributed as a module file that is easier to transport and install
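
Related to points 4 and 6: a module binary can additionally be tuned per system at load time through module parameters, without recompiling anything. A minimal hedged sketch (the module name and the buffer_size parameter are invented for illustration):

```c
#include <linux/module.h>
#include <linux/moduleparam.h>

/* Load-time tunable, e.g.:  insmod tunable_demo.ko buffer_size=8192
 * With permissions 0444 it is also visible read-only under
 * /sys/module/tunable_demo/parameters/buffer_size. */
static int buffer_size = 4096;
module_param(buffer_size, int, 0444);
MODULE_PARM_DESC(buffer_size, "Size of the demo buffer in bytes");

static int __init tunable_init(void)
{
	pr_info("tunable_demo: buffer_size=%d\n", buffer_size);
	return 0;
}

static void __exit tunable_exit(void)
{
	pr_info("tunable_demo: unloaded\n");
}

module_init(tunable_init);
module_exit(tunable_exit);
MODULE_LICENSE("GPL");
```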

The disadvantages to using modules are:

  1. If you need to write code that executes in a different processor mode, such as ARM FIQ mode, then you need to compile that code separately using a "bare metal" toolchain and embed the compiled code in the module. That bare-metal code will not be able to access kernel symbols, because it executes in `vmalloc`ed memory whose base address is not knowable at build time. OTOH you can write FIQ-mode code and compile it into the kernel during the regular kernel build, and that code can access kernel symbols when it runs in FIQ mode, because virtual and real addresses in the base kernel are the same.
  2. There is a small performance penalty when running module code due to the virtual memory overhead

Regarding userspace drivers, if that was really the intent of the OP - userspace drivers are just that, programs that run in userspace only. As such they have no more access to kernel symbols or memory than any other userspace application, which is to say, not much.

Jonathan Ben-Avraham
    This addresses the reasons to use a module vs static linking into the kernel, but I think the original question was asking for reasons to use a module vs a userspace device driver ("syscalls and doing stuff") – Adrian Cox Oct 08 '14 at 13:32
  • @AdrianCox: The original wording of the OP does not indicate to me an awareness of the userspace driver concept. In any event, I added specific reference to the last part of the OP question regarding `dm_log`. – Jonathan Ben-Avraham Oct 08 '14 at 13:39
  • *"3. Module code is loaded into virtual memory"* -- The entire kernel is in virtual memory. What are you writing about? – sawdust Oct 08 '14 at 19:00
  • @sawdust: I added some clarification to my answer. Please review. The intent was to say that modules are loaded in space that is `vmalloc`ed as opposed to compiled-in drivers whose real and virtual addresses are the same. The fact that base kernel real and virtual addresses are the same is what allows FIQ mode code compiled into the base kernel to access kernel symbols without the complicated run-time fix-up needed when FIQ code is executed from `vmalloc`ed memory. For people who work mostly in kernel space "virtual memory" means `vmalloc`ed memory. Sorry for the mixup. – Jonathan Ben-Avraham Oct 09 '14 at 16:47
  • Your answer is still almost entirely about something which was not asked, with minimal coverage of the actual question topic. – Chris Stratton Oct 09 '14 at 16:50
  • @ChrisStratton: I re-arranged the paragraphs to explain at the beginning that you can't get kernel information using the userspace API that you can get using a module or compiled-in driver that has unfettered access to the kernel symbols. Thanks. – Jonathan Ben-Avraham Oct 09 '14 at 17:03
  • *"For people who work mostly in kernel space "virtual memory" means vmalloced memory."* -- Maybe in your shop, because that's news to me. Never heard that meaning/usage/definition for "virtual memory" in a decade's worth of *nix experience, including 5 years of kernel development at a major UNIX company. – sawdust Oct 13 '14 at 18:18
  • @sawdust: Here are several examples of what I intended, where the context of the discussion is the kernel and the term "virtual memory" is used to refer to `vmalloc`ed memory: https://www.kernel.org/doc/gorman/html/understand/understand010.html, http://lkml.iu.edu/hypermail/linux/kernel/0011.0/0205.html, https://groups.google.com/forum/#!topic/linuxkernelnewbies/MBOSFRu5EpE, http://stackoverflow.com/questions/116343/what-is-the-difference-between-vmalloc-and-kmalloc – Jonathan Ben-Avraham Oct 14 '14 at 20:33
  • None of those links seem to use "virtual memory" as you do, but use it in a conventional sense. The third link doesn't even contain the words "virtual memory" in a sentence. Only you seem to insist that "virtual memory" means vmalloced memory. – sawdust Oct 15 '14 at 23:50

Kernel modules are usually implemented for drivers of peripherals that may not be present on a given platform. That way, if the peripheral is absent, you don't spend memory or CPU time on unused code.

Claudio