
I am converting my user-space application into a kernel module. The module uses about 200 MB of memory, all of it allocated with vmalloc(). This memory holds recursive hash tables, and I use a recursive function to delete them. The number of recursive calls can run into the thousands, since the total memory used is 200 MB and each hash table is only 256*8 bytes.

The problem is that when I delete the hash tables with the recursive function, the kernel sometimes crashes. The same application runs perfectly in user space and never crashes. After some searching, I think this could be a stack overflow caused by the recursion. To confirm or isolate the issue, is there any way to view the current stack usage of the kernel module/process?

  • Making thousands of recursive calls in kernel-space doesn't sound like a good design-decision to me. Does it really *need* to be in kernel-space? What is the use-case? What is the original problem you are trying to solve? Can't you split it up more, so the recursion takes place in user-space and only the bare minimum is in kernel space? – Some programmer dude May 24 '16 at 06:29
  • The application analyzes Ethernet packets. The requirement is that the entire application should be a kernel module linked with an Ethernet driver, so that we can analyze packets at line rate. For the analysis, packets are stored in recursive hash tables. BTW, can you please explain why recursion in the kernel is not a good design? – Karthik Raj Palanichamy May 24 '16 at 09:34

1 Answer


There's a debug option called CONFIG_DEBUG_STACKOVERFLOW which might help you catch stack overflows before they bite you and crash your kernel.

However, this is implemented in an arch-specific way inside the interrupt handling code. As of v4.6, the architectures that support it are:

$ git grep "select HAVE_DEBUG_STACKOVERFLOW" v4.6
v4.6:arch/arc/Kconfig:  select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/blackfin/Kconfig:     select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/frv/Kconfig:  select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/m32r/Kconfig: select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/metag/Kconfig:        select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/mips/Kconfig: select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/mn10300/Kconfig:      select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/openrisc/Kconfig:     select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/parisc/Kconfig:       select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/powerpc/Kconfig:      select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/tile/Kconfig: select HAVE_DEBUG_STACKOVERFLOW
v4.6:arch/x86/Kconfig:  select HAVE_DEBUG_STACKOVERFLOW

Notably, ARM is not on that list.

It is interesting to see how simple the check of the current stack depth is. This is the 32-bit x86 version:

/* Debugging check for stack overflow: is there less than 1KB free? */
static int check_stack_overflow(void)
{
        long sp;

        __asm__ __volatile__("andl %%esp,%0" :
                             "=r" (sp) : "0" (THREAD_SIZE - 1));

        return sp < (sizeof(struct thread_info) + STACK_WARN);
}
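
The arithmetic behind that check can be tried in user space. The sketch below assumes a stack that is aligned to its own power-of-two size and grows downward, with thread_info at its lowest address, which is the layout the kernel snippet relies on. The DEMO_* constants and demo_* function names are made up for illustration and are not the kernel's values or API.

```c
#include <stdint.h>

/* Illustrative values only; the real ones depend on arch and config. */
#define DEMO_THREAD_SIZE      8192u  /* assumed stack size, a power of two */
#define DEMO_THREAD_INFO_SIZE 64u    /* assumed sizeof(struct thread_info) */
#define DEMO_STACK_WARN       1024u  /* warn when less than 1 KB is free */

/* Offset of sp from the lowest address of its size-aligned stack. */
static uintptr_t demo_stack_offset(uintptr_t sp)
{
        return sp & (DEMO_THREAD_SIZE - 1);
}

/* Nonzero when sp is within DEMO_STACK_WARN bytes of thread_info. */
static int demo_stack_low(uintptr_t sp)
{
        return demo_stack_offset(sp) <
               DEMO_THREAD_INFO_SIZE + DEMO_STACK_WARN;
}
```

Because the stack grows downward, a small offset means the stack pointer is close to the bottom of the stack, so little space is left.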

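Now, regarding your question about recursion: it is a poor design choice in kernel code, for precisely the reason you are running into. In the kernel, each thread's stack has a small fixed size (on the order of 8 KB or 16 KB on common architectures) and does not grow on demand the way a user-space stack does, which is why the same code can survive in user space. Recursion is usually stack-hungry and can easily overflow a stack that small.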

Keep in mind that, in theory, a recursive algorithm can always be converted to an iterative one, as explained here: Can every recursion be converted into iteration?
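
For the hash table case, one way to do that conversion is to keep the pending tables on an explicit, heap-allocated work stack instead of the call stack. The following is only a minimal user-space sketch: the struct table layout (256 pointer slots, every non-NULL slot pointing to a child table) is an assumption, since the question does not show the real structure, and freed_tables exists only so the behaviour can be checked. In a module, malloc/realloc/free would have to be replaced by kernel allocators such as vmalloc()/vfree().

```c
#include <stdlib.h>

#define SLOTS 256

/* Hypothetical layout: every non-NULL slot points to a child table. */
struct table {
        struct table *slot[SLOTS];
};

/* Counts frees for demonstration; not needed in real code. */
static unsigned long freed_tables;

/*
 * Free a whole tree of tables without recursion. Pending tables live on
 * a heap-allocated work stack, so call-stack usage stays constant.
 * Returns 0 on success, -1 if the work stack could not be allocated or
 * grown (in that case the tables not yet freed are leaked).
 */
static int table_destroy(struct table *root)
{
        size_t cap = 64, top = 0;
        struct table **work;

        if (!root)
                return 0;
        work = malloc(cap * sizeof(*work));
        if (!work)
                return -1;
        work[top++] = root;

        while (top > 0) {
                struct table *t = work[--top];

                /* Queue every child before releasing the parent. */
                for (int i = 0; i < SLOTS; i++) {
                        if (!t->slot[i])
                                continue;
                        if (top == cap) {
                                struct table **bigger =
                                        realloc(work, 2 * cap * sizeof(*work));
                                if (!bigger) {
                                        free(work);
                                        return -1;
                                }
                                work = bigger;
                                cap *= 2;
                        }
                        work[top++] = t->slot[i];
                }
                free(t);
                freed_tables++;
        }
        free(work);
        return 0;
}
```

Calling table_destroy(root) frees the root and everything reachable from it. The work stack grows with the number of pending tables rather than with nesting depth, so a real implementation would need to decide how to handle an allocation failure partway through.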

Ezequiel Garcia