I've seen bits of scattered information all around, but I can't seem to get to one final answer. How do you clean up a zombie thread in the kernel?

Just to make sure, and produce a final correct way of handling threads in kernel, I would like to ask this question more broadly. How do you create, terminate and clean up a thread in the Linux kernel?

What I have so far is this:

thread_func:
    exited = 0;
    while (!must_exit)
        do stuff
    exited = 1;
    do_exit(0)

init_module:
    must_exit = 0;
    exited = 1;
    kthread_run(thread_func, ...)    /* creates and runs the thread */

cleanup_module:
    must_exit = 1;
    while (!exited)
        set_current_state(TASK_INTERRUPTIBLE);
        msleep(1);
    /* How do I cleanup? */

The closest thing I have found to the cleanup solution is release_task, but I didn't find anywhere talking about it. I imagined since the thread functions are kthread_create, kthread_run etc, there should be a kthread_join or kthread_wait, but there wasn't. do_wait also seemed likely, but it doesn't take a struct task_struct *.

Furthermore, I am not sure if do_exit is a good idea, or if at all necessary either. Can someone please come up with the minimum sketch of how a kthread should be created, terminated and cleaned up?

Shahbaz
  • I seem to remember that there is a kthread_stop, or kthread_should_stop, something like that. – Martin James Apr 16 '12 at 16:22
  • @MartinJames, the way I understood, you either exit yourself (using `do_exit`) or poll `kthread_should_stop` until someone (`cleanup_module`) calls `kthread_stop`. I didn't find anywhere saying whether `kthread_stop` also cleans up the thread or not. What makes me wonder is that, if people (on the internet) suggest using either `do_exit` or whatever, shouldn't there be a way to cleanup the thread after `do_exit`? – Shahbaz Apr 16 '12 at 16:44
  • By the way, [this](http://lwn.net/Articles/65178/) is what I mean when I say I can't reach a conclusive answer. There is a lot of conflicting stuff out there. – Shahbaz Apr 16 '12 at 16:52

1 Answer

One of the "right" ways to do this is to have your thread function check kthread_should_stop(), and simply return once it reports that the thread needs to stop.

You don't need to call do_exit, and if you intend to kthread_stop it from the module exit function, you probably shouldn't.
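As a minimal sketch of that pattern (kernel code, so it only builds as a module against kernel headers; I haven't tested it against any particular kernel version): the names demo_thread, demo_fn, demo_init and demo_exit are made up for illustration, and the 100 ms sleep is just a stand-in for "do stuff".

```c
/* Hypothetical module: every demo_* name and the 100 ms sleep are placeholders. */
#include <linux/delay.h>
#include <linux/err.h>
#include <linux/init.h>
#include <linux/kthread.h>
#include <linux/module.h>
#include <linux/sched.h>

static struct task_struct *demo_thread;

static int demo_fn(void *data)
{
	/* Loop until someone calls kthread_stop() on this thread. */
	while (!kthread_should_stop()) {
		/* do stuff */
		msleep(100);	/* stopping may be delayed by up to one sleep period */
	}

	/* Just return: the value is handed back to the kthread_stop() caller. */
	return 0;
}

static int __init demo_init(void)
{
	/* kthread_run() creates the thread and wakes it up. */
	demo_thread = kthread_run(demo_fn, NULL, "demo_thread");
	if (IS_ERR(demo_thread))
		return PTR_ERR(demo_thread);
	return 0;
}

static void __exit demo_exit(void)
{
	/* Asks the thread to stop, wakes it, and waits until it has exited. */
	kthread_stop(demo_thread);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
```

Note that demo_fn never calls do_exit; it only returns, and the module exit function is the only place that calls kthread_stop.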

You can see this by looking at the documentation for kthread_create_on_node in kernel/kthread.c (extract from Linux kernel 3.3.1):

/**
* kthread_create_on_node - create a kthread.
* @threadfn: the function to run until signal_pending(current).
* @data: data ptr for @threadfn.
* @node: memory node number.
* @namefmt: printf-style name for the thread.
*
* Description: This helper function creates and names a kernel
* thread. The thread will be stopped: use wake_up_process() to start
* it. See also kthread_run().
*
* If thread is going to be bound on a particular cpu, give its node
* in @node, to get NUMA affinity for kthread stack, or else give -1.
* When woken, the thread will run @threadfn() with @data as its
* argument. @threadfn() can either call do_exit() directly if it is a
* standalone thread for which no one will call kthread_stop(), or
* return when 'kthread_should_stop()' is true (which means
* kthread_stop() has been called). The return value should be zero
* or a negative error number; it will be passed to kthread_stop().
*
* Returns a task_struct or ERR_PTR(-ENOMEM).
*/

A "matching" comment is present for kthread_stop:

If threadfn() may call do_exit() itself, the caller must ensure task_struct can't go away.

(And I'm not sure how you do that - probably by holding on to the task_struct with a get_task_struct.)
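A rough sketch of what that could look like, under the assumption that taking the reference before the thread is first woken is enough to keep the task_struct alive until kthread_stop: worker, start_worker and stop_worker are hypothetical names, this is untested kernel code, and details may differ between kernel versions.

```c
/* Hypothetical helpers: worker, start_worker and stop_worker are made-up names. */
#include <linux/err.h>
#include <linux/kthread.h>
#include <linux/sched.h>	/* newer kernels declare get_task_struct() in <linux/sched/task.h> */

static struct task_struct *worker;

static int start_worker(int (*fn)(void *), void *data)
{
	/* kthread_create() leaves the new thread stopped, so it cannot exit yet. */
	struct task_struct *t = kthread_create(fn, data, "worker");

	if (IS_ERR(t))
		return PTR_ERR(t);

	get_task_struct(t);	/* pin the task_struct before the thread can run */
	worker = t;
	wake_up_process(t);	/* now let it run; it may finish on its own */
	return 0;
}

static int stop_worker(void)
{
	/* The pointer stays valid because of the reference taken above. */
	int ret = kthread_stop(worker);

	put_task_struct(worker);	/* drop our reference */
	return ret;
}
```

The point of using kthread_create plus wake_up_process instead of kthread_run here is ordering: the reference is taken while the thread is still guaranteed not to have exited.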

If you walk the path of a thread creation you'll get something like:

kthread_create                                           // macro in kthread.h
  -> kthread_create_on_node                              // in kthread.c
    -> adds your thread request to kthread_create_list
    -> wakes up the kthreadd_task

kthreadd_task is set up in init/main.c in rest_init. It runs the kthreadd function (from kthread.c).

kthreadd                                                 // all in kthread.c
  -> create_kthread
    -> kernel_thread(kthread, your_kthread_create_info, ...)

And the kthread function itself does:

kthread
  -> initialization stuff
  -> schedule() // allows you to cancel the thread before it's actually started
  -> if (!should_stop)
    -> ret = your_thread_function()
  -> do_exit(ret)

... So if your_thread_function simply returns, do_exit will be called with its return value. No need to do it yourself.
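To make that flow concrete, here is a small userspace model of it, with pthreads standing in for kernel threads. It is not kernel code: every model_* name is made up, the started flag exists only so a caller can tell the thread function has begun, and returning from the pthread start routine plays the role of do_exit(ret).

```c
/* Userspace model only: pthreads stand in for kernel threads, all names are hypothetical. */
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

struct model_kthread {
	pthread_t tid;
	atomic_int should_stop;	/* models the flag behind kthread_should_stop() */
	atomic_int started;	/* model-only: set once threadfn is about to run */
	int (*threadfn)(struct model_kthread *);
};

/* Models the kthread() wrapper: run threadfn unless already stopped. */
static void *model_wrapper(void *arg)
{
	struct model_kthread *k = arg;
	int ret = -EINTR;	/* result if stopped before threadfn ever ran */

	if (!atomic_load(&k->should_stop)) {
		atomic_store(&k->started, 1);
		ret = k->threadfn(k);
	}
	return (void *)(intptr_t)ret;	/* models do_exit(ret) */
}

/* Models kthread_run(): create the thread and let it run. */
static int model_run(struct model_kthread *k,
		     int (*fn)(struct model_kthread *))
{
	atomic_init(&k->should_stop, 0);
	atomic_init(&k->started, 0);
	k->threadfn = fn;
	return pthread_create(&k->tid, NULL, model_wrapper, k);
}

/* Models kthread_stop(): set the flag, wait for exit, return the exit value. */
static int model_stop(struct model_kthread *k)
{
	void *ret;

	atomic_store(&k->should_stop, 1);
	pthread_join(k->tid, &ret);
	return (int)(intptr_t)ret;
}

/* Example thread function: polls the flag, then simply returns. */
static int model_worker(struct model_kthread *k)
{
	while (!atomic_load(&k->should_stop))
		sched_yield();
	return 7;
}
```

Calling model_run(&k, model_worker), waiting for k.started, and then calling model_stop(&k) hands back the 7 that model_worker returned. If model_stop wins the race and sets the flag before the wrapper checks it, the thread function never runs and the result is -EINTR instead, which mirrors the "cancel the thread before it's actually started" step above.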

Mat
  • Well, the task struct is a global variable, so it can't go anywhere. But, does this mean if the standalone thread calls `do_exit()` (and therefore shouldn't call `kthread_stop`) it wouldn't need cleanup? – Shahbaz Apr 16 '12 at 17:31
  • It can go places. If the task to which that task struct refers is completely done, and the task struct is freed by the exit path, the copy you have in your module data is just like a dangling pointer - you can't use it. – Mat Apr 16 '12 at 17:40
  • And yes, if you don't intend to `kthread_stop` your thread, it can call `do_exit` and normal cleanup will happen. _But_ if somehow your thread manages to outlive your module, you're in trouble. – Mat Apr 16 '12 at 17:44
  • Alright, if the cleanup is done, it's fine. I have made sure in `cleanup_module` to wait until the threads have returned. I have been bitten by it already. – Shahbaz Apr 16 '12 at 18:00
  • Just to make sure, `kthread_stop` does wait for the thread to complete, too. It's a pretty nice wrapper. But I guess it might not be well suited to your use-case. – Mat Apr 16 '12 at 18:03
  • Ouch! Why wouldn't it wait? Doesn't the `wait_for_completion` do that? – Shahbaz Apr 16 '12 at 18:06