What happens to a detached thread inside a forked process when the process dies?

Question

My program forks every time it has to deal with something, and in every forked process I detach a thread in order to log stats from the forked process: this thread loops to collect data, but it has no actual condition to stop this loop.

I read in "What happens to a detached thread when main() exits?" that:

As already stated, any thread, whether detached or not, will die with its process on most OSes.

In my program I give the looping thread no stopping condition, since when the process that spawned it dies, the detached thread dies with it. Still, I felt I was taking something for granted, so I wrote the following code to isolate my doubt and strip out everything superfluous from my original program.

In this code, every forked process spawns a thread which will print some numbers.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <thread>

void threadFoo(int id) {
    int i=0;

    // this loop will simulate some stats collecting
    while (i<1000000) {
        printf("[%d]%d \t", id, i);
        ++i;
    }
    printf("\n\n\n");
    return;
}

void forkFoo(int id) {
    std::thread t(threadFoo, id);
    t.detach();
    printf("PID %d detached thread\n", getpid());
    return;
}


int main(void) {

    int i;
    pid_t pid;

    for (i=0; i<3; i++) {
        pid = fork();
        if (pid == 0) {
            forkFoo(i);
            // this sleep will simulate some work
            sleep(1);
            // pass the PID explicitly: the %d conversion needs a matching argument
            printf("Proc %d about to terminate...even its detached thread?\n", getpid());
            _exit(EXIT_SUCCESS);
        }
        else if(pid > 0) {
            // wait for all children to terminate
            wait(NULL);
        }
    }

    printf("main() about to terminate...\n");
}

The output of the program confirmed that every thread dies with its process

PID 13476 detached thread
[0]0    [0]1    [0]2    [0]3  ...
... [0]48940    [0]48941    Proc 13476 about to terminate...even its detached thread?
PID 13478 detached thread
[1]0    [1]1    [1]2    [1]3 ... [1]42395   [1]42396    Proc 13478 about to terminate...even its detached thread?
PID 13480 detached thread
[2]0    [2]1    [2]2    [2]3 ...
... [2]41664    [2]41665    Proc 13480 about to terminate...even its detached thread?
main() about to terminate...

Some doubts were raised when I ran this program with valgrind --leak-check=full --show-leak-kinds=all: when every forked process dies, valgrind shows some creepy output (13534 is forked process PID):

==13534== HEAP SUMMARY:
==13534==     in use at exit: 352 bytes in 2 blocks
==13534==   total heap usage: 2 allocs, 0 frees, 352 bytes allocated
==13534== 
==13534== 64 bytes in 1 blocks are still reachable in loss record 1 of 2
==13534==    at 0x4C2B145: operator new(unsigned long) (vg_replace_malloc.c:333)
==13534==    by 0x401DB5: __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > >, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long, void const*) (new_allocator.h:104)
==13534==    by 0x401CE1: std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > >, (__gnu_cxx::_Lock_policy)2> > >::allocate(std::allocator<std::_Sp_counted_ptr_inplace<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > >, (__gnu_cxx::_Lock_policy)2> >&, unsigned long) (alloc_traits.h:351)
==13534==    by 0x401B41: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > >, std::_Bind_simple<void (*(int))(int)> >(std::_Sp_make_shared_tag, std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >*, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > > const&, std::_Bind_simple<void (*(int))(int)>&&) (shared_ptr_base.h:499)
==13534==    by 0x401A8B: std::__shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > >, std::_Bind_simple<void (*(int))(int)> >(std::_Sp_make_shared_tag, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > > const&, std::_Bind_simple<void (*(int))(int)>&&) (shared_ptr_base.h:957)
==13534==    by 0x401A35: std::shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > >::shared_ptr<std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > >, std::_Bind_simple<void (*(int))(int)> >(std::_Sp_make_shared_tag, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > > const&, std::_Bind_simple<void (*(int))(int)>&&) (shared_ptr.h:316)
==13534==    by 0x4019A9: std::shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > > std::allocate_shared<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >, std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > >, std::_Bind_simple<void (*(int))(int)> >(std::allocator<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > > const&, std::_Bind_simple<void (*(int))(int)>&&) (shared_ptr.h:598)
==13534==    by 0x401847: std::shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > > std::make_shared<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >, std::_Bind_simple<void (*(int))(int)> >(std::_Bind_simple<void (*(int))(int)>&&) (shared_ptr.h:614)
==13534==    by 0x401621: std::shared_ptr<std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> > > std::thread::_M_make_routine<std::_Bind_simple<void (*(int))(int)> >(std::_Bind_simple<void (*(int))(int)>&&) (thread:193)
==13534==    by 0x4012AB: std::thread::thread<void (&)(int), int&>(void (&)(int), int&) (thread:135)
==13534==    by 0x400F42: forkFoo(int) (funwiththreadinsidefork.cpp:21)
==13534==    by 0x400FBD: main (funwiththreadinsidefork.cpp:36)
==13534== 
==13534== 288 bytes in 1 blocks are possibly lost in loss record 2 of 2
==13534==    at 0x4C2C9B4: calloc (vg_replace_malloc.c:711)
==13534==    by 0x4012E14: allocate_dtv (dl-tls.c:296)
==13534==    by 0x4012E14: _dl_allocate_tls (dl-tls.c:460)
==13534==    by 0x5359D92: allocate_stack (allocatestack.c:589)
==13534==    by 0x5359D92: pthread_create@@GLIBC_2.2.5 (pthread_create.c:500)
==13534==    by 0x4EE8CAE: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19)
==13534==    by 0x4012D1: std::thread::thread<void (&)(int), int&>(void (&)(int), int&) (thread:135)
==13534==    by 0x400F42: forkFoo(int) (funwiththreadinsidefork.cpp:21)
==13534==    by 0x400FBD: main (funwiththreadinsidefork.cpp:36)
==13534== 
==13534== LEAK SUMMARY:
==13534==    definitely lost: 0 bytes in 0 blocks
==13534==    indirectly lost: 0 bytes in 0 blocks
==13534==      possibly lost: 288 bytes in 1 blocks
==13534==    still reachable: 64 bytes in 1 blocks
==13534==         suppressed: 0 bytes in 0 blocks

Same error (warning?) message for every forked process when it dies.

The final output is about the main() process, PID 13533:

==13533== HEAP SUMMARY:
==13533==     in use at exit: 0 bytes in 0 blocks
==13533==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==13533== 
==13533== All heap blocks were freed -- no leaks are possible
==13533== 
==13533== For counts of detected and suppressed errors, rerun with: -v
==13533== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I don't know how to read all this valgrind output, and I don't know if my way of handling a detached thread is right. I'm using C++11, and since it has no garbage collector, I don't know if those possibly lost and still reachable bytes may degrade my program's performance. I fork() quite often (even though every forked process lives for only a few seconds), and every forked process spawns a detached thread that logs some stats. When the forked process dies, the thread dies with it, but I don't know if in the long run my program may slow down because of those bytes that valgrind shows me.

In your opinion, is my concern justified? Am I handling the death of detached threads inside forked processes correctly?

score 3 · Accepted Answer · answered Sep 05 '16 at 00:19

When you call std::thread::detach it does not decouple the thread from your process; it simply decouples the std::thread instance from the thread. Its stack is allocated from the process's memory and it still shares memory and resources with the process: when the process stops, it takes the thread out with it.

And it's not done gracefully: none of its destructors are called and its stack is not even deallocated (which is why you are seeing a leak).

#include <iostream>
#include <thread>
#include <mutex>
#include <chrono>
#include <atomic>

struct OnExit
{
    const char* id = "none";
    ~OnExit()
    {
        std::cout << "Exiting " << id << std::endl;
    }
};

thread_local OnExit onExit;

void threadFn1()
{
    onExit.id = "threadFn1";
    for (size_t i = 0; i < 100000; ++i) {
        std::cout << onExit.id << std::endl;
        std::this_thread::sleep_for(std::chrono::microseconds(50));
    }
}

std::atomic<bool> g_running { true };

void threadFn2()
{
    onExit.id = "threadFn2";
    while (g_running) {
        std::cout << onExit.id << std::endl;
        std::this_thread::sleep_for(std::chrono::microseconds(50));
    }
}

int main()
{
    std::thread t1(threadFn1);
    std::cout << "started t1\n";
    t1.detach();
    std::cout << "detached t1\n";

    std::thread t2(threadFn2);
    std::cout << "started t2\n";
    std::this_thread::sleep_for(std::chrono::microseconds(500));

    std::cout << "ending\n";
    g_running = false;
    t2.join();
}

Live demo: http://coliru.stacked-crooked.com/a/aa775a2960db09db

Output

started t1
detached t1
started t2
threadFn2
threadFn1
threadFn2
threadFn2
ending
Exiting threadFn2

Because we self-terminate threadFn2, it gets to call the OnExit dtor, but threadFn1 is terminated brutally.

It works even with detached thread: http://coliru.stacked-crooked.com/a/bfa2b9bd4ddda01b But, isn't the continuous checking of a flag something that keeps the CPU always busy, busy waiting? — elmazzun, Sep 05 '16 at 09:43
@elmazzun you only have to check it at the start of whatever loop your code is running, so it's just adding one instruction to your existing workload. — kfsone, Sep 05 '16 at 19:20

score 0 · Answer 2 · answered Sep 04 '16 at 17:23

"What happens to a detached thread inside a forked process when the process dies?" - the thread evaporates. The thread lives within the process. When the process dies, so do the threads within it.

Jesper Juhl

Shloim · Answer 3 · 2016-09-04T18:35:58.857

The thread dies. The leak is due to the fact that the thread did not complete its run and did not free its own resources. If the sleep in your main thread had been longer, there would not be any leak.

Edit after reading your comments regarding pcap:

Proper way: Don't detach. Give the main thread access to all the pcap handles. When the main thread wants to stop, it closes all pcaps and joins all threads. When the threads get a pcap error, they exit. If done right, you get no leaks and a nice, clean termination.

edited Sep 04 '16 at 18:35

answered Sep 04 '16 at 17:26


The problem is: the detached thread will never know when its loop ends or when its resources will be freed, since the loop is stopped brutally when the process that detached it dies. – elmazzun Sep 04 '16 at 17:42
Exactly. If it's ok by you that your thread is killed mid-process and if it doesn't do anything that might risk your system if it is stopped mid-run, then you're ok. But that's a bad design. The proper way would be to signal the thread somehow to stop (raise a flag) and wait for it to comply (by not detaching and calling join on it) – Shloim Sep 04 '16 at 18:31

score 0 · Answer 4 · answered Sep 04 '16 at 17:30

The process is the element that keeps your thread(s) around. When the process goes, the threads follow. The only question is whether your threads acquired system-wide resources that need to be released in any case.

GhostCat

Every thread starts a `pcap` session; in this session, no system-wide resources are modified. So, in my real program, the leak would be caused by all the `pcap` file descriptors and structures, some `sockaddr_in` and other structures. They appear innocent, but what would happen after a great number of detached threads are killed brutally? Lots of leaks and wasted memory? – elmazzun Sep 04 '16 at 17:41
Proper way: Don't detach. Give the main thread access to all the pcap handles. When the main thread wants to stop, it closes all pcaps and joins all threads. When the threads get a pcap error, they exit. If done right, you get no leaks and a nice, clean termination. – Shloim Sep 04 '16 at 18:34
@Shloim the one who destroys the handle should be the one who created it. And if it is the main thread, it should join first and then close the handles, not the other way around. – UmNyobe Sep 05 '16 at 08:49
Not necessarily. Let's say that thread1 created a socket, connected it and reads data off it. When the main thread wants to kill it, it should call `shutdown()` on thread1's socket. That will wake thread1 up and will let it end gracefully. Of course, thread1 should create an interface to "interrupt" it correctly. – Shloim Sep 05 '16 at 09:28
