Consider the following code. All of the following was compiled and executed in Visual Studio 2015 using the v140 C++ runtime:

#include <thread>
#include <atomic>
#include <sstream>

#include <Windows.h>

struct StaticThreadTest
{
    ~StaticThreadTest()
    {
        _terminationRequested = true;
        if (_thread.joinable())
            _thread.join();

        std::ostringstream ss;
        ss << "Thread finished gracefully: " << _threadFinishedGracefully << "\n";
        OutputDebugStringA(ss.str().c_str());
    }

    void startThread()
    {
        _thread = std::thread([&]() {
            while (!_terminationRequested);

            _threadFinishedGracefully = true;
        });
    }

    std::thread _thread;
    std::atomic<bool> _terminationRequested {false};
    std::atomic<bool> _threadFinishedGracefully {false};
};

static StaticThreadTest thread;

int main()
{
    thread.startThread();

    return 0;
} 

It works just as expected - "Thread finished gracefully: 1" is printed and the application exits.

However, if I move this code to a DLL, the behavior changes. The setup: create an empty DLL, export one function from it, place the StaticThreadTest object in the DLL's .cpp, call thread.startThread() from the exported function, and call that exported function from main.cpp. The code then sometimes prints "Thread finished gracefully: 0", but more often than not it simply hangs in thread.join().

Is this behavior documented? Is it a bug in the runtime or is this intended?

Of note: the same code compiled with the v120 toolset hangs 100% of the time, even in main.cpp (in the exe). It seems as if there was a bug in the v120 toolset that was fixed in v140 for .exe but not for .dll.

Violet Giraffe
  • *"Is this behavior documented?"* - Yes, sort of: [Dynamic-Link Library Best Practices](https://msdn.microsoft.com/en-us/library/windows/desktop/dn633971.aspx). – IInspectable Mar 01 '17 at 15:58
  • "The Old New Thing" has a small series on DLL shutdown. In summary: from the point when DllMain is called during DLL shutdown (which is before statics are destroyed), all bets are off, as many OS calls have been "electrified" and will fail in unexpected ways. Link: https://blogs.msdn.microsoft.com/oldnewthing/20100122-00/?p=15193 – Richard Critten Mar 01 '17 at 16:05
  • did you mark the class as DLLExport? – Richard Hodges Mar 01 '17 at 16:06
  • @RichardHodges: No class members or object references are accessed from outside the DLL. There is no need to export the class or class members. – IInspectable Mar 01 '17 at 16:30
  • @RichardHodges: nope, there's no need to. – Violet Giraffe Mar 01 '17 at 17:05
  • @RichardCritten: thanks for the link and the explanation. – Violet Giraffe Mar 01 '17 at 17:06
  • What about the constructor and destructor? Aren't they called from the main module? – Richard Hodges Mar 01 '17 at 17:42
  • See http://stackoverflow.com/questions/10915233/stdthreadjoin-hangs-if-called-after-main-exits-when-using-vs2012-rc, most likely the same - the dtor of `StaticThreadTest` is called when the dll is unloaded which might happen after `main` exits. – Rudolfs Bundulis Mar 01 '17 at 23:43
  • In most cases, the best way to avoid the whole sticky mess is to avoid static objects. Have an explicit initialization routine in your DLL and create your global objects there; have an explicit uninitialization routine as well to destroy them. – Harry Johnston Mar 02 '17 at 02:30
  • @HarryJohnston: It's obvious, but very inconvenient for the DLL clients. – Violet Giraffe Mar 02 '17 at 05:46
  • @RudolfsBundulis: It _should_ be called after `main` exits! I don't see why that must cause trouble with threads. – Violet Giraffe Mar 02 '17 at 05:47
  • @VioletGiraffe: see the bug in Microsoft Connect in the SO link - in the case a thread is joined after main exits you hit a bug in the runtime. As simple as that. – Rudolfs Bundulis Mar 02 '17 at 08:58
  • @RudolfsBundulis: It's curious that it occurs when the thread is hosted in a DLL but not in .exe as long as you're using v140 runtime. I wonder if they fixed it for .exe, or if it's just a coincidence. – Violet Giraffe Mar 02 '17 at 09:25
  • @VioletGiraffe well, imho, since they advise that these things should not be done inside `DllMain`, and `DllMain` is where the dtors of static objects are invoked from, I think it is kind of a no man's land. There are no fixed terms here: they say that no one should do this, they know there are issues, they may fix some (we can see it has improved since v120), but there are no guarantees. – Rudolfs Bundulis Mar 02 '17 at 12:39
  • @RudolfsBundulis: you're right, I understand. It's just annoying because this issue makes using certain libraries more cumbersome (the client needs to explicitly call a deinitializer function). – Violet Giraffe Mar 02 '17 at 12:56
  • @VioletGiraffe I wonder if this is still something else. I put together an example with a different way of termination (condition variable + mutex) and I can't get that to hang. I can put it on GitHub if you like, and you can check whether you can make it hang. – Rudolfs Bundulis Mar 02 '17 at 13:06
  • @RudolfsBundulis: sure, share a link to gist. Does my code hang on your machine? – Violet Giraffe Mar 02 '17 at 13:09
  • @VioletGiraffe first time I use Gist, hope I do this right :D here goes: https://gist.github.com/rubu/315725af4fa694cdb79f7517e6812451 . I'll try the atomic stuff in a sec. – Rudolfs Bundulis Mar 02 '17 at 13:14
  • @VioletGiraffe actually I can't get it to hang with your code. – Rudolfs Bundulis Mar 02 '17 at 13:25
  • @VioletGiraffe oh, then I have possibly solved this:D But I'll still try to make it hang with the atomic stuff, since now I'm already curious. – Rudolfs Bundulis Mar 02 '17 at 13:46
  • @VioletGiraffe, if the library *must* uninitialize cleanly on program exit (in most cases this is just a waste of time) can't the *client* have a static global to deal with the initialization and uninitialization? – Harry Johnston Mar 02 '17 at 21:20
  • I'm not sure how relevant this is to your scenario, but I discuss some possible workarounds [here](http://stackoverflow.com/a/35833437/886887). – Harry Johnston Mar 02 '17 at 21:22
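
Harry Johnston's suggestion in the comments above (explicit initialization and uninitialization routines, optionally driven by a guard object on the client side) might look roughly like the sketch below. It is only an illustration under assumptions: the names `Worker`, `lib_init`, `lib_shutdown`, and `LibGuard` are hypothetical and do not appear in the original code, and in a real DLL the two free functions would be the exports. The idea is that the thread is joined from an ordinary call made by the client, not from a static destructor that runs while the DLL is being unloaded.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical library-side worker; it would live in the DLL's .cpp file.
class Worker
{
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() { stop(); }

    // Signals the thread to finish and waits for it. Safe to call repeatedly.
    void stop()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_requested_ = true;
        }
        cv_.notify_all();
        if (thread_.joinable())
            thread_.join();
    }

    bool finishedGracefully() const { return finished_gracefully_; }

private:
    void run()
    {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate covers the case where stop() ran before wait().
        cv_.wait(lock, [this] { return stop_requested_; });
        finished_gracefully_ = true;
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    bool stop_requested_ = false;                  // protected by mutex_
    std::atomic<bool> finished_gracefully_{false};
    std::thread thread_;                           // last: starts after the other members exist
};

namespace { std::unique_ptr<Worker> g_worker; }    // created on demand, not a static Worker

// Hypothetical export: creates the worker and starts its thread.
void lib_init()
{
    assert(!g_worker && "lib_init() called twice without lib_shutdown()");
    g_worker.reset(new Worker);
}

// Hypothetical export: stops the worker; returns true if the thread ran to completion.
bool lib_shutdown()
{
    if (!g_worker)
        return false;
    g_worker->stop();
    const bool graceful = g_worker->finishedGracefully();
    g_worker.reset();
    return graceful;
}

// Client-side guard, so callers do not have to remember the shutdown call.
struct LibGuard
{
    LibGuard() { lib_init(); }
    ~LibGuard() { lib_shutdown(); }
};
```

A client could declare `LibGuard guard;` at the top of `main` (or as a static object in the executable, as suggested above), which keeps the inconvenience for DLL users to a single line.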

1 Answer

It seems that using a different synchronisation mechanism (std::mutex + std::condition_variable) instead of a busy loop eliminates the hang.

The following example demonstrates both mechanisms:

#include "thread-test-dll.h"

#include <thread>
#include <mutex>
#include <condition_variable>
#include <iostream>
//#define _BUSY_WAIT
#ifdef _BUSY_WAIT
#include <atomic>
#endif

class foo
{
public:
    foo() 
#ifdef _BUSY_WAIT
        : termination_requested_(false),
        terminated_gracefully_(false)
#endif
    {
        std::cout << "foo::foo()...\n";
    }

    void run()
    {
#define _USE_LAMBDA
#ifdef _USE_LAMBDA
        thread_ = std::thread([&]() {
            work();
        });
#else
        thread_ = std::thread(&foo::work, this);
#endif
    }

    ~foo()
    {
        std::cout << "foo::~foo()...\n";
        if (thread_.joinable())
        {
#ifdef _BUSY_WAIT
            termination_requested_ = true;
#else
            {
                // Set the flag under the mutex so the wake-up cannot be lost
                // if the worker has not reached wait() yet.
                std::lock_guard<std::mutex> lock(mutex_);
                stop_requested_ = true;
            }
            condition_variable_.notify_all();
#endif
            thread_.join();
            std::cout << "thread joined...\n";
        }
        else
        {
            std::cout << "thread was not joinable...\n";
        }
#ifdef _BUSY_WAIT
        std::cout << "terminated_gracefully_ = " << terminated_gracefully_ << "\n";
#endif
    }

    void work()
    {
#ifdef _BUSY_WAIT
        // Spin until the destructor requests termination.
        while (!termination_requested_);
        terminated_gracefully_ = true;
#else
        std::unique_lock<std::mutex> mutex_lock(mutex_);
        // The predicate guards against spurious and missed wake-ups.
        condition_variable_.wait(mutex_lock, [this] { return stop_requested_; });
#endif
        std::cout << "foo::work() terminating...\n";
    }

private:
#ifdef _BUSY_WAIT
    std::atomic<bool> termination_requested_;
    std::atomic<bool> terminated_gracefully_;
#endif
    std::thread thread_;
    std::mutex mutex_;
    std::condition_variable condition_variable_;
    bool stop_requested_ = false; // protected by mutex_
};

static foo instance;

void runThread()
{
    instance.run();
}

I'll still check whether I can make it hang with std::atomic; if that is the cause, it is a different issue from the one about calling std::thread::join() after main has exited.

Rudolfs Bundulis
  • Hm. I was, in fact, using a condition variable in the original code where I first encountered the issue. I only came up with the busy wait while working on an MRCE. – Violet Giraffe Mar 02 '17 at 14:05
  • @VioletGiraffe there is something weird in this: when I added more logging I noticed that the thread is not terminated in a graceful manner; `join()` just returns and that is it. So I'll keep debugging. – Rudolfs Bundulis Mar 02 '17 at 14:06
  • It shouldn't matter what mechanism you use to wait for the thread; if the destructor is being run inside DllMain, you hold a lock that the thread needs in order to exit. The only two relevant factors are whether the destructor is in fact being run inside DllMain and (if so) whether the thread had already exited by the time DllMain was started. (I'm not sure what the exact rules are for when a destructor is run inside DllMain.) – Harry Johnston Mar 02 '17 at 21:16
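
The point made in the last comment (the joining code holds a lock that the thread needs in order to exit) can be modelled portably. The sketch below is a simulation only, not Windows code: `g_loader_lock` is an ordinary `std::mutex` standing in for the loader lock, and `worker_exits_in_time` is a hypothetical helper name. A real `join()` in that situation would block forever; the sketch waits with a timeout so that it terminates and can report what happened.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Stand-in for the Windows loader lock (an assumption for illustration only).
std::mutex g_loader_lock;

// Starts a worker whose last step needs g_loader_lock, then waits up to
// `timeout` for it to finish. If hold_loader_lock is true, the caller keeps
// the lock while waiting, as code running inside DllMain would.
// Returns true if the worker finished within the timeout.
bool worker_exits_in_time(bool hold_loader_lock, std::chrono::milliseconds timeout)
{
    std::promise<void> exited;
    std::future<void> exited_future = exited.get_future();

    std::unique_lock<std::mutex> loader(g_loader_lock, std::defer_lock);
    if (hold_loader_lock)
        loader.lock();                                   // "we are inside DllMain"

    std::thread worker([&exited] {
        // Models thread exit, which must take the loader lock to notify DLLs.
        std::lock_guard<std::mutex> detach(g_loader_lock);
        exited.set_value();
    });

    const bool finished =
        exited_future.wait_for(timeout) == std::future_status::ready;

    if (loader.owns_lock())
        loader.unlock();                                 // "DllMain returns"
    worker.join();                                       // now always completes
    return finished;
}
```

Calling `worker_exits_in_time(true, ...)` reports that the worker cannot finish while the lock is held, regardless of whether the stop signal is an atomic flag or a condition variable, which matches the comment's argument that the waiting mechanism is not the deciding factor.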