4

I have the following class which is being compiled in Visual Studio 2015:

class MitigationPolicyChecker
{
public:
    static auto& getInstance()
    {
        static MitigationPolicyChecker instance;
        return instance;
    }

    MitigationPolicyChecker(MitigationPolicyChecker const&) = delete;
    MitigationPolicyChecker(MitigationPolicyChecker&&) = delete;
    MitigationPolicyChecker& operator=(MitigationPolicyChecker const&) = delete;
    MitigationPolicyChecker& operator=(MitigationPolicyChecker&&) = delete;

    bool IsDynamicCodeProhibited();

private:
    MitigationPolicyChecker() = default;
    ~MitigationPolicyChecker() = default;
    void GetDynamicCodePolicy();

    bool m_bDynamicCodeProhibited = false;
};

The class is designed to run from DllMain and checks for a structure in NtDll's loaded image to check if a certain mitigation policy is enabled.

In rare cases (<0.0001% of times the dll is loaded) this singleton is causing the app to crash. However, the crash isn't related to what the class is trying to do or to the well known list of things which aren't permitted in DllMain (as far as I can tell). It's related to the instantiation of the singleton its self.

The call stack in WinDbg looks like this:

00 OurDll!MitigationPolicyChecker::getInstance+0x1a
02 OurDll!Initialize+0x79
03 OurDll!DllMain+0x1c9
04 OurDll!dllmain_dispatch+0x74
05 ntdll!LdrpRunInitializeRoutines+0x1fe
06 ntdll!LdrGetProcedureAddressEx+0x2aa
07 ntdll!LdrpCorInitialize+0x1a1
08 ntdll!LdrpInitializeProcess+0x1816
09 ntdll! ?? ::FNODOBFM::`string'+0x22790
0a ntdll!LdrInitializeThunk+0xe

An access violation occurs on the line static MitigationPolicyChecker instance

And the crashing assembly line is the last line here, seemingly rdx and rcx were both null according to the exception record (Possibly a failure of thread local storage??):

0:000> uf OurDll!MitigationPolicyChecker::getInstance
OurDll!MitigationPolicyChecker::getInstance:
    8 000007fe`fbc38ed0 4883ec28           sub     rsp,28h
    9 000007fe`fbc38ed4 b804000000         mov     eax,4
    9 000007fe`fbc38ed9 8bc0               mov     eax,eax
    9 000007fe`fbc38edb 8b0dcf850b00       mov     ecx,dword ptr [OurDll!_tls_index (000007fe`fbcf14b0)]
    9 000007fe`fbc38ee1 65488b142558000000 mov   rdx,qword ptr gs:[58h]
    9 000007fe`fbc38eea 488b0cca           mov     rcx,qword ptr [rdx+rcx*8]

So what's going on here? The process has just one thread active at the time of the crash so I don't believe this is a concurrency issue. As far as I'm aware:

  1. The static object should be created on first access.
  2. Statics would ordinarily be initialized just prior to DllMain but in this case since it's a function level static it'll be created when its called and since it's VS2015 it'll be created as a "magic static".

Is there something I'm missing that makes this unsafe?

Benj
  • 31,668
  • 17
  • 78
  • 127
  • Thread safe initialisation of `static MitigationPolicyChecker instance;` will require a lock, and locks during DllMain are on the non-permitted list. Get an assembly listing of `getInstance` check how the initialisation is done. – Richard Critten Oct 03 '19 at 15:42
  • @RichardCritten Critical sections are permitted inside dllmain since they're available via kernel32.dll. https://learn.microsoft.com/en-us/windows/win32/dlls/dllmain "For example, DllMain can create synchronization objects such as critical sections and mutexes, and use TLS" – Benj Oct 03 '19 at 15:45
  • @RichardCritten The assembly is in the post actually and the way VS2015 does it is discussed here: https://www.nullptr.me/2017/06/20/cpp11-threadsafe-statics/. Interestingly my code crashes just before the compiler generated _Init_thread_header function gets called. That would ordinarily acquire a critical section. – Benj Oct 03 '19 at 15:57
  • The crashing line, which is trying to dereference a pointer calculated from rdx + rcx*8 may be crashing because of the TLS retrieval just above. That should have given rdx a value. – Benj Oct 03 '19 at 16:01
  • *"class is designed to run from DllMain"* - it is broken by design. You are also using meyer's singleton which is broken by design as well. – user7860670 Oct 03 '19 at 16:07
  • @VTT Can you elaborate? I wasn't aware there was any prohibition on using classes in dllmain. Also, I'd thought this style of singleton was fine post c++11 ? – Benj Oct 03 '19 at 16:10
  • Relying on implicit initialization / finalization actions during dll loading is a bad design by itself. Performing something reliable from DLL main is a non-trivial task that may require nasty trickery. Meyer's singleton is broken because it implies implicit lazy initialization and lack of control over object lifetime. However it seems to be quite popular and a lot of people whining about singletons being bad are actually just suffering from problems caused by such an implementation.Have you considered moving these checks into some dedicated public API function to be called by library clients? – user7860670 Oct 03 '19 at 16:28
  • @VTT There's numerous ways I could refactor this but since it only occurs rarely on customer sites I'd like to understand why it's breaking before I fix it. As I understand it, the lifetime of this object should be well defined because the CRT guarantees static object destruction when the dll is unloaded. Initialization should also be guaranteed in C++11 although the compiler implementation seems to require TLS which troubles me slightly. I'm not sure why DllMain would make a difference in this case. Statics are well cared for by the CRT. – Benj Oct 03 '19 at 16:38
  • It is wrong to rely on CRT at DllMain on the first place. Dynamic initialization and finalization stages are always a random rollercoaster. Runtime is required to destroy previously initialized objects with static storage duration, but programmer has no control over these action. Since this question lacks [mcve] suggesting something other than "Don't do things in DLLmain" is problematic. – user7860670 Oct 03 '19 at 17:05
  • 1
    Diagnosing a one-in-a-million crash reason is quite difficult. Some background is essential, I *guess* that you are getting minidumps sent to you from failures out in the wild from machines you know little about. You are relying on TLS state, it isn't just yours. Takes but an evil injected DLL or lousy anti-malware to invalidate all assumptions. The debugger's modules list can provide a clue, odds you can anything about it however are extremely low. – Hans Passant Oct 03 '19 at 17:44
  • @HansPassant Yes that's exactly the situation. I have a customer who says they can reliably produce these dumps but we haven't seen them elsewhere. The idea of a compatibility issue is interesting actually. We come across them all the time but it hadn't occurred to me that someone else might be damaging the TLS in some way. That might explain why it's just this one customer. – Benj Oct 03 '19 at 17:47
  • @VTT I can't find much to agree with you on t'interwebs. Microsoft's own best practices guide specifically calls out static initialization as a valid thing to do within Dll initialization. https://learn.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-best-practices – Benj Oct 03 '19 at 17:48
  • Static initialization actually happens prior to dll main invocation. Initialization of statics mentioned in the linked article aka dynamic initialization indeed is performed from dll main. That's what crt is performing prior to calling user-defined dll main. But relying on dynamic initialization is never a good idea and its introduction into C++ was a bad design decision. Now that I've mentioned it, it is somewhat suspicious that `_DllMainCRTStartup` (default name of real dll main) is not visible in the call stack. ` – user7860670 Oct 03 '19 at 18:40
  • @VTT Interesting. On the point about the entry point, I'm confused. I've seen _DllMainCRTStartup in the past but I've just built a completely default empty DLL project from the VS 2015 wizard and injected into notepad using process hacker. Breaking in DllMain shows an identical callstack to the one in this post. LdrpCallInitRoutine calls dllmain_dispatch which calls DllMain. _DllMainCRTStartup is in the export table though. Suspect it's called by dllmain_dispatch ? – Benj Oct 03 '19 at 19:58
  • @VTT - In the same scenario, breaking in _DllMainCRTStartup reveals that it's also called by LdrpCallInitRoutine but is no longer in the call stack by the time DllMain is called. Think it might be too late to figure out why that is, but I no longer think this call stack is suspicious. – Benj Oct 03 '19 at 20:13
  • of course too late but only now view this. *LdrpCorInitialize* - .NET process. internally it load *mscoree.dll* and it dependencies - *kernel32+kernelbase*. probably you inject dll from driver on *kernel32.dll* load. *LdrGetProcedureAddress* (for *_CorExeMain*) as result - *LdrpRunInitializeRoutines* called. and your *dllmain_dispatch* then thread-safe initialization of statics..used tls (if (Once > _Init_thread_epoch). but *LdrpInitializeTls* yet not called in process (if you look - it called just after *LdrpCorInitialize*) - so not use static objects inside function in such early loaded dll – RbMm Jul 08 '21 at 17:22

0 Answers0