How to isolate dependencies against transitive dependency resolution?

Question

I am working on an application which provides a plugin interface for customers to develop their logic inside the app. The plugins are then loaded dynamically at runtime. We provide a clean C interface for plugins to make things as portable as possible. However, we recently discovered a problem with transitive dependencies: when a plugin links against its own dependency, which happens to be a dependency of the app as well, only the version shipped with the app is loaded.

So in the following configuration, lib_b.dll is the plugin, which uses lib_a.dll as a private dependency. However, because the executable also links against a different version of the same library, the plugin's version (v2) is never loaded; the executable's copy (v1) is used instead.

    +----------------------+              +-------------------------------+
    |                      | LoadLibrary  |                               |
    | Executable.exe ------+--------------+--> plugins                    |
    |  |                   |              |     |                         |
    |  +--> lib_a.dll (v1) |              |     +--> lib_b.dll            |
    |                      |              |           |                   |
    +----------------------+              |           +--> lib_a.dll (v2) |
                                          |                               |
                                          +-------------------------------+

I am looking for a solution to isolate the address space and symbols of dependencies from my application. The idea is to make the executable care only about the symbols it loads from plugins at runtime, not what the plugins use internally.

We load the plugin like this:

HMODULE h = ::LoadLibraryExA(".../plugins/library_b.dll", 
    NULL, LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR);

And the only entry point of interest in the plugin is reached like this:

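// GetProcAddress returns FARPROC, so cast to the expected function pointer type
callback_type do_the_thing_in_b = reinterpret_cast<callback_type>(GetProcAddress(h, "do_the_thing_in_b"));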
int answer = do_the_thing_in_b(3.14, 42);

Updates

The application can be recompiled or changed at any time, but at compile time we have no information about the plugins. The idea is that clients can create their own plugin and put it there.

We also can not modify the plugins. We can scan that directory and do stuff, but that's the extent of changes we can do. We can't recompile them or decide on their dependency structure.

The plugins link against our executable via an interface library and call functions directly from the executable.

https://learn.microsoft.com/en-us/windows/win32/sbscs/about-isolated-applications-and-side-by-side-assemblies — Hans Passant, Jul 28 '23 at 17:22
@HansPassant That's a solution to a different problem. I need to load two different DLL's into the same application — sorush-r, Jul 30 '23 at 15:20
Would statically linking ``lib_a.dll (v2)`` into ``lib_b.dll`` work for you? This way, there wouldn't need to be two ``lib_a.dll`` files, as ``lib_b.dll`` would have the various functions from ``lib_a.dll (v2)`` directly present into itself — RedStoneMatt, Jul 30 '23 at 23:48
Have you seen this answer https://stackoverflow.com/questions/65359706/using-multiple-versions-of-same-dll — linuxfever, Jul 31 '23 at 05:30
@RedStoneMatt That would work but unfortunately we receive plugins from third parties and some of those companies don't even exist anymore. Therefore no source code — sorush-r, Jul 31 '23 at 07:01
How about running a service/separate app that loads plugins and then IPC with it? — Daniil Vlasenko, Jul 31 '23 at 08:34
(1/2) @sorush-r Do you happen to know what makes Windows not load ``lib_a.dll (v2)`` when ``lib_a.dll (v1)`` is already loaded? Is it because both files have the same name? Is it because the symbols between the two match? If it's the former, I'd suggest renaming ``lib_a.dll (v2)`` and editing ``lib_b.dll`` to make it refer to the new name using a PE or Hex editor. If it's the latter, I'd suggest changing the symbols that ``lib_b.dll`` imports from ``lib_a.dll (v2)``, and completely stripping out the other symbols from ``lib_a.dll (v2)``, using a PE or Hex editor as well. — RedStoneMatt, Jul 31 '23 at 09:51
(2/2) In both case, it's very hacky and not something I'd recommend doing in the long term at all. But combined with my previous comment about statically linking additional DLLs into your plugin's DLL, then you could do exactly that for future plugins, and forcefully convert the old ones you don't have the source code of using the hacky technique I described. I get that this might be very annoying, but unfortunately it's likely a Windows API issue, which you can't really do much about. — RedStoneMatt, Jul 31 '23 at 09:51
@DaniilVlasenko The plugin code links against our executable, and calls functions... — sorush-r, Jul 31 '23 at 12:58
the idea of RedStoneMatt can be simplified, rename lib_a.dll(v1) and load the renamed lib into your app to check whether this will work or not. Btw, even differences in path to libs should work (if you load v1 dll dynamically ofk) — Daniil Vlasenko, Jul 31 '23 at 14:19
and modifying the plugins so that they also load their dependencies explicitly at runtime with your library loading API targeting the desired file? — Oersted, Aug 01 '23 at 15:31
@sorush-r jan, I think it's better to load the plugins in separate processes or as separate modules within your application, each with its own isolated address space! If you want, please let me know and I will explain it to you with an example. — Freeman, Aug 02 '23 at 13:55
About interface library: If I understand correctly, all plugins are linked against your interface library. Does that library also define complete API for plugins to use while "*calling functions directly from the executable*"? — gordan.sikic, Aug 03 '23 at 13:27
@gordan.sikic No. The plugins link against the executable itself, using an interface library. (The concrete callbacks are in .exe file, the definitions in some .lib file) — sorush-r, Aug 04 '23 at 08:25
Maybe not the answer you're looking for, but you could use COM (https://learn.microsoft.com/en-us/windows/win32/com/component-object-model--com--portal) which is used everywhere in Windows. It's exactly made for this type of purpose. — Simon Mourier, Aug 05 '23 at 16:40
@SimonMourier Using COM changes nothing about DLL dependencies. Whether we load the DLL ourselves and call an export, or use CoCreateInstance, the problem is the same if two or more modules try to use different versions of the same DLL. We don't even know which DLLs these are; I can only assume that different modules try to use different CRT versions — RbMm, Aug 05 '23 at 16:48
@RbMm - obviously it depends on how you build and architect your DLLs. It would require some changes to existing ways of doing this, but with COM properly done, you don't need to statically bind to any DLL. The plugin (say IPlugin) can be passed, say, an IHost interface with the functions needed for the plugin to do its job. Extensions can be accomplished by IHost2, IHost3, etc. — Simon Mourier, Aug 05 '23 at 17:27
@SimonMourier *you don't need to statically bind to any dll* - there is some confusion here. The issue is not that the application is statically linked to the plugins (it is not), but that the plugins themselves have some kind of static (or dynamic) imports, and changing the interface to COM won't change that. Each COM module will still have its own dependencies. There can be two COM modules that both import from msvcrt.dll, but from different versions of it, and then it becomes necessary to load both versions of the DLL into the process — RbMm, Aug 05 '23 at 18:01

score 0 · Answer 1 · answered Aug 04 '23 at 21:53

In summary,

You have an application, with a defined interface for plug-ins
You define and are in control of that interface
Plug-ins depend on being able to call some functions that are inside the application
Plug-ins and the application can have conflicting views of which is the proper version of a dependency, and the application's view wins
Plug-ins do not call functions in other plug-ins

I think you've hit the nail on the head with "isolate address space", much as Daniil Vlasenko suggested in the comments. This is also what Firefox is doing now with tabs; each one is a separate process.

If you are in control of the interface, you are in a good position to be able to separate the main executable into one process, and have a thin shim process that can load a single plug-in's DLL (and its dependencies). You'd start one of these shims for each of the plug-ins. The processes will communicate via some sort of in-machine RPC. This is all assuming that data structures are not shared between the executable and plug-ins, other than via function calls and the returns.

What might that look like?

RPC (e.g. GPB-RPC) doesn't feel quite right, because you'd need an RPC going one way for the executable to call the plug-in, but there needs to be an allowance for how the plug-in can in turn call a function in the executable.

I would solve this using something like ZeroMQ as the transport between the main executable and plug-in, and something like GPB to formulate the messages that 1) indicate what function is to be called, and 2) what the parameters for it are. The sequence would go something like:

Per plug-in, have 2 ZMQ REQ/REP socket pairs, one pair from executable to the plug-in shim process (call forward), the other pair running in the opposite direction (call back). Each process ends up with a REQ and REP socket.
Formulate a message in the main executable to be sent to the plug-in shim to call a provided function
Send that message through the call forward REQ socket from the main executable, and then go on to poll the call forward REQ socket and the call back REP socket. This allows the main executable to either receive a return value, or receive a request for one of its own functions to be executed
The shim process receives that on its call forward REP socket, decodes it, and calls the relevant function in the plug-in DLL
The plug-in function may want to call a function in the main executable. You will have to have implemented stub versions of all such functions in the shim process. The stub function in the thin shim formulates a message containing the call parameters, and sends that over its own call back REQ socket back to the main executable, and then blocks on a zmq_recv() on the call back REQ socket.
The main executable, which has in the meantime been polling its call forward REQ and call back REP sockets, gets told by ZMQ that the call back REP socket has a message on it. It reads this message, executes the specified function, gathers the result.
The result is sent back through the call back REP socket, back to the shim process. The main executable returns to polling both sockets.
The shim process, blocked on zmq_recv, receives the results message, and returns the result back to the calling function in the DLL plugin.
The plug-in function itself finally completes, and returns a result to the shim process
This final result is packaged up into a message that is sent back to the main executable via the call forward REP socket in the shim
The main executable this time gets told there's a message ready on its call forward REQ socket - the reply back from the plug-in.
This message is read, and the data returned to whatever it was in the main executable that wanted to call the plug-in.
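
The sequence above can be sketched in a single process to show the control flow. This is only an illustration under loud assumptions: two threads stand in for the main executable and the shim process, a small blocking `Channel` class stands in for the ZeroMQ sockets (the two sockets the executable polls are merged into one inbound queue), and `Message`, `host_double`, `shim_main` and `call_plugin` are hypothetical names, not part of any library. A real implementation would serialise the messages and use real sockets.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical message kinds; a real system would serialise these (e.g. with GPB).
enum class Kind { Call, Callback, CallbackResult, Result };

struct Message {
    Kind kind;
    int value;
};

// Minimal blocking queue standing in for one direction of the transport.
class Channel {
public:
    void send(Message m) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(m);
        }
        ready_.notify_one();
    }
    Message receive() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        Message m = queue_.front();
        queue_.pop();
        return m;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Message> queue_;
};

// A function the main executable offers to plug-ins (hypothetical).
int host_double(int x) { return 2 * x; }

// "Shim" side: receives the call, calls back into the host once, then replies.
void shim_main(Channel& to_shim, Channel& to_host) {
    Message call = to_shim.receive();             // call forward arrives
    to_host.send({Kind::Callback, call.value});   // stub forwards the call back
    Message reply = to_shim.receive();            // block until the host answers
    to_host.send({Kind::Result, reply.value + 1}); // plug-in's final result
}

// Host side: send the call, then service callbacks until the result arrives.
int call_plugin(Channel& to_shim, Channel& to_host, int argument) {
    to_shim.send({Kind::Call, argument});
    for (;;) {
        Message m = to_host.receive();
        if (m.kind == Kind::Result) {
            return m.value;
        }
        if (m.kind == Kind::Callback) {
            to_shim.send({Kind::CallbackResult, host_double(m.value)});
        }
    }
}

// Wires the two sides together and returns what the "plug-in" computed.
int run_demo(int argument) {
    Channel to_shim, to_host;
    std::thread shim(shim_main, std::ref(to_shim), std::ref(to_host));
    int result = call_plugin(to_shim, to_host, argument);
    shim.join();
    return result;
}
```

Calling `run_demo(20)` sends 20 to the shim, which calls back into the host to double it and then adds one before replying. The loop in `call_plugin` is the part that matters: the host keeps answering callbacks until the final result shows up.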

What this allows is for

The main executable to call a function in a plug-in
For that called function to call functions provided by the main executable any number of times (from 0 to lots)
For the plug-in function result to be returned to the main executable
For the plug-in DLL and main executable to exist in different processes, with their own preferred dependencies loaded.

ZMQ is going to be useful, because it sounds like you've not got a straightforward client/server or RPC relationship between the main executable and plug-ins; they're a bit more interdependent. ZMQ lends itself to Actor Model style designs, which allows for the kind of pattern described above.

A freebie with ZeroMQ would be that, having mastered this, the plug-in could just as easily be on another machine altogether, or a combination (i.e. some local, some remote, some running on Linux on the other side of the world, etc).

Obviously, having two separate processes won't help if there are shared data structures, though I suppose that this could be overcome by placing them in shared memory. But then all plug-in shim processes would have to be on the same machine as the main executable.

If things are a bit more multi-threaded, I don't think the pattern outlined above really changes much. You might want fields in messages to indicate what's going on. It might get a bit more difficult if the plug-ins are manipulating semaphores, mutexes, etc, created by the main executable.

RbMm · Answer 2 · 2023-08-05T13:04:59.500

While loading a DLL, Windows always calls the function RtlDosApplyFileIsolationRedirection_Ustr. It is exported, so it can easily be hooked. With this API we can redirect (replace) the DLL name.

So first, hook this API:

#ifdef _X86_

#pragma warning(disable: 4483) // Allow use of __identifier
#define __imp_RtlDosApplyFileIsolationRedirection_Ustr __identifier("_imp__RtlDosApplyFileIsolationRedirection_Ustr@36")
#endif

EXTERN_C extern PVOID __imp_RtlDosApplyFileIsolationRedirection_Ustr;

NTSTATUS
NTAPI
hook_RtlDosApplyFileIsolationRedirection_Ustr(_In_ ULONG Flags,
                                              _In_ PUNICODE_STRING OriginalName,
                                              _In_ PUNICODE_STRING Extension,
                                              _Out_opt_ PUNICODE_STRING StaticString,
                                              _Out_opt_ PUNICODE_STRING DynamicString,
                                              _Out_opt_ PUNICODE_STRING *NewName,
                                              _Out_opt_ PULONG NewFlags,
                                              _Out_opt_ PSIZE_T FilePathLength,
                                              _Out_opt_ PSIZE_T MaxPathSize);

ULONG dwError = DetourTransactionBegin();
if (NOERROR == dwError)
{
    //++ optional
    DetourThread* pti = 0;
    SuspendThreads(&pti);
    //--optional

    dwError = DetourAttach(&__imp_RtlDosApplyFileIsolationRedirection_Ustr,
         hook_RtlDosApplyFileIsolationRedirection_Ustr);

    dwError = NOERROR != dwError ? DetourTransactionAbort() : DetourTransactionCommit();

    //++optional
    Free(pti);
    //--optional
}

An implementation of SuspendThreads (optional) could look like this:

struct DetourThread
{
    DetourThread *      pNext;
    HANDLE              hThread;
};

void Free(_In_ DetourThread* next)
{
    if (DetourThread* pti = next)
    {
        do 
        {
            next = pti->pNext;

            NtClose(pti->hThread);

            delete pti;

        } while (pti = next);
    }
}

NTSTATUS SuspendThreads(_Out_ DetourThread** ppti)
{
    DetourThread* pti = 0;
    HANDLE ThreadHandle = 0, hThread;
    NTSTATUS status;
    BOOL bClose = FALSE;

    // widen the 32-bit thread id before casting, to avoid a size-mismatch warning on x64
    HANDLE UniqueThread = (HANDLE)(ULONG_PTR)GetCurrentThreadId();

loop:
    status = NtGetNextThread(NtCurrentProcess(), ThreadHandle, 
        THREAD_QUERY_LIMITED_INFORMATION|THREAD_SUSPEND_RESUME|THREAD_GET_CONTEXT|THREAD_SET_CONTEXT, 
        0, 0, &hThread);

    if (bClose)
    {
        NtClose(ThreadHandle);
        bClose = FALSE;
    }

    if (0 <= status)
    {
        ThreadHandle = hThread;

        THREAD_BASIC_INFORMATION tbi;

        if (0 <= (status = NtQueryInformationThread(hThread, ThreadBasicInformation, &tbi, sizeof(tbi), 0)))
        {
            if (tbi.ClientId.UniqueThread == UniqueThread)
            {
                bClose = TRUE;
                goto loop;
            }

            if (NOERROR == (status = DetourUpdateThread(hThread)))
            {
                status = STATUS_NO_MEMORY;

                if (DetourThread* next = new DetourThread)
                {
                    next->hThread = hThread;
                    next->pNext = pti;
                    pti = next;
                    goto loop;
                }

                ResumeThread(hThread);
            }
        }

        if (status == STATUS_THREAD_IS_TERMINATING)
        {
            bClose = TRUE;
            goto loop;
        }

        NtClose(hThread);
    }

    switch (status)
    {
    case STATUS_NO_MORE_ENTRIES:
    case STATUS_SUCCESS:
        *ppti = pti;
        return STATUS_SUCCESS;
    }

    Free(pti);

    *ppti = 0;
    return status;
}

DetourUpdateThread suspends the thread and saves its handle in its DetourThread list, and the thread is resumed in DetourTransactionAbort or DetourTransactionCommit. But Detours does not close the saved hThread. As a result, we need to maintain an additional list of threads ourselves in order to close their handles (Free).

OK, so we hook RtlDosApplyFileIsolationRedirection_Ustr. Now we need to implement hook_RtlDosApplyFileIsolationRedirection_Ustr.

Suppose we have the following API:

// set path to some plugin (A) folder. 
// pszPluginPath - relative path. like plugin/A/

BOOL SetPluginPath(_In_ PCWSTR pszPluginPath);

// return full path to current plugin (A) folder - some like */plugin/A/

PCWSTR AcquirePluginPath();

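// release the lock taken by AcquirePluginPath
void ReleasePluginPath();

// clear the current plugin folder (undo SetPluginPath)
void RemovePluginPath();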

// called once on start
BOOL InitPluginPath();

// called once on exit
void FreePluginPath();

Wrap the plugin load in the following code:

    if (SetPluginPath(L"plugin/A/"))
    {
        LoadLibraryW(L"some-plugin.dll");
        RemovePluginPath();
    }

Assume that ./plugin/ is a folder inside the application folder (where the exe is located), and that it contains a subfolder for every plugin (A, B, ...):

/plugin
   /A
   /B

So with this code we try to load ./plugin/A/some-plugin.dll.

And if some-plugin.dll has a static dependency on lib-Y.dll (or calls LoadLibrary inside its DLL entry point, which is actually OK here), and the file ./plugin/A/lib-Y.dll exists, we load exactly ./plugin/A/lib-Y.dll, even if ./lib-Y.dll and/or ./plugin/B/lib-Y.dll is already loaded in the process.

NTSTATUS
NTAPI
hook_RtlDosApplyFileIsolationRedirection_Ustr(_In_ ULONG Flags,
                                              _In_ PUNICODE_STRING OriginalName,
                                              _In_ PUNICODE_STRING Extension,
                                              _Out_opt_ PUNICODE_STRING StaticString,
                                              _Out_opt_ PUNICODE_STRING DynamicString,
                                              _Out_opt_ PUNICODE_STRING *NewName,
                                              _Out_opt_ PULONG NewFlags,
                                              _Out_opt_ PSIZE_T FilePathLength,
                                              _Out_opt_ PSIZE_T MaxPathSize)
{
    if (DynamicString)
    {
        BOOLEAN fOk = FALSE;
        WCHAR lpLibFileName[MAX_PATH], *lpFilePart = 0;

        if (PCWSTR pszPluginPath = AcquirePluginPath())
        {
            if (!wcscpy_s(lpLibFileName, _countof(lpLibFileName), pszPluginPath))
            {
                size_t s = wcslen(lpLibFileName);

                lpFilePart = lpLibFileName + s;

                int len = swprintf_s(lpFilePart, _countof(lpLibFileName) - s, L"%wZ", OriginalName);

                if (0 < len)
                {
                    static const UNICODE_STRING dot = RTL_CONSTANT_STRING(L".");
                    USHORT u;
                    if (0 > RtlFindCharInUnicodeString(0, OriginalName, &dot, &u))
                    {
                        swprintf_s(lpFilePart + len, _countof(lpLibFileName) - s - len, L"%wZ", Extension);
                    }       

                    fOk = RtlDoesFileExists_U(lpLibFileName);
                }
            }
        }

        ReleasePluginPath();

        if (fOk)
        {
            if (RtlCreateUnicodeString(DynamicString, lpLibFileName))
            {
                if (NewName)
                {
                    *NewName = DynamicString;
                }

                if (NewFlags)
                {
                    *NewFlags = 0;
                }

                if (FilePathLength)
                {
                    *FilePathLength = lpFilePart - lpLibFileName;
                }

                if (MaxPathSize)
                {
                    *MaxPathSize = _countof(lpLibFileName);
                }

                return STATUS_SUCCESS;
            }

            return STATUS_NO_MEMORY;
        }
    }

    return RtlDosApplyFileIsolationRedirection_Ustr(Flags,
        OriginalName,
        Extension,
        StaticString,
        DynamicString,
        NewName,
        NewFlags,
        FilePathLength,
        MaxPathSize);
}

And finally, the implementation of the plugin path API:

SRWLOCK g_lock = RTL_SRWLOCK_INIT;
ULONG g_cchMaxPlugin = 0;
PWSTR g_pszPluginPath = 0, g_pszPluginRelativePath = 0, g_pszPluginName = 0;

void FreePluginPath()
{
    if (g_pszPluginPath)
    {
        if (g_pszPluginName)
        {
            __debugbreak();
        }

        delete [] g_pszPluginPath;
    }
}

BOOL InitPluginPath()
{
    enum { buf_size = MAXSHORT + 1 };

    if (PWSTR psz = new(nothrow) WCHAR[buf_size])
    {
        if (ULONG cch = GetModuleFileName(0, psz, buf_size))
        {
            if (NOERROR == GetLastError())
            {
                PWSTR FileName = psz + cch;
                g_cchMaxPlugin = buf_size - cch;

                do 
                {
                    switch (*--FileName)
                    {
                    case '\\':
                    case '/':
                        g_pszPluginPath = psz;
                        g_pszPluginRelativePath = FileName + 1;
                        return TRUE;

                    }
                } while (g_cchMaxPlugin++, --cch);
            }
        }
        delete [] psz;
    }

    return FALSE;
}

BOOL SetPluginPath(_In_ PCWSTR pszPluginPath)
{
    SIZE_T cch = wcslen(pszPluginPath);

    PWSTR pszPluginName = g_pszPluginRelativePath + cch;

    if (++cch > g_cchMaxPlugin)
    {
        return FALSE;
    }

    AcquireSRWLockExclusive(&g_lock);

    memcpy(g_pszPluginRelativePath, pszPluginPath, cch * sizeof(WCHAR));

    g_pszPluginName = pszPluginName;

    ReleaseSRWLockExclusive(&g_lock);

    return TRUE;
}

void ReleasePluginPath()
{
    ReleaseSRWLockShared(&g_lock);
}

PCWSTR AcquirePluginPath()
{
    AcquireSRWLockShared(&g_lock);
    return g_pszPluginName ? g_pszPluginPath : 0;
}

void RemovePluginPath()
{
    AcquireSRWLockExclusive(&g_lock);
    g_pszPluginName = 0;
    ReleaseSRWLockExclusive(&g_lock);
}

On start we must call InitPluginPath(), and on exit FreePluginPath().
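
Since the redirect should be active only while one plug-in is being loaded, a small scope guard can help make sure the path is always cleared again, including on early returns. This is a sketch of that idea, not part of the code above: `PluginPathScope` is a hypothetical name, and `SetPluginPathStub` / `RemovePluginPathStub` are stand-ins for SetPluginPath / RemovePluginPath so that the example compiles on its own.

```cpp
#include <string>

// Stand-ins (assumptions) for SetPluginPath / RemovePluginPath; they only
// record the active path so the guard's behaviour can be observed.
static std::wstring g_active_plugin_path;

bool SetPluginPathStub(const wchar_t* path) {
    if (!path || !*path) {
        return false;  // reject a missing or empty path
    }
    g_active_plugin_path = path;
    return true;
}

void RemovePluginPathStub() { g_active_plugin_path.clear(); }

// Scope guard: the plug-in path is set only while the guard is alive.
class PluginPathScope {
public:
    explicit PluginPathScope(const wchar_t* path)
        : active_(SetPluginPathStub(path)) {}

    ~PluginPathScope() {
        if (active_) {
            RemovePluginPathStub();  // always undo a successful set
        }
    }

    PluginPathScope(const PluginPathScope&) = delete;
    PluginPathScope& operator=(const PluginPathScope&) = delete;

    // True when the path was set and loading may proceed.
    explicit operator bool() const { return active_; }

private:
    bool active_;
};
```

In the real application the guard would call SetPluginPath and RemovePluginPath directly, and the LoadLibraryW call would go inside an `if (PluginPathScope scope{L"plugin/A/"}) { ... }` block.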
