I have been working on DLL injection recently and did some research on Google. From what I found, CreateRemoteThread is a good way to do it.

ASLR (address space layout randomization, introduced in Windows Vista) randomizes the base address of kernel32.dll, but that does not matter here: within one boot session, kernel32.dll is loaded at the same base address in every process of the same bitness, and that address only changes when the operating system restarts.

So code like this is normally safe:

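#include <windows.h>
#include <string.h>

void launchAndInject(const char* app, const char* dll)
{
    STARTUPINFOA si = {0};
    si.cb = sizeof(si);
    PROCESS_INFORMATION pi = {0};

    if (!CreateProcessA(app, NULL, NULL, NULL, FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi)) {
        return;
    }

    // kernel32.dll has the same base in this process and in the child,
    // so the local address of LoadLibraryA is also valid in the child.
    FARPROC loadLibrary = GetProcAddress(GetModuleHandleA("kernel32.dll"), "LoadLibraryA");

    // Copy the DLL path (including the terminating NUL) into the child.
    SIZE_T len = ::strlen(dll) + 1;
    LPVOID addr = VirtualAllocEx(pi.hProcess, NULL, len, MEM_RESERVE|MEM_COMMIT, PAGE_READWRITE);

    if (loadLibrary != NULL && addr != NULL && WriteProcessMemory(pi.hProcess, addr, dll, len, NULL)) {
        HANDLE th = CreateRemoteThread(pi.hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)loadLibrary, addr, 0, NULL);
        if (th != NULL) {
            WaitForSingleObject(th, INFINITE);
            DWORD ret = 0;  // exit code of the remote thread = return value of LoadLibraryA
            GetExitCodeThread(th, &ret);
            CloseHandle(th);
        }
    }

    // Clean up on every path so a failed injection does not leave
    // a suspended child, leaked handles, or a leaked remote buffer.
    if (addr != NULL) {
        VirtualFreeEx(pi.hProcess, addr, 0, MEM_RELEASE);
    }
    ResumeThread(pi.hThread);
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
}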

The exit code of the injected thread is the value returned by LoadLibrary, so ret is the HMODULE of the loaded DLL (in the child process, of course). It works like magic. So far, so good.

I have read many DLL injection projects that use DllMain to do a lot of work, such as creating threads or hooking APIs. They have to be very careful when doing these things: according to Microsoft's document "Best Practices for Creating DLLs", actions such as creating a thread might cause a deadlock, and "The ideal DllMain would be just an empty stub". So I don't think this is a very good approach.

So getting the HMODULE of the loaded DLL is important. With this handle, you can use CreateRemoteThread to call an exported function of the injected DLL and do whatever you want, without worrying about the loader lock.
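One common way to turn the remote HMODULE into a callable address is sketched below. It assumes the injector also loads the same DLL file itself (in a process of the same bitness), so that an export's offset from the module base is identical in both processes. The helper name remoteExportAddress is hypothetical, and this is only an illustration of the arithmetic, not necessarily what any particular project does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: translate the address of an exported function from the
// injector's own mapping of the DLL to the target process's mapping.
//   localBase  - HMODULE of the DLL as loaded in the injector
//   localProc  - GetProcAddress result for the export in the injector
//   remoteBase - HMODULE of the same DLL in the target process
std::uint64_t remoteExportAddress(std::uint64_t localBase,
                                  std::uint64_t localProc,
                                  std::uint64_t remoteBase)
{
    // The offset of the export inside the image is the same in both mappings.
    return remoteBase + (localProc - localBase);
}
```

The result can then be passed to CreateRemoteThread as the start routine, provided the export has a thread-procedure-compatible signature.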

Unfortunately, the code above only works with 32-bit processes. The thread's exit code is a DWORD, a 32-bit unsigned integer, but HMODULE is a pointer and can be 64 bits wide. So in a 64-bit process you may get the DWORD value 0xeb390000 from GetExitCodeThread when the HMODULE returned by LoadLibrary is actually 0x7feeb390000; 0xeb390000 is just the truncated 64-bit pointer.

How can we fix this issue?
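The truncation can be reproduced with plain integer arithmetic, without any Windows API. The function names below are made up for illustration; the values are the ones from the example above.

```cpp
#include <cassert>
#include <cstdint>

// Mimics returning a 64-bit LoadLibrary result through a 32-bit thread exit code.
std::uint32_t asExitCode(std::uint64_t moduleBase)
{
    return static_cast<std::uint32_t>(moduleBase); // keeps only the low 32 bits
}

// True when the module base fits in 32 bits, i.e. the exit code is the full handle.
bool survivesTruncation(std::uint64_t moduleBase)
{
    return asExitCode(moduleBase) == moduleBase;
}
```

With these definitions, asExitCode(0x7feeb390000) yields 0xeb390000, and survivesTruncation returns false for that base, while a base such as 0x10000000 passes through unchanged.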

amanjiang

4 Answers


You could assume that the code as shown is likely to work and that the truncated HMODULE is likely to be fine most of the time, as modules are usually loaded low enough in the process address space that it doesn't matter. To ensure that the code always works, though, you could follow your 'broken' example code with a call to the EnumProcessModules() function. If the returned HMODULE appears in the list of process modules for the target process, then you're good to go. If not, you need to iterate over the returned HMODULEs and call GetModuleBaseName() or GetModuleFileNameEx() until you locate your injected DLL.

Alternatively, if you're already running as a custom debugger (which I find useful when I want to inject anyway), then you can match the module that's loaded when you inject to the corresponding LOAD_DLL_DEBUG_EVENT that will be reported by WaitForDebugEvent(). This would give you both the HMODULE (base of image) and the image file name, and the event would occur immediately after you injected your DLL.

Personally I'd take the latter approach. In practice I've yet to see the truncated HMODULE returned from the broken code be anything but correct, but I expect I've just been lucky and am relying on the loader to load DLLs low in the process's address space.
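A minimal sketch of that check follows. The matching logic is plain C++; the enumeration part is compiled only on Windows and uses EnumProcessModules() and GetModuleBaseNameA() from psapi. The names ModuleInfo, findInjected and listModules are hypothetical, and the fixed limit of 1024 modules is an arbitrary assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct ModuleInfo {
    std::uint64_t base;   // module base address (the HMODULE value)
    std::string name;     // module base name, e.g. "inject.dll"
};

// Case-insensitive comparison, since Windows file names are not case sensitive.
static bool sameName(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) return false;
    }
    return true;
}

// Returns the full base of the injected DLL, or 0 if it cannot be found.
std::uint64_t findInjected(const std::vector<ModuleInfo>& modules,
                           std::uint32_t exitCode, const std::string& dllName)
{
    // Case 1: the handle fit in 32 bits, so the exit code is already correct.
    for (const ModuleInfo& m : modules) {
        if (m.base == exitCode) return m.base;
    }
    // Case 2: the handle was truncated, so fall back to matching by name.
    for (const ModuleInfo& m : modules) {
        if (sameName(m.name, dllName)) return m.base;
    }
    return 0;
}

#ifdef _WIN32
#include <windows.h>
#include <psapi.h>   // link with psapi.lib

// Collects base address and base name of every module in the target process.
std::vector<ModuleInfo> listModules(HANDLE process)
{
    std::vector<ModuleInfo> result;
    HMODULE mods[1024];
    DWORD needed = 0;
    if (!EnumProcessModules(process, mods, sizeof(mods), &needed)) return result;
    DWORD count = needed / sizeof(HMODULE);
    if (count > 1024) count = 1024;
    for (DWORD i = 0; i < count; ++i) {
        char name[MAX_PATH] = {0};
        if (GetModuleBaseNameA(process, mods[i], name, MAX_PATH) != 0) {
            result.push_back({reinterpret_cast<std::uintptr_t>(mods[i]), name});
        }
    }
    return result;
}
#endif
```

On Windows the two pieces would be combined as findInjected(listModules(pi.hProcess), ret, "inject.dll"), where "inject.dll" stands in for the base name of the DLL you injected.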

Len Holgate
  • My other question: http://stackoverflow.com/questions/27331014/enumprocessmodulesex-and-createtoolhelp32snapshot-fails-whatever-32bit-or-64bi I have no idea why EnumProcessModules doesn't work. As for the debug event, I am worried about the performance. I chose the IPC solution in the end. – amanjiang Dec 08 '14 at 02:30
  • The answer about why EnumProcessModules doesn't work: http://stackoverflow.com/a/27317947/996540 – amanjiang Dec 08 '14 at 03:50
  • That's useful to know. – Len Holgate Dec 08 '14 at 07:22
  • On Windows 10 I have yet to see a process where any loaded module has a base address small enough to fit into a `DWORD`, so using one of your two approaches is a must. – melak47 Sep 29 '17 at 17:18

For a 64-bit process, I could use IPC (http://msdn.microsoft.com/en-us/library/windows/desktop/aa365574%28v=vs.85%29.aspx) to get the HMODULE back.

I should note that not every IPC mechanism works in DllMain; for example, a pipe will cause a deadlock. According to Microsoft's document "Best Practices for Creating DLLs" (http://download.microsoft.com/download/a/f/7/af7777e5-7dcd-4800-8a0a-b18336565f5b/DLL_bestprac.doc), calling functions in kernel32.dll (except some specified functions) is OK.

I have tested shared memory (on Windows XP SP3 and 64-bit Windows 7 Professional), and it works.

amanjiang

You can safely use the returned 32-bit handle because, according to https://learn.microsoft.com/en-us/windows/desktop/WinProg64/interprocess-communication, 64-bit Windows still uses 32-bit handles:

64-bit versions of Windows use 32-bit handles for interoperability. When sharing a handle between 32-bit and 64-bit applications, only the lower 32 bits are significant, so it is safe to truncate the handle (when passing it from 64-bit to 32-bit) or sign-extend the handle (when passing it from 32-bit to 64-bit). Handles that can be shared include handles to user objects such as windows (HWND), handles to GDI objects such as pens and brushes (HBRUSH and HPEN), and handles to named objects such as mutexes, semaphores, and file handles.

El Mismo Sol
  • It's been a long time, and I already lost the context in my mind. I will consider your answer when I jump back into the complicated injecting thing. Thank you for your answer, I can speak Chinese by the way. – amanjiang Jan 31 '19 at 02:58
  • This is not true (for this particular case), I'm afraid. As you can see, the copied MS document excerpt does not list `HMODULE` as one of the "sharable" handle types. The return value of `LoadLibrary(A|W)` is the base address of the module (which could very well use all 64 bits on such a machine), and I just saw in my test DLL injection program that the truncation (`0x7FEF3AA0000 != 0xF3AA0000`) messes up the "handle" value!... – OzgurH May 06 '20 at 20:21

This is a bit more advanced, but rather than having CreateRemoteThread() call LoadLibrary() directly, you can allocate space for an HMODULE in the remote process with VirtualAllocEx(), then allocate a block of executable memory (also with VirtualAllocEx()) and put some assembly code in it that calls LoadLibrary() and saves the return value to the allocated HMODULE. Then have CreateRemoteThread() run that allocated "function", wait for the thread to exit, and read the HMODULE back using ReadProcessMemory().

This is demonstrated by Implementing Remote LoadLibrary and Remote GetProcAddress Using PowerShell and Assembly (in PowerShell, but can be translated to C/C++ as needed).
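As a rough C++ illustration of the stub itself, the sketch below assembles x64 machine code by hand. It is an assumption-laden example rather than a translation of the linked article: it targets 64-bit processes only, hard-codes the three addresses into the stub as immediates, and the name buildLoadLibraryStub is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Appends raw opcode bytes to the code buffer.
static void emit(std::vector<std::uint8_t>& code, std::initializer_list<std::uint8_t> bytes)
{
    code.insert(code.end(), bytes.begin(), bytes.end());
}

// Appends a 64-bit immediate in little-endian byte order.
static void emit64(std::vector<std::uint8_t>& code, std::uint64_t value)
{
    for (int i = 0; i < 8; ++i) {
        code.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
    }
}

// Builds an x64 thread routine that calls LoadLibraryA(dllPathAddr) and stores
// the full 64-bit return value at resultAddr. All three addresses must be valid
// in the TARGET process.
std::vector<std::uint8_t> buildLoadLibraryStub(std::uint64_t loadLibraryAddr,
                                               std::uint64_t dllPathAddr,
                                               std::uint64_t resultAddr)
{
    std::vector<std::uint8_t> code;
    emit(code, {0x48, 0x83, 0xEC, 0x28});   // sub rsp, 0x28  (shadow space + alignment)
    emit(code, {0x48, 0xB9});               // mov rcx, imm64 (first argument: DLL path)
    emit64(code, dllPathAddr);
    emit(code, {0x48, 0xB8});               // mov rax, imm64 (address of LoadLibraryA)
    emit64(code, loadLibraryAddr);
    emit(code, {0xFF, 0xD0});               // call rax
    emit(code, {0x48, 0xBA});               // mov rdx, imm64 (where to store the result)
    emit64(code, resultAddr);
    emit(code, {0x48, 0x89, 0x02});         // mov [rdx], rax (save the full HMODULE)
    emit(code, {0x31, 0xC0});               // xor eax, eax   (thread exit code 0)
    emit(code, {0x48, 0x83, 0xC4, 0x28});   // add rsp, 0x28
    emit(code, {0xC3});                     // ret
    return code;
}
```

The injector would write these bytes into memory allocated with VirtualAllocEx() using an executable protection such as PAGE_EXECUTE_READWRITE, start them with CreateRemoteThread(), and after the thread exits read eight bytes from resultAddr with ReadProcessMemory().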

Remy Lebeau