Is the first thread that gets to run inside a Win32 process the "primary thread"? Need to understand the semantics

Question

I create a process using CreateProcess() with the CREATE_SUSPENDED flag and then place a small patch of code inside the remote process that loads a DLL and calls a function exported by that DLL. To do so I use VirtualAllocEx() (with ..., MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE) and WriteProcessMemory(), then call FlushInstructionCache() on the patch of memory holding the code.

After that I call CreateRemoteThread() to invoke that code, which gives me hRemoteThread. I have verified that the remote code works as intended. Note: this code simply returns. It does not call any APIs other than LoadLibrary() and GetProcAddress(), followed by a call to the exported stub function, which currently just returns a value that is then passed on as the exit status of the thread.

Now comes the peculiar observation: remember that the PROCESS_INFORMATION::hThread is still suspended. When I simply ignore hRemoteThread's exit code and also don't wait for it to exit, all goes "fine". The routine that calls CreateRemoteThread() returns and PROCESS_INFORMATION::hThread gets resumed and the (remote) program actually gets to run.

However, if I call WaitForSingleObject(hRemoteThread, INFINITE) or do the following (which has the same effect):

DWORD exitCode = STILL_ACTIVE;
while(STILL_ACTIVE == exitCode)
{
    Sleep(500);
    if(!GetExitCodeThread(hRemoteThread, &exitCode))
        break;
}

followed by CloseHandle(), this leads to hRemoteThread finishing before PROCESS_INFORMATION::hThread gets resumed, and the process simply "disappears". Letting hRemoteThread finish in any way before PROCESS_INFORMATION::hThread has been resumed is enough to cause the process to die.

This looks suspiciously like a race condition, since under certain circumstances hRemoteThread may still be faster and the process would likely still "disappear", even if I leave the code as is.

Does that imply that the first thread that gets to run within a process becomes automatically the primary thread and that there are special rules for that primary thread?

I was always under the impression that a process finishes when its last thread dies, not when a particular thread dies.

Also note: there is no call to ExitProcess() involved here in any way, because hRemoteThread simply returns and PROCESS_INFORMATION::hThread is still suspended when I wait for hRemoteThread to return.

This happens on Windows XP SP3, 32bit.

Edit: I have just tried Sysinternals Process Monitor to see what's happening, and it confirms my earlier observations. The injected code does not crash or anything; instead I see that if I don't wait for the thread, it doesn't exit before I close the program into which the code was injected. I'm wondering whether the call to CloseHandle(hRemoteThread) should be postponed or something ...

Edit+1: it's not CloseHandle(). If I leave that out just for a test, the behavior doesn't change when waiting for the thread to finish.

score 1 · Accepted Answer · answered Mar 14 '12 at 03:56

The first thread to run isn't special.

For example, create a console app which creates a suspended thread and terminates the original thread (by calling ExitThread). This process never terminates (on Windows 7 anyway).

Or make the new thread wait for five seconds then exit. As expected, the process will live for five seconds and exit when the secondary thread terminates.

I don't know what's happening with your example. The easiest way to avoid the race is to make the new thread resume the original thread.

Speculating now, I do wonder if what you're doing isn't likely to cause problems anyway. For example, what happens to all the DllMain calls for the implicitly loaded DLLs? Are they unexpectedly happening on the wrong thread, are they being skipped, or are they postponed until after your code has run and the main thread starts?

arx


please explain the last point. I **know** it will cause problems to assume that a DLL is loaded at the same base addr. in my process as in the remote process, so an explicit good old `LoadLibrary` passed to `CreateRemoteThread` won't fly - that's why this way. Besides, when the main thread is suspended after creating the process the DLLs have been loaded already, although I am not certain whether their `DllMain` has been called by then. But that shouldn't really matter as long as I don't call my `LoadLibrary` after the main thread gets to start and could be inside a `DllMain` itself. Thx – 0xC0000022L Mar 14 '12 at 04:04
Windows would normally never see a call to `LoadLibrary` before the static DLLs had initialized. This could conceivably cause problems. Alternatively, if the static DllMains are being called on your thread this would confuse DLLs that assume the initialization thread will last for the lifetime of the process, which is usually true. As I said, I'm speculating, but there seems to be plenty of potential for breakage. – arx Mar 14 '12 at 04:15
Hmm, okay. How can I verify whether or not the `DllMain`s of the other DLLs have been called? From past experience I know that I shouldn't even be able to call `CreateRemoteThread` successfully unless `kernel32.dll` was initialized and a connection to the Win32 subsystem established. However, the call to `CreateRemoteThread` **is** successful here. It's still possible you are right, but I don't think so at the moment. – 0xC0000022L Mar 14 '12 at 04:18
You could test it by creating a simple target process that statically links to a simple DLL. The simple DLL could output the thread ID of the thread its DllMain is called on (using OutputDebugString, say). I'm almost tempted to try this myself, but I need to go to bed. – arx Mar 14 '12 at 04:24
I've played a bit further. You seem to be right that the initialization phase that is somehow needed to keep the process alive has not ended by that time. What I have done for testing is to create the process unsuspended, sleep 500 ms, then suspend it and then inject my code. This becomes really flaky, because sometimes the process will deadlock completely and sometimes it will start as expected, which supports the suspicion about a race condition. Although there is a lot left to find out about it, I'll accept your answer. – 0xC0000022L Mar 14 '12 at 13:20
Depending on the timing you need your injected code to run in, perhaps you could use `WaitForInputIdle()` instead of `Sleep(500)`. – Remy Lebeau Mar 14 '12 at 19:23
@Remy: only saw your comment just now. I'll try that, thanks. – 0xC0000022L Mar 19 '12 at 17:21

score 0 · Answer 2 · answered Mar 14 '12 at 03:20

Odds are good that the thread with the main (or equivalent) function calls ExitProcess (either explicitly or in its runtime library). ExitProcess, well, exits the entire process, including killing all threads. Since the main thread doesn't know about your injected code, it doesn't wait for it to finish.

I don't know that there's a good way to make the main thread wait for yours to complete...

Mike Caron


How? How would the thread containing `main()` be able to call `ExitProcess()` if it doesn't get to run? I wrote in my question that the main thread does not run before the injected code when I encounter the issue. When they run "in parallel" the issue does **not** occur. – 0xC0000022L Mar 14 '12 at 03:23
Oh, my apologies, I misread your question. That is a puzzler. Your thread terminates cleanly, and doesn't exit the process, but the process still dies? Strange. – Mike Caron Mar 14 '12 at 03:27
